Table 2 Comparison of several molecular generators in the GuacaMol [33] distribution learning benchmark

From: Molecular generation by Fast Assembly of (Deep)SMILES fragments

| Benchmark | Random sampler | SMILES LSTM | Graph MCTS | AAE | ORGAN | VAE | FASMIFRA | Negative control |
|---|---|---|---|---|---|---|---|---|
| Validity | 1.000 | 0.959 | 1.000 | 0.822 | 0.379 | 0.870 | 1.000 | 1.000 |
| Uniqueness | 0.997 | 1.000 | 1.000 | 1.000 | 0.841 | 0.999 | 0.994 | 0.959 |
| Novelty | 0.000 | 0.912 | 0.994 | 0.998 | 0.687 | 0.974 | 0.702 | 0.947 |
| KL divergence | 0.998 | 0.991 | 0.522 | 0.886 | 0.267 | 0.982 | 0.959 | 0.855 |
| FCD | 0.929 | 0.913 | 0.015 | 0.529 | 0.000 | 0.863 | 0.814 | 0.397 |
Random sampler: baseline model; SMILES LSTM: Long Short-Term Memory deep neural network for SMILES strings; Graph MCTS: graph-based Monte Carlo Tree Search; AAE: Adversarial AutoEncoder; ORGAN: Objective-Reinforced Generative Adversarial Network; VAE: Variational AutoEncoder; FASMIFRA: Fast Assembly of SMILES Fragments (proposed method); Negative control: FASMIFRA without extended bond typing (any fragment can be connected to any other fragment); FCD: Fréchet ChemNet Distance.
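
For readers unfamiliar with the first three rows, the following minimal sketch shows one common way such fractions can be computed. It is an illustration under stated assumptions, not the GuacaMol implementation: the function name `distribution_scores` and the helper `toy_is_valid` are hypothetical, all strings are assumed to be canonical already, and the benchmark's fixed sample sizes are ignored. A real validity check would parse each string with a cheminformatics toolkit.

```python
from typing import Callable, Dict, Iterable


def distribution_scores(
    generated: Iterable[str],
    training: Iterable[str],
    is_valid: Callable[[str], bool],
) -> Dict[str, float]:
    """Return validity, uniqueness and novelty as fractions in [0, 1].

    Hypothetical helper for illustration only; assumes canonical strings.
    """
    generated = list(generated)
    # Validity: share of generated strings accepted by the validity check.
    valid = [s for s in generated if is_valid(s)]
    # Uniqueness: share of valid strings that are distinct.
    unique = set(valid)
    # Novelty: share of distinct valid strings absent from the training set.
    novel = unique - set(training)
    return {
        "validity": len(valid) / len(generated) if generated else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }


def toy_is_valid(smiles: str) -> bool:
    """Toy stand-in (assumption): only checks that parentheses are balanced."""
    return smiles.count("(") == smiles.count(")")


scores = distribution_scores(
    generated=["CCO", "CCO", "c1ccccc1", "C(", "CCN"],
    training=["CCO"],
    is_valid=toy_is_valid,
)
print(scores)
```

In this toy input, four of the five strings pass the check, three of those four are distinct, and two of those three do not occur in the training list.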