Table 2 Comparison of several molecular generators in the GuacaMol [33] distribution learning benchmark

From: Molecular generation by Fast Assembly of (Deep)SMILES fragments

| Benchmark | Random sampler | SMILES LSTM | Graph MCTS | AAE | ORGAN | VAE | FASMIFRA | Negative control |
|---|---|---|---|---|---|---|---|---|
| Validity | 1.000 | 0.959 | 1.000 | 0.822 | 0.379 | 0.870 | 1.000 | 1.000 |
| Uniqueness | 0.997 | 1.000 | 1.000 | 1.000 | 0.841 | 0.999 | 0.994 | 0.959 |
| Novelty | 0.000 | 0.912 | 0.994 | 0.998 | 0.687 | 0.974 | 0.702 | 0.947 |
| KL_divergence | 0.998 | 0.991 | 0.522 | 0.886 | 0.267 | 0.982 | 0.959 | 0.855 |
| FCD | 0.929 | 0.913 | 0.015 | 0.529 | 0.000 | 0.863 | 0.814 | 0.397 |

Random sampler: baseline model; SMILES LSTM: long short-term memory (LSTM) deep neural network for SMILES strings; Graph MCTS: graph-based Monte Carlo tree search; AAE: adversarial autoencoder; ORGAN: objective-reinforced generative adversarial network; VAE: variational autoencoder; FASMIFRA: Fast Assembly of SMILES Fragments (proposed method); Negative control: FASMIFRA without extended bond typing (any fragment can be connected to any other fragment)
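
For readers unfamiliar with the first three rows, the following is a minimal sketch of how validity, uniqueness, and novelty ratios are commonly computed for a set of generated molecules. It is not the benchmark's own code, and the benchmark's exact sampling protocol is not reproduced here. The names `distribution_metrics` and `toy_is_valid` are hypothetical. The sketch assumes that every string is already a canonical SMILES, so that string equality stands in for molecule identity, and it replaces a real chemistry check with a toy parenthesis test.

```python
from typing import Callable, Dict, Iterable, Set


def distribution_metrics(
    generated: Iterable[str],
    training: Set[str],
    is_valid: Callable[[str], bool],
) -> Dict[str, float]:
    """Toy validity / uniqueness / novelty ratios for generated molecules.

    Assumption of this sketch: inputs are canonical SMILES strings, so
    two identical strings denote the same molecule.
    """
    samples = list(generated)
    # Validity: fraction of generated strings accepted by the validity check.
    valid = [s for s in samples if is_valid(s)]
    # Uniqueness: fraction of valid molecules that are distinct.
    unique = set(valid)
    # Novelty: fraction of distinct valid molecules absent from the training set.
    novel = unique - training
    return {
        "validity": len(valid) / len(samples) if samples else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }


def toy_is_valid(smiles: str) -> bool:
    """Hypothetical placeholder: non-empty string with balanced parentheses.

    A real pipeline would parse the string with a cheminformatics toolkit.
    """
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0 and bool(smiles)


if __name__ == "__main__":
    generated = ["CCO", "CCO", "CC(C)O", "CC(C", "c1ccccc1"]
    training = {"CCO"}
    print(distribution_metrics(generated, training, toy_is_valid))
```

Under these definitions, a sampler that only returns training-set molecules scores zero novelty while keeping validity near one, which matches the pattern in the Random sampler column.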