Table 3 Metrics obtained from a 50,000 SMILES sample of all the models trained

From: A de novo molecular generation method using latent vector based generative adversarial network

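| Dataset | Arch. | Valid (%) | Unique (%) | Novel (%) | Active (%) | Recovered actives/total actives (%) | Recovered neighbors |
|---------|-------|-----------|------------|-----------|------------|-------------------------------------|---------------------|
| EGFR    | GAN   | 86        | 56         | 97        | 71         | 5.26                                | 196                 |
| EGFR    | RNN   | 96        | 46         | 95        | 65         | 7.74                                | 238                 |
| HTR1A   | GAN   | 86        | 66         | 95        | 71         | 5.05                                | 284                 |
| HTR1A   | RNN   | 96        | 50         | 90        | 81         | 7.28                                | 384                 |
| S1PR1   | GAN   | 89        | 31         | 98        | 44         | 0.93                                | 24                  |
| S1PR1   | RNN   | 97        | 35         | 97        | 65         | 3.72                                | 43                  |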

Column definitions: Dataset, the dataset used; Arch., the architecture used; Valid, percent of valid molecules in the sampled set; Unique, percent of valid unique compounds; Novel, percent of unique novel compounds (not present in the training set); Active, percent of unique active compounds; Recovered actives/total actives, actives recovered from the test set relative to the total number of actives in the test set; Recovered neighbors, recovered neighbors of active compounds, using the FCFP6 fingerprint with 2048 bits and a Tanimoto similarity threshold of 0.7.
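The column definitions above can be illustrated with a minimal Python sketch. It is not the authors' code and rests on several assumptions: each percentage is taken relative to the preceding set (valid over sampled, unique over valid, novel over unique); `canonicalize` is a hypothetical caller-supplied function that returns a canonical SMILES string, or `None` for an invalid one; fingerprints are represented as plain sets of on-bit indices rather than real FCFP6 vectors; and `count_recovered_neighbors` counts generated compounds that have at least one active within the threshold, which is only one possible reading of "recovered neighbors".

```python
from typing import Callable, Iterable, Optional, Set


def sample_metrics(
    sampled: Iterable[str],
    training_set: Set[str],
    canonicalize: Callable[[str], Optional[str]],
) -> dict:
    """Cascading Valid / Unique / Novel percentages for sampled SMILES.

    Assumption: training_set holds canonical SMILES produced by the
    same (hypothetical) canonicalize function.
    """
    sampled = list(sampled)
    # Valid: strings the canonicalizer can parse (it returns None otherwise).
    valid = [c for c in (canonicalize(s) for s in sampled) if c is not None]
    # Unique: distinct canonical forms among the valid molecules.
    unique = set(valid)
    # Novel: unique molecules that are absent from the training set.
    novel = unique - training_set

    def pct(part: int, whole: int) -> float:
        return 100.0 * part / whole if whole else 0.0

    return {
        "valid": pct(len(valid), len(sampled)),
        "unique": pct(len(unique), len(valid)),
        "novel": pct(len(novel), len(unique)),
    }


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def count_recovered_neighbors(
    generated_fps: Iterable[Set[int]],
    active_fps: Iterable[Set[int]],
    threshold: float = 0.7,
) -> int:
    """Count generated fingerprints with at least one active at or above the threshold.

    Assumption: this counting convention is illustrative; the table's
    exact convention is not stated in the column definitions.
    """
    actives = list(active_fps)
    return sum(
        1
        for g in generated_fps
        if any(tanimoto(g, a) >= threshold for a in actives)
    )
```

In practice the canonicalizer and the fingerprints would come from a cheminformatics toolkit; they are left abstract here so that the metric logic stays independent of any particular library.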