
Table 5 Percentage match measured as continuous Tanimoto (Tan; Eq. 1) or Jensen-Shannon Divergence (JSD; Eq. 2) between the distributions of the training space and generated compounds at early (10k) and late stage (2M) generation

From: GEN: highly efficient SMILES explorer using autodidactic generative examination networks

| Property | Tan 10k (%) | Tan 2M (%) | JSD 10k | JSD 2M |
|---|---|---|---|---|
| Size: | | | | |
| SMILES length | 94.1 ± 0.4 | 84.6 ± 0.1 | 0.170 ± 0.004 | 0.252 ± 0.000 |
| Heavy atom count (HAC) | 98.8 ± 0.2 | 94.1 ± 0.1 | 0.058 ± 0.004 | 0.142 ± 0.000 |
| Molecular weight (MW) | 97.4 ± 0.2 | 92.7 ± 0.1 | 0.124 ± 0.002 | 0.187 ± 0.000 |
| Polarity: | | | | |
| logP | 99.6 ± 0.0 | 99.1 ± 0.0 | 0.042 ± 0.002 | 0.055 ± 0.001 |
| TPSA | 99.6 ± 0.1 | 95.7 ± 0.1 | 0.044 ± 0.001 | 0.097 ± 0.000 |
| Topology: | | | | |
| Rotatable bond count | 99.5 ± 0.1 | 96.5 ± 0.0 | 0.042 ± 0.002 | 0.099 ± 0.001 |
| Fraction cyclic | 99.2 ± 0.2 | 95.6 ± 0.1 | 0.051 ± 0.002 | 0.106 ± 0.000 |
| Fraction conjugated | 99.6 ± 0.1 | 99.7 ± 0.1 | 0.047 ± 0.003 | 0.084 ± 0.000 |
| Fraction aromatic | 99.7 ± 0.1 | 99.5 ± 0.1 | 0.060 ± 0.002 | 0.109 ± 0.001 |
| Composition: | | | | |
| Fraction carbon | 98.6 ± 0.2 | 97.0 ± 0.0 | 0.061 ± 0.003 | 0.106 ± 0.000 |
| Fraction nitrogen | 99.6 ± 0.2 | 96.1 ± 0.1 | 0.097 ± 0.004 | 0.132 ± 0.000 |
| Fraction oxygen | 99.4 ± 0.1 | 99.4 ± 0.1 | 0.050 ± 0.003 | 0.058 ± 0.001 |
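
For readers who want to reproduce this kind of comparison, the following is a minimal sketch of how a continuous Tanimoto coefficient and a Jensen-Shannon divergence could be computed between two property histograms. Eq. 1 and Eq. 2 are not reproduced on this page, so the formulas below are the standard textbook definitions and may differ in detail (binning, logarithm base, normalization) from those used by the authors. The function names, the `bins=50` default, and the shared-bin histogram helper are assumptions made for illustration only.

```python
import numpy as np


def continuous_tanimoto(p, q):
    """Standard continuous Tanimoto: p.q / (|p|^2 + |q|^2 - p.q).

    Assumption: this is the textbook form; the paper's Eq. 1 may differ.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    dot = np.dot(p, q)
    return dot / (np.dot(p, p) + np.dot(q, q) - dot)


def jensen_shannon_divergence(p, q, base=2.0):
    """Standard JSD: mean KL divergence of p and q to their mixture m.

    Assumption: log base 2, so the result lies in [0, 1]; the paper's
    Eq. 2 may use a different base or report the square root instead.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log(a[mask] / b[mask])) / np.log(base)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def histogram_pair(train_values, generated_values, bins=50):
    """Bin both samples on shared edges and normalize to frequencies.

    Hypothetical helper: the bin count and range are illustrative choices.
    """
    train_values = np.asarray(train_values, dtype=float)
    generated_values = np.asarray(generated_values, dtype=float)
    lo = min(train_values.min(), generated_values.min())
    hi = max(train_values.max(), generated_values.max())
    edges = np.linspace(lo, hi, bins + 1)
    h_train, _ = np.histogram(train_values, bins=edges)
    h_gen, _ = np.histogram(generated_values, bins=edges)
    return h_train / h_train.sum(), h_gen / h_gen.sum()


if __name__ == "__main__":
    # Synthetic stand-in data, NOT values from the paper.
    rng = np.random.default_rng(0)
    train = rng.normal(loc=350.0, scale=60.0, size=10_000)
    generated = rng.normal(loc=355.0, scale=62.0, size=10_000)
    p, q = histogram_pair(train, generated)
    print(f"Tanimoto match: {100 * continuous_tanimoto(p, q):.1f} %")
    print(f"JSD: {jensen_shannon_divergence(p, q):.3f}")
```

Under these definitions, identical distributions give a Tanimoto match of 100 % and a JSD of 0, while completely non-overlapping distributions give 0 % and 1, which matches the direction of the trends in the table: a lower Tanimoto percentage and a higher JSD both indicate a larger drift from the training space.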