
Table 5 Percentage match measured as continuous Tanimoto (Tan; Eq. 1) or Jensen-Shannon Divergence (JSD; Eq. 2) between the distributions of the training space and generated compounds at early-stage (10k) and late-stage (2M) generation

From: GEN: highly efficient SMILES explorer using autodidactic generative examination networks

 

| Descriptor | Tan 10k (%) | Tan 2M (%) | JSD 10k | JSD 2M |
|---|---|---|---|---|
| Size: | | | | |
| SMILES length | 94.1 ± 0.4 | 84.6 ± 0.1 | 0.170 ± 0.004 | 0.252 ± 0.000 |
| Heavy atom count (HAC) | 98.8 ± 0.2 | 94.1 ± 0.1 | 0.058 ± 0.004 | 0.142 ± 0.000 |
| Molecular Weight (MW) | 97.4 ± 0.2 | 92.7 ± 0.1 | 0.124 ± 0.002 | 0.187 ± 0.000 |
| Polarity: | | | | |
| logP | 99.6 ± 0.0 | 99.1 ± 0.0 | 0.042 ± 0.002 | 0.055 ± 0.001 |
| TPSA | 99.6 ± 0.1 | 95.7 ± 0.1 | 0.044 ± 0.001 | 0.097 ± 0.000 |
| Topology: | | | | |
| Rotatable bond count | 99.5 ± 0.1 | 96.5 ± 0.0 | 0.042 ± 0.002 | 0.099 ± 0.001 |
| Fraction cyclic | 99.2 ± 0.2 | 95.6 ± 0.1 | 0.051 ± 0.002 | 0.106 ± 0.000 |
| Fraction conjugated | 99.6 ± 0.1 | 99.7 ± 0.1 | 0.047 ± 0.003 | 0.084 ± 0.000 |
| Fraction aromatic | 99.7 ± 0.1 | 99.5 ± 0.1 | 0.060 ± 0.002 | 0.109 ± 0.001 |
| Composition: | | | | |
| Fraction carbon | 98.6 ± 0.2 | 97.0 ± 0.0 | 0.061 ± 0.003 | 0.106 ± 0.000 |
| Fraction nitrogen | 99.6 ± 0.2 | 96.1 ± 0.1 | 0.097 ± 0.004 | 0.132 ± 0.000 |
| Fraction oxygen | 99.4 ± 0.1 | 99.4 ± 0.1 | 0.050 ± 0.003 | 0.058 ± 0.001 |
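
The two metrics named in the caption can be illustrated with a minimal sketch. Eq. 1 and Eq. 2 are not reproduced on this page, so the code below assumes the textbook definitions: continuous Tanimoto as the dot product divided by the sum of squared norms minus the dot product, and Jensen-Shannon divergence as the average Kullback-Leibler divergence to the mixture, with base-2 logarithms. The paper's own binning, log base, and scaling may differ. The helper names (`histogram`, `continuous_tanimoto`, `jensen_shannon_divergence`) and the toy descriptor values are hypothetical.

```python
import math


def histogram(values, lo, hi, bins):
    """Count values into `bins` equal-width bins on [lo, hi] (assumes hi > lo)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp so that v == hi (or slightly out-of-range values) lands in an edge bin.
        idx = max(0, min(int((v - lo) / width), bins - 1))
        counts[idx] += 1
    return counts


def normalize(counts):
    """Turn raw counts into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram is empty")
    return [c / total for c in counts]


def continuous_tanimoto(p, q):
    """Assumed form: sum(p*q) / (sum(p^2) + sum(q^2) - sum(p*q))."""
    dot = sum(a * b for a, b in zip(p, q))
    denom = sum(a * a for a in p) + sum(b * b for b in q) - dot
    return dot / denom


def jensen_shannon_divergence(p, q):
    """Assumed form: 0.5*KL(p||m) + 0.5*KL(q||m), m = (p+q)/2, log base 2."""
    p, q = normalize(p), normalize(q)
    m = [0.5 * (a + b) for a, b in zip(p, q)]

    def kl(x, y):
        # Bins with zero mass in x contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy heavy-atom counts (illustrative values only, not data from the paper).
train_hac = [18, 20, 21, 22, 22, 23, 25, 27, 30]
generated_hac = [17, 19, 21, 22, 23, 24, 24, 28, 31]

p = normalize(histogram(train_hac, lo=15, hi=35, bins=10))
q = normalize(histogram(generated_hac, lo=15, hi=35, bins=10))

tan_percent = 100.0 * continuous_tanimoto(p, q)  # reported as a percentage match
jsd = jensen_shannon_divergence(p, q)
print(f"Tan = {tan_percent:.1f} %, JSD = {jsd:.3f}")
```

Under these assumed definitions, identical distributions give Tan = 100 % and JSD = 0, while distributions with no overlapping bins give Tan = 0 % and JSD = 1. Higher Tan and lower JSD both indicate closer agreement, which is the direction of the trends in the table.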