Table 4 Augmentation effect on architecture C biLSTM–biLSTM with layer sizes 64/64 and 4 concatenated encoding layers

From: GEN: highly efficient SMILES explorer using autodidactic generative examination networks

| SMILES | Augm. | Best model epoch# | Validity (%) | Uniqueness (%) | Training (%) | Length match (%) (a) | HAC match (%) (b) |
|---|---|---|---|---|---|---|---|
| Canonical | 1 | 9, 9, 7 | 96.6 ± 0.5 | 99.9 ± 0.1 | 16.2 ± 1.5 | 93.3 ± 0.3 | 92.0 ± 0.5 |
| Random | 1 | 10, 14, 16 | 97.0 ± 0.3 | 99.9 ± 0.0 | 11.9 ± 0.6 | 98.5 ± 0.3 | 97.4 ± 0.5 |
| Random | 2 | 5, 5, 5 | 97.3 ± 0.1 | 99.9 ± 0.0 | 13.9 ± 0.5 | 97.7 ± 0.4 | 94.5 ± 0.8 |
| Random | 3 | 4, 6, 4 | 97.9 ± 0.3 | 99.9 ± 0.0 | 13.6 ± 0.5 | 98.8 ± 0.1 | 96.5 ± 0.2 |
| Random | 4 | 4, 3, 4 | 98.2 ± 0.4 | 99.9 ± 0.0 | 11.6 ± 0.5 | 98.8 ± 0.3 | 97.1 ± 0.2 |
| Random | 5 | 4, 4, 4 | 98.3 ± 0.3 | 99.9 ± 0.0 | 11.2 ± 0.5 | 97.3 ± 0.7 | 96.6 ± 0.3 |
| Random | 10 | 4, 4, 4 | 98.3 ± 0.3 | 99.9 ± 0.0 | 14.2 ± 0.5 | 98.4 ± 0.4 | 98.2 ± 0.5 |

  1. (a) Length match between the SMILES length distributions of the training set and the generated set (see “Methods”)
  2. (b) HAC match between the atom count distributions of the generated set and the training set (see “Methods”)
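
The two match columns compare a distribution over the generated set with the same distribution over the training set. The exact formula is given in the paper's “Methods” and is not reproduced here. As a rough illustration only, the sketch below assumes the match is a histogram intersection expressed as a percentage. The function name `distribution_match` and the toy SMILES lists are hypothetical and do not come from the paper.

```python
from collections import Counter


def distribution_match(train_values, generated_values):
    """Percentage overlap of two discrete distributions.

    ASSUMPTION: histogram intersection is used here as a stand-in;
    the paper's own definition is in its Methods section.
    """
    if not train_values or not generated_values:
        raise ValueError("both samples must be non-empty")
    train_counts = Counter(train_values)
    gen_counts = Counter(generated_values)
    n_train = len(train_values)
    n_gen = len(generated_values)
    # Sum, over every observed value, the smaller of the two relative frequencies.
    overlap = sum(
        min(train_counts[k] / n_train, gen_counts[k] / n_gen)
        for k in train_counts.keys() | gen_counts.keys()
    )
    return 100.0 * overlap


# Toy data (hypothetical): compare SMILES string lengths of two small sets.
train_smiles = ["CCO", "CCN", "c1ccccc1", "CC(=O)O"]
generated_smiles = ["CCC", "CCO", "c1ccncc1", "CC(=O)N"]
length_match = distribution_match(
    [len(s) for s in train_smiles],
    [len(s) for s in generated_smiles],
)
```

Under the same assumption, an atom count match would pass per-molecule atom counts instead of string lengths. Those counts would have to come from a cheminformatics toolkit, which this sketch leaves out.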