From: GEN: highly efficient SMILES explorer using autodidactic generative examination networks
Dataset and evaluated size | Augmented size with real factor<sup>a</sup> | Best model epoch # | Validity (%) | Uniqueness (%) | Training (%) | Length match (%)<sup>b</sup> | HAC match (%)<sup>c</sup> |
---|---|---|---|---|---|---|---|
PubChem225k | | | | | | | |
9k | 54,624 (4.8) | 10, 10, 10 | 81.3 ± 0.9 | 100.0 ± 0.0 | 0.3 ± 0.1 | 97.7 ± 0.0 | 90.5 ± 0.0 |
45k | 218,124 (4.8) | 5, 5, 5 | 95.6 ± 0.7 | 99.9 ± 0.1 | 2.6 ± 0.5 | 99.0 ± 0.0 | 94.7 ± 0.0 |
225k | 1,088,864 (4.8) | 4, 4, 4 | 98.3 ± 0.3 | 99.9 ± 0.0 | 11.2 ± 0.5 | 97.3 ± 0.7 | 96.6 ± 0.3 |
Chembl24 | | | | | | | |
9k | 35,928 (4.0) | 44, 43, 45 | 74.2 ± 1.9 | 99.0 ± 0.2 | 0.2 ± 0.2 | 81.9 ± 5.4 | 95.9 ± 1.0 |
45k | 179,888 (4.0) | 5, 6, 5 | 91.9 ± 1.9 | 100.0 ± 0.0 | 0.2 ± 0.1 | 90.6 ± 2.8 | 97.6 ± 1.4 |
225k | 896,214 (4.0) | 9, 6, 6 | 94.6 ± 0.1 | 100.0 ± 0.0 | 1.4 ± 0.3 | 88.4 ± 1.6 | 98.1 ± 0.6 |
Zinc15 | | | | | | | |
9k | 32,546 (3.6) | 24, 21, 21 | 77.2 ± 1.0 | 100.0 ± 0.0 | 0.0 ± 0.0 | 82.2 ± 3.3 | 91.2 ± 1.1 |
45k | 163,929 (3.6) | 10, 7, 11 | 90.4 ± 1.1 | 100.0 ± 0.0 | 0.1 ± 0.1 | 87.6 ± 1.2 | 92.6 ± 1.1 |
225k | 820,747 (3.6) | 4, 6, 6 | 95.2 ± 0.3 | 100.0 ± 0.0 | 0.3 ± 0.1 | 90.4 ± 1.2 | 93.5 ± 1.2 |