Table 1 Molecular diversity assessed via the number of unique Bemis-Murcko scaffolds at the intersection between datasets

From: Molecular generation by Fast Assembly of (Deep)SMILES fragments

| Dataset | ChEMBL_train (100k) | ChEMBL_gene (100k) | TCM_train (20k) | TCM_gene (20k) |
| --- | --- | --- | --- | --- |
| ChEMBL_train (100k) | 25135 | 8466 (23957 new \(\simeq\) 74%) | 979 | 982 |
| ChEMBL_gene (100k) | 8466 (23957 new \(\simeq\) 74%) | 32423 | 802 | 891 |
| TCM_train (20k) | 979 | 802 | 6056 | 2713 (5726 new \(\simeq\) 68%) |
| TCM_gene (20k) | 982 | 891 | 2713 (5726 new \(\simeq\) 68%) | 8439 |
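
The "new" counts and percentages in Table 1 appear to follow from simple arithmetic on the other entries. The sketch below assumes that "new" means scaffolds found in a generated set but not in its training set, and that the percentage is taken relative to the generated set's total number of unique scaffolds; this reading is an assumption, although it reproduces the table's figures. The function and variable names are illustrative only.

```python
# Unique Bemis-Murcko scaffold counts, copied from the diagonal of Table 1.
unique_scaffolds = {
    "ChEMBL_train": 25135,
    "ChEMBL_gene": 32423,
    "TCM_train": 6056,
    "TCM_gene": 8439,
}

# Scaffolds shared between each generated set and its training set
# (off-diagonal entries of Table 1).
shared_with_train = {
    "ChEMBL_gene": ("ChEMBL_train", 8466),
    "TCM_gene": ("TCM_train", 2713),
}


def novelty(generated_total, shared):
    """Return (new scaffold count, percentage of the generated total).

    Assumption: "new" = generated scaffolds absent from the training set.
    """
    new = generated_total - shared
    return new, 100.0 * new / generated_total


results = {}
for generated, (train, shared) in shared_with_train.items():
    new, percent = novelty(unique_scaffolds[generated], shared)
    results[generated] = (new, round(percent))
    print(f"{generated} vs {train}: {new} new scaffolds, about {percent:.1f}%")
```

Under this reading, 32423 - 8466 = 23957 and 8439 - 2713 = 5726, which matches the "new" counts reported in the table.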