Table 1 Chemical structure sets used to measure performance

From: Efficient ring perception for the Chemistry Development Kit

| Identifier | Number of structures | Description | Available |
|---|---|---|---|
| chebi_108 | 26,790 | ChEBI Release 108 [15] | http://www.ebi.ac.uk/chebi |
| nci_aug00 | 250,172 | NCI Aug 2000 [16] | http://cactus.nci.nih.gov/download/nci |
| zinc_frag | 504,074 | Zinc Clean Fragments Ph7 2013-04-12 [17] | http://zinc.docking.org/subsets/clean-fragments |
| chembl_17 | 1,318,180 | ChEMBL Release 17 [18] | http://www.ebi.ac.uk/chembl |
| zinc_leads | 5,135,179 | Zinc Clean Leads Ph7 2013-05-31 [17] | http://zinc.docking.org/subsets/clean-leads |
  1. The number of structures is the number that were successfully read from SMILES [19] notation. The ChEBI and ChEMBL datasets contained a small number of erroneous SMILES strings that could not be interpreted.
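
The counting rule in the note above (report only structures whose SMILES could be read) can be illustrated with a minimal sketch. It assumes one SMILES string per line and a parser that raises an error on uninterpretable input. The names `count_readable` and `toy_parse` are hypothetical, and `toy_parse` only checks for balanced parentheses; it is a placeholder for a real SMILES reader, not the procedure used to produce the table.

```python
from typing import Callable, Iterable, Tuple


def count_readable(
    smiles_lines: Iterable[str],
    parse: Callable[[str], object],
) -> Tuple[int, int]:
    """Return (n_read, n_failed) for an iterable of SMILES strings.

    `parse` is a stand-in for a real SMILES reader (an assumption);
    it must raise ValueError when a string cannot be interpreted.
    """
    n_read = 0
    n_failed = 0
    for line in smiles_lines:
        smiles = line.strip()
        if not smiles:
            continue  # blank lines are neither read nor failed
        try:
            parse(smiles)
        except ValueError:
            n_failed += 1  # erroneous SMILES: excluded from the count
        else:
            n_read += 1
    return n_read, n_failed


def toy_parse(smiles: str) -> str:
    """Toy placeholder, NOT a real SMILES parser: checks parentheses only."""
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("unmatched ')' in " + smiles)
    if depth != 0:
        raise ValueError("unclosed '(' in " + smiles)
    return smiles
```

In practice `toy_parse` would be replaced by a call into a cheminformatics toolkit, and the first element of the returned pair would correspond to the "Number of structures" column.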