Table 1 Chemical structure sets used to measure performance

From: Efficient ring perception for the Chemistry Development Kit

| Identifier | n structures | Description | Available |
|---|---|---|---|
| chebi_108 | 26,790 | ChEBI Release 108 [15] | http://www.ebi.ac.uk/chebi |
| nci_aug00 | 250,172 | NCI Aug 2000 [16] | http://cactus.nci.nih.gov/download/nci |
| zinc_frag | 504,074 | Zinc Clean Fragments Ph7 2013-04-12 [17] | http://zinc.docking.org/subsets/clean-fragments |
| chembl_17 | 1,318,180 | ChEMBL Release 17 [18] | http://www.ebi.ac.uk/chembl |
| zinc_leads | 5,135,179 | Zinc Clean Leads Ph7 2013-05-31 [17] | http://zinc.docking.org/subsets/clean-leads |

  1. The number of structures is the number that were successfully read from SMILES [19] notation. The ChEBI and ChEMBL datasets contained a small number of erroneous SMILES strings that could not be interpreted.