Table 4 Comparison of DrugBank and ChEMBL datasets

From: The Chemical Validation and Standardization Platform (CVSP): large-scale automated validation of chemical structure datasets

| Issue | DrugBank | ChEMBL | Examples |
| --- | --- | --- | --- |
| Errors | | | |
| Query bonds | 2 | 0 | DB00115 |
| Stereocenters: stereotypes of non-opposite bonds match | 1 | 292 | DB08128, CHEMBL1183153, CHEMBL1971333 |
| Stereocenters: stereotypes of opposite bonds mismatch | 2 | 2542 | DB00877, CHEMBL1237110 |
| Stereocenters: one bond up, one down | 1 | 182 | DB01590, CHEMBL552998, CHEMBL1237113 |
| Stereocenters: implicit hydrogen near stereocenter | 1 | 1 | DB00910, CHEMBL2314995 |
| Non-unique dearomatization | 57 | 0 | DB01705 |
| Unknown atom symbol (“A”, “*” - polymers) | 3 | 0 | DB01344 |
| Bad Valence (Indigo) | 1 | 0 | DB01747 |
| InChI generation failed | 4 | 2 | DB03846, CHEMBL1770360 |
| Warnings | | | |
| InChI does not match structure | 36 | N/A | DB00162 |
| Name does not match structure | 24 | N/A | DB08346 |
| SMILES does not match structure | 48 | N/A | DB00520 |
| Contains only multiple instances of same molecule | 0 | 25 | CHEMBL607305 |
| Not a neutral system | 314 | 14337 | DB00118, CHEMBL13045 |
| Angle between bonds too small | 2 | 164 | DB00362, CHEMBL59973 |
| Free carbon monoxide | 0 | 5 | CHEMBL108869 |
| Unusual valence | 49 | 119 | DB01703, DB03492, CHEMBL2028143, CHEMBL2028140 |
| Relative stereo (wedge or hash bonds but no chiral flag in molfile) | 1183 | 151203 | DB00140, CHEMBL1801886 |
| More than one radical atom | 2 | 4 | DB04119, CHEMBL606910 |
| Information | | | |
| Contains enol function | 64 | 11898 | DB00554, CHEMBL62289 |
| Stereobond in ring | 4 | 943 | DB00877, CHEMBL1864961 |
| Contains unknown stereobond | 32 | 23451 | DB00162, CHEMBL1866933 |
| Contains metal-nitrogen bond | 25 | 60 | DB02003, CHEMBL611725 |
| Contains partially undefined stereo | 24 | 26862 | DB00462, CHEMBL63248 |
| Strongest acid not ionized first | 3 | 164 | DB04798, CHEMBL8056 |
| Contains L-pyranose | 185 | 5887 | DB00199, CHEMBL66563 |
| Contains metal-oxygen bond | 32 | | DB00526, CHEMBL611725 |