Table 4 Comparison of validation issues found in the DrugBank and ChEMBL datasets

From: The Chemical Validation and Standardization Platform (CVSP): large-scale automated validation of chemical structure datasets

 

| Issue | DrugBank | ChEMBL | Examples |
|---|---|---|---|
| Errors | | | |
| Query bonds | 2 | 0 | DB00115 |
| Stereocenters: stereotypes of non-opposite bonds match | 1 | 292 | DB08128, CHEMBL1183153, CHEMBL1971333 |
| Stereocenters: stereotypes of opposite bonds mismatch | 2 | 2542 | DB00877, CHEMBL1237110 |
| Stereocenters: one bond up, one down | 1 | 182 | DB01590, CHEMBL552998, CHEMBL1237113 |
| Stereocenters: implicit hydrogen near stereocenter | 1 | 1 | DB00910, CHEMBL2314995 |
| Non-unique dearomatization | 57 | 0 | DB01705 |
| Unknown atom symbol (“A”, “*” - polymers) | 3 | 0 | DB01344 |
| Bad Valence (Indigo) | 1 | 0 | DB01747 |
| InChI generation failed | 4 | 2 | DB03846, CHEMBL1770360 |
| Warnings | | | |
| InChI does not match structure | 36 | N/A | DB00162 |
| Name does not match structure | 24 | N/A | DB08346 |
| SMILES does not match structure | 48 | N/A | DB00520 |
| Contains only multiple instances of same molecule | 0 | 25 | CHEMBL607305 |
| Not a neutral system | 314 | 14337 | DB00118, CHEMBL13045 |
| Angle between bonds too small | 2 | 164 | DB00362, CHEMBL59973 |
| Free carbon monoxide | 0 | 5 | CHEMBL108869 |
| Unusual valence | 49 | 119 | DB01703, DB03492, CHEMBL2028143, CHEMBL2028140 |
| Relative stereo (wedge or hash bonds but no chiral flag in molfile) | 1183 | 151203 | DB00140, CHEMBL1801886 |
| More than one radical atom | 2 | 4 | DB04119, CHEMBL606910 |
| Information | | | |
| Contains enol function | 64 | 11898 | DB00554, CHEMBL62289 |
| Stereobond in ring | 4 | 943 | DB00877, CHEMBL1864961, CHEMBL1864961 |
| Contain unknown stereobond | 32 | 23451 | DB00162, CHEMBL1866933 |
| Contain metal-nitrogen bond | 25 | 60 | DB02003, CHEMBL611725 |
| Contain partially undefined stereo | 24 | 26862 | DB00462, CHEMBL63248 |
| Strongest acid not ionized first | 3 | 164 | DB04798, CHEMBL8056 |
| Contains L-pyranose | 185 | 5887 | DB00199, CHEMBL66563 |
| Contains metal-oxygen bond | 32 | | DB00526, CHEMBL611725 |