
Table 2 Checker: total number of each penalty score output when the ChEMBL Literature set, the SureChEMBL set and the PubChem set were subjected to the Checker process

From: An open source chemical structure curation pipeline using RDKit

| Penalty score | Penalty explanation | SureChEMBL | ChEMBL Literature | PubChem |
|---|---|---|---|---|
| 7 | Error-9986 (Cannot process aromatic bonds) | 4 | 0 | 0 |
| 7 | Illegal input | 0 | 1 | 0 |
| 7 | InChI: Unknown element(s) | 3 | 0 | 1355 |
| 6 | All atoms have zero coordinates | 0 | 0 | 12 |
| 6 | InChI: Accepted unusual valence(s) | 73 | 1 | 2155 |
| 6 | InChI: Empty structure | 0 | 1 | 5824 |
| 6 | Molecule has 3D coordinates | 0 | 1 | 1024 |
| 6 | Molecule has a radical that is not found in the known list | 187 | 1 | 252 |
| 6 | Molecule has six (or more) atoms with exactly the same coordinates | 3 | 0 | 206 |
| 6 | Number of atoms less than 1 | 0 | 1 | 5824 |
| 6 | Polymer information in mol file | 2 | 0 | 0 |
| 6 | V3000 mol file | 594 | 0 | 0 |
| 5 | InChI_RDKit/Mol stereo mismatch | 588 | 152 | 339 |
| 5 | Mol/Inchi/RDKit stereo mismatch | 0 | 0 | 28 |
| 5 | RDKit_Mol/InChI stereo mismatch | 23 | 22 | 1479 |
| 5 | Molecule has a bond with an illegal stereo flag | 1054 | 0 | 0 |
| 5 | Molecule has a bond with an illegal type | 6 | 0 | 0 |
| 5 | Molecule has a crossed bond in a ring | 34 | 36 | 134 |
| 5 | Molecule has two (or more) atoms with exactly the same coordinates | 4 | 5 | 2367 |
| 2 | InChI_Mol/RDKit stereo mismatch | 0 | 55 | 307 |
| 2 | Molecule has a stereo bond in a ring | 2359 | 5763 | 7061 |
| 2 | Molecule has an atom with multiple stereo bonds | 1493 | 52 | 3660 |
| 2 | Molecule has a stereo bond to a stereocenter | 331 | 27 | 983 |
| 2 | Molecule has the 3D flag set for a 2D conformer | 0 | 0 | 5 |
| 2 | Other InChI Warnings | 20188 | 34052 | 170678 |
|  | No errors | 15015 | 111137 | 177815 |

  1. Note that the number of penalty scores output is not the same as the number of compounds, because some compounds return multiple penalty scores.
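The footnote's point can be illustrated with a minimal sketch of how such a tally might be built. Everything here is an assumption made for illustration: the `checker_output` mapping, the compound names, and the `tally_penalties` helper are hypothetical, the three toy compounds are not taken from any of the data sets, and counting a compound with no penalties as one "No errors" entry is a guess at how that row was produced.

```python
from collections import Counter

# Hypothetical Checker results: for each compound, a list of
# (penalty_score, penalty_explanation) pairs. Toy data for illustration only.
checker_output = {
    "cmpd_A": [
        (2, "Molecule has a stereo bond in a ring"),
        (2, "Other InChI Warnings"),
    ],
    "cmpd_B": [],  # no penalties reported
    "cmpd_C": [(6, "V3000 mol file")],
}


def tally_penalties(results):
    """Count how often each (score, explanation) pair occurs across compounds."""
    counts = Counter()
    for issues in results.values():
        if issues:
            # A compound contributes one count per penalty it returns.
            counts.update(issues)
        else:
            # Assumption: a clean compound is counted once as "No errors".
            counts[(None, "No errors")] += 1
    return counts


counts = tally_penalties(checker_output)
total_penalty_entries = sum(counts.values())
total_compounds = len(checker_output)
```

In this toy input, `cmpd_A` returns two penalties, so the tally holds more entries than there are compounds, which is why the column totals in the table need not match the size of each data set.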