Table 3 Percentages of the compounds in each of the SureChEMBL, ChEMBL and PubChem sets returning each value as their maximum penalty score

From: An open source chemical structure curation pipeline using RDKit

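| Penalty score | SureChEMBL (%) | ChEMBL Literature (%) | PubChem (%) |
| --- | --- | --- | --- |
| 7 | 0.01 | 0.00 | 0.45 |
| 6 | 1.62 | 0.00 | 3.14 |
| 5 | 2.72 | 0.15 | 1.00 |
| 2 (non InChI) | 6.92 | 3.90 | 3.12 |
| 2 (InChI) | 28.77 | 20.35 | 32.59 |
| No errors | 59.95 | 75.60 | 59.70 |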

  1. For each compound, only the highest (most serious) penalty score returned is recorded.
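
The footnote's rule (keep only the most serious score per compound, then report each score as a percentage of the set) can be illustrated with a minimal Python sketch. This is not the pipeline's own code: the function names, the `(score, message)` tuple layout, the compound identifiers, and the use of `0` to stand for "No errors" are all assumptions made for illustration. The sketch also does not reproduce the table's split of score 2 into InChI and non-InChI issues.

```python
from collections import Counter


def max_penalty(issues):
    """Return the highest penalty score among a compound's issues.

    `issues` is a list of (score, message) tuples (a hypothetical layout).
    A compound with no issues gets 0, used here to mean "No errors".
    """
    return max((score for score, _message in issues), default=0)


def max_penalty_percentages(issues_by_compound):
    """Map each maximum penalty score to the percentage of compounds having it."""
    total = len(issues_by_compound)
    if total == 0:
        return {}
    counts = Counter(max_penalty(issues) for issues in issues_by_compound.values())
    return {score: 100.0 * count / total for score, count in counts.items()}


# Hypothetical compounds and placeholder messages, for illustration only.
example = {
    "cpd-1": [],
    "cpd-2": [(2, "minor issue"), (5, "more serious issue")],
    "cpd-3": [(2, "minor issue")],
    "cpd-4": [(6, "serious issue"), (2, "minor issue")],
}

percentages = max_penalty_percentages(example)
for score in sorted(percentages, reverse=True):
    label = "No errors" if score == 0 else str(score)
    print(f"{label}: {percentages[score]:.2f}")
```

Because each compound contributes exactly one value (its maximum), the percentages in each column of the table sum to roughly 100, up to rounding.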