Table 3. Error analysis of a random sample of at most 25 false negatives from each class for ChemSpider (CS) and Chemlist (CL).

From: Automatic vs. manual curation of a multi-source chemical dictionary: the impact on text mining

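| Error type | TRIV CS | TRIV CL | SUM CS | SUM CL | IUPAC CS | IUPAC CL | FAM CS | FAM CL | ABB CS | ABB CL |
|---|---|---|---|---|---|---|---|---|---|---|
| Partial match | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Annotation error | 0 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| Not in dictionary | 25 | 15 | 22 | 16 | 25 | 24 | 25 | 24 | 25 | 8 |
| Removed by disambiguation | 0 | 5 | 0 | 7 | 0 | 0 | 0 | 1 | 0 | 12 |
| Removed by manual check of highly frequent terms | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 2 |
| Tokenization error | 0 | 0 | 3 | 1 | 0 | 0 | 0 | 0 | 0 | 3 |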