Table 3 The top 10 entities with the highest false-positive (FP) counts for each chemical NER tool on the four corpora

From: Recognizing chemicals in patents: a comparative analysis

| CEMP_T: ChemSpot | CEMP_T: tmChem | CEMP_D: ChemSpot | CEMP_D: tmChem |
| --- | --- | --- | --- |
| Water 951 | Sodium 128 | Water 842 | Sodium 117 |
| Alkyl 260 | Sugar 66 | Alkyl 194 | Nucleotide 74 |
| Sodium 186 | CH\(_2\) 56 | Sodium 194 | Ester 49 |
| DEG 155 | Sulfate 43 | Peptide 153 | Calcium 49 |
| Peptide 107 | NO 42 | Chitosan 130 | O 46 |
| Chitosan 91 | Solvate 40 | DEG 108 | NO 45 |
| Starch 81 | Alkyl 39 | Parkinson 80 | N 44 |
| Calcium 74 | Hydrogen 38 | Calcium 76 | Alkyl 37 |
| Sulfate 66 | Calcium 35 | Nucleotide 72 | Sulfate 37 |
| Parkinson 60 | Beta-cyclodextrin 34 | Ester 67 | Beta-cyclodextrin 36 |

| Chapati: ChemSpot | Chapati: tmChem | BioS: ChemSpot | BioS: tmChem |
| --- | --- | --- | --- |
| Factor H 121 | CO 127 | Hydrogen 6246 | Hydrogen 6179 |
| Serine 108 | Serine 108 | 1H 5034 | Carbon 5518 |
| Alkyl 81 | N 88 | Carbon 5004 | H 3091 |
| Amino acid 66 | NH–SO\(_2\) 64 | 3H 3928 | 1H 2785 |
| SO\(_2\)–NR<21>R<22 62 | NH–CO–R<21 63 | Alkyl 3812 | 3H 2643 |
| CO–R<23 60 | Amino acid 61 | 2H 2946 | Nitrogen 2619 |
| NH–CO–R<21 55 | Carbon 57 | Nitrogen 2878 | Silica 1466 |
| Ci-I0 54 | Nitroxide 52 | Silica 2011 | CDCl3 1320 |
| CO–NR<21>R<22 53 | C 51 | DMSO-d6 1652 | 2H 1259 |
| Nitroxide 52 | H 46 | Oxygen 1203 | Oxygen 1110 |

  1. Common mistakes are shown in italics
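
A ranking like the one in this table could be produced by counting, per tool, the predicted mentions that have no counterpart in the gold-standard annotations. The following is a minimal sketch and not the authors' evaluation code. The function name `top_false_positives`, the tuple layouts, the exact-span matching criterion, and the lowercasing of entity strings are all assumptions made for illustration.

```python
from collections import Counter


def top_false_positives(predicted, gold, k=10):
    """Return the k entity strings most often predicted without a gold match.

    predicted: iterable of (doc_id, start, end, text) tuples from one NER tool.
    gold:      iterable of (doc_id, start, end) tuples from the gold standard.

    Assumption: a prediction counts as a false positive unless its span
    matches a gold span exactly. Entity strings are lowercased before
    counting (also an assumption) so that "Water" and "water" are merged.
    """
    gold_spans = set(gold)
    counts = Counter(
        text.lower()
        for doc_id, start, end, text in predicted
        if (doc_id, start, end) not in gold_spans
    )
    return counts.most_common(k)


# Toy data, invented purely for illustration.
gold = [("p1", 0, 7), ("p2", 10, 17)]
predicted = [
    ("p1", 0, 7, "aspirin"),   # exact match with gold: true positive
    ("p1", 20, 25, "water"),   # no gold span: false positive
    ("p2", 3, 8, "Water"),     # no gold span: false positive
    ("p2", 30, 36, "sodium"),  # no gold span: false positive
]

print(top_false_positives(predicted, gold))
# → [('water', 2), ('sodium', 1)]
```

A relaxed (overlap-based) matching criterion would give different counts, so the matching rule should be chosen to mirror whatever evaluation protocol is being reproduced.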