Table 3 The 10 entities with the most false positives (FP) for each chemical NER tool on each of the four corpora

From: Recognizing chemicals in patents: a comparative analysis

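| CEMP_T ChemSpot | FP | CEMP_T tmChem | FP | CEMP_D ChemSpot | FP | CEMP_D tmChem | FP |
|---|---|---|---|---|---|---|---|
| Water | 951 | Sodium | 128 | Water | 842 | Sodium | 117 |
| Alkyl | 260 | Sugar | 66 | Alkyl | 194 | Nucleotide | 74 |
| Sodium | 186 | CH\(_2\) | 56 | Sodium | 194 | Ester | 49 |
| DEG | 155 | Sulfate | 43 | Peptide | 153 | Calcium | 49 |
| Peptide | 107 | NO | 42 | Chitosan | 130 | O | 46 |
| Chitosan | 91 | Solvate | 40 | DEG | 108 | NO | 45 |
| Starch | 81 | Alkyl | 39 | Parkinson | 80 | N | 44 |
| Calcium | 74 | Hydrogen | 38 | Calcium | 76 | Alkyl | 37 |
| Sulfate | 66 | Calcium | 35 | Nucleotide | 72 | Sulfate | 37 |
| Parkinson | 60 | Beta-cyclodextrin | 34 | Ester | 67 | Beta-cyclodextrin | 36 |

| Chapati ChemSpot | FP | Chapati tmChem | FP | BioS ChemSpot | FP | BioS tmChem | FP |
|---|---|---|---|---|---|---|---|
| Factor H | 121 | CO | 127 | Hydrogen | 6246 | Hydrogen | 6179 |
| Serine | 108 | Serine | 108 | 1H | 5034 | Carbon | 5518 |
| Alkyl | 81 | N | 88 | Carbon | 5004 | H | 3091 |
| Amino acid | 66 | NH–SO\(_2\) | 64 | 3H | 3928 | 1H | 2785 |
| SO\(_2\)–NR<21>R<22 | 62 | NH–CO–R<21 | 63 | Alkyl | 3812 | 3H | 2643 |
| CO–R<23 | 60 | Amino acid | 61 | 2H | 2946 | Nitrogen | 2619 |
| NH–CO–R<21 | 55 | Carbon | 57 | Nitrogen | 2878 | Silica | 1466 |
| Ci-I0 | 54 | Nitroxide | 52 | Silica | 2011 | CDCl3 | 1320 |
| CO–NR<21>R<22 | 53 | C | 51 | DMSO-d6 | 1652 | 2H | 1259 |
| Nitroxide | 52 | H | 46 | Oxygen | 1203 | Oxygen | 1110 |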
  1. Common mistakes are shown in italic
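
A tally like the one in this table could be produced along the following lines. This is only a minimal sketch, not the authors' evaluation code: the function name `top_false_positives`, the `(doc_id, start, end, text)` tuple layout, and the exact-span matching criterion are all assumptions made for illustration.

```python
from collections import Counter


def top_false_positives(predicted, gold, k=10):
    """Return the k most frequent false-positive mention texts.

    predicted, gold: iterables of (doc_id, start, end, text) tuples
    (a hypothetical layout). A prediction is counted as a false
    positive when its exact (doc_id, start, end) span does not
    appear among the gold annotations (assumed matching rule).
    """
    gold_spans = {(doc_id, start, end) for doc_id, start, end, _ in gold}
    counts = Counter(
        text
        for doc_id, start, end, text in predicted
        if (doc_id, start, end) not in gold_spans
    )
    # most_common sorts by descending count; ties keep first-seen order
    return counts.most_common(k)


# Toy data, invented for illustration only
gold = [("d1", 0, 7, "ethanol")]
predicted = [
    ("d1", 0, 7, "ethanol"),   # matches gold, so not a false positive
    ("d1", 20, 25, "water"),
    ("d2", 3, 8, "water"),
    ("d2", 10, 15, "alkyl"),
]
result = top_false_positives(predicted, gold)
print(result)
```

Running such a function once per tool and corpus would give one column pair of the table. A different matching rule (for example, partial overlap) would change the counts.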