Table 4 The top 10 entities with the highest number of false negatives (FN) for each chemical NER tool on the four corpora

From: Recognizing chemicals in patents: a comparative analysis

Each cell gives an entity followed by its FN count.

CEMP_T, ChemSpot | CEMP_T, tmChem | CEMP_D, ChemSpot | CEMP_D, tmChem
H 227 | Alkyl 226 | H 233 | Alkyl 246
Aryl 170 | Aryl 179 | Aryl 174 | Aryl 183
C1-6 alkyl 115 | H 173 | Heterocyclic 133 | H 179
Heteroaryl 82 | C1-6 alkyl 121 | Heteroaryl 87 | Heterocyclic 135
Alkyl 74 | S 86 | N 76 | S 86
N 71 | Cyano 85 | C1-6 alkyl 69 | C1-6 alkyl 76
Alkoxy 67 | Heterocyclic 62 | Alkoxy 63 | Cyano 71
Cyano 62 | Halo 55 | Alkyl 59 | N 56
Heterocyclic 61 | Oligonucleotides 50 | Aromatic 51 | Halo 52
Halo 51 | Opioid 50 | Cyano 45 | Aromatic 51

Chapati, ChemSpot | Chapati, tmChem | BioS, ChemSpot | BioS, tmChem
Drug 234 | Water 264 | Alkyl 5295 | Alkyl 8145
Ci-I0 alkyl 160 | Drug 234 | Aryl 4698 | Water 7142
NR 139 | Peptide 205 | DMSO 3184 | Aryl 5426
Insulin 107 | Ci-I0 alkyl 160 | Heteroaryl 2435 | Ph 1995
Aptamer 92 | NR 139 | Alkoxy 1993 | H 1921
Polypeptide 88 | Insulin 107 | H 1869 | Brine 1822
SO2R 65 | CN 94 | Brine 1777 | DMSO 1490
NH–CO–R 63 | Aptamer 92 | Inhibitors 1468 | Ethyl acetate 1473
SO2–NR 63 | Polypeptide 89 | Substituted 1447 | Inhibitors 1472
NH–SO2–R 62 | SO2R 65 | Lower alkyl 1422 | Substituted 1447
  1. Common mistakes are shown in italic
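
A ranking like the one in this table can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name `top_false_negatives`, the `(doc_id, start, end, text)` tuple layout, and the exact-span matching rule are all assumptions made for the example.

```python
from collections import Counter


def top_false_negatives(gold, predicted, k=10):
    """Return the k most frequent gold entity strings that a tool missed.

    gold, predicted: iterables of (doc_id, start, end, text) tuples.
    Assumption: a gold mention counts as found only if the tool predicted
    exactly the same span (doc_id, start, end).
    """
    predicted_spans = {(doc, start, end) for doc, start, end, _ in predicted}
    missed = Counter(
        text
        for doc, start, end, text in gold
        if (doc, start, end) not in predicted_spans
    )
    return missed.most_common(k)


# Hypothetical toy annotations, not taken from any of the four corpora.
gold = [
    ("p1", 0, 5, "alkyl"),
    ("p1", 10, 14, "aryl"),
    ("p2", 3, 8, "alkyl"),
    ("p2", 20, 25, "water"),
]
predicted = [("p1", 10, 14, "aryl")]

print(top_false_negatives(gold, predicted))
```

Running the same function once per tool and per corpus would give one column of the table each; a relaxed (overlap-based) matching rule would produce different counts.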