Table 4 The top 10 entities with the highest number of false negatives (FN) for each chemical NER tool on the four corpora

From: Recognizing chemicals in patents: a comparative analysis

| CEMP_T: ChemSpot | CEMP_T: tmChem | CEMP_D: ChemSpot | CEMP_D: tmChem |
| --- | --- | --- | --- |
| H 227 | Alkyl 226 | H 233 | Alkyl 246 |
| Aryl 170 | Aryl 179 | Aryl 174 | Aryl 183 |
| C1-6 alkyl 115 | H 173 | Heterocyclic 133 | H 179 |
| Heteroaryl 82 | C1-6 alkyl 121 | Heteroaryl 87 | Heterocyclic 135 |
| Alkyl 74 | S 86 | N 76 | S 86 |
| N 71 | Cyano 85 | C1-6 alkyl 69 | C1-6 alkyl 76 |
| Alkoxy 67 | Heterocyclic 62 | Alkoxy 63 | Cyano 71 |
| Cyano 62 | Halo 55 | Alkyl 59 | N 56 |
| Heterocyclic 61 | Oligonucleotides 50 | Aromatic 51 | Halo 52 |
| Halo 51 | Opioid 50 | Cyano 45 | Aromatic 51 |

| Chapati: ChemSpot | Chapati: tmChem | BioS: ChemSpot | BioS: tmChem |
| --- | --- | --- | --- |
| Drug 234 | Water 264 | Alkyl 5295 | Alkyl 8145 |
| Ci-I0 alkyl 160 | Drug 234 | Aryl 4698 | Water 7142 |
| NR 139 | Peptide 205 | DMSO 3184 | Aryl 5426 |
| Insulin 107 | Ci-I0 alkyl 160 | Heteroaryl 2435 | Ph 1995 |
| Aptamer 92 | NR 139 | Alkoxy 1993 | H 1921 |
| Polypeptide 88 | Insulin 107 | H 1869 | Brine 1822 |
| SO2R 65 | CN 94 | Brine 1777 | DMSO 1490 |
| NH–CO–R 63 | Aptamer 92 | Inhibitors 1468 | Ethyl acetate 1473 |
| SO\(_2\)–NR 63 | Polypeptide 89 | Substituted 1447 | Inhibitors 1472 |
| NH–SO\(_2\)–R 62 | SO2R 65 | Lower alkyl 1422 | Substituted 1447 |

  1. Common mistakes are shown in italic