Table 14 Distribution (according to chemical subtype) of the instances incorrectly rejected by the model trained with enriched features.

From: Optimising chemical named entity recognition with pre-processing analytics, knowledge-rich features and heuristics

| Subtype | Frequency | Percentage |
| --- | --- | --- |
| Abbreviation | 1,882 | 30.32% |
| Formula | 1,291 | 20.80% |
| Family | 979 | 15.77% |
| Trivial | 926 | 14.92% |
| Systematic | 693 | 11.16% |
| Identifier | 293 | 4.72% |
| Multiple | 118 | 1.90% |
| No class | 25 | 0.40% |
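
The percentage column appears to be each subtype's frequency divided by the sum of all frequencies. The table does not state that total; adding the eight rows gives 6,207. A minimal sketch that recomputes the column under this assumption (the variable names are illustrative and do not come from the paper):

```python
# Frequencies copied from Table 14. The names below are illustrative only.
frequencies = {
    "Abbreviation": 1882,
    "Formula": 1291,
    "Family": 979,
    "Trivial": 926,
    "Systematic": 693,
    "Identifier": 293,
    "Multiple": 118,
    "No class": 25,
}

# Assumption: the denominator is the sum of the listed frequencies.
total = sum(frequencies.values())

# Share of each subtype among all incorrectly rejected instances, in percent.
percentages = {
    subtype: round(100 * count / total, 2)
    for subtype, count in frequencies.items()
}

for subtype, count in frequencies.items():
    print(f"{subtype:<12} {count:>5,} {percentages[subtype]:6.2f}%")
```

Rounded to two decimals, the recomputed shares match the percentages reported in the table.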