Table 14 Distribution (according to chemical subtype) of the instances incorrectly rejected by the model trained with enriched features.

From: Optimising chemical named entity recognition with pre-processing analytics, knowledge-rich features and heuristics

Subtype        Frequency    Percentage
Abbreviation       1,882        30.32%
Formula            1,291        20.80%
Family               979        15.77%
Trivial              926        14.92%
Systematic           693        11.16%
Identifier           293         4.72%
Multiple             118         1.90%
No class              25         0.40%