Table 2 Chemical text corpora for evaluating and training the NER applications

| Corpus | Class of named entities | Reference | Availability |
| --- | --- | --- | --- |
| IUPAC training corpus | IUPAC names | [2] | |
| SCAI | All chemical names | [17] | |
| PubMed corpus | Compounds, reagents, chemical adjectives, enzymes, and prefixes | [18] | Not available |
| Sciborg corpus | All chemical names | [18] | Not available |
| GENIA corpus | Biological entities, plus some chemical entities | [19] | |
| European Patent Office and ChEBI | All chemical names | [20] | |
| CHEMDNER corpus | Chemical compounds and drugs | [21] | |