Table 2 Chemical text corpora for training and evaluating NER applications

From: Chemical named entities recognition: a review on approaches and applications

| Corpus | Class of named entities | Reference | Availability |
| --- | --- | --- | --- |
| IUPAC training corpus | IUPAC names | [2] | http://www.scai.fraunhofer.de/chem-corpora.html |
| SCAI | All chemical names | [17] | http://www.scai.fraunhofer.de/chem-corpora.html |
| PubMed corpus | Compounds, reagents, chemical adjectives, enzymes, and prefixes | [18] | Not available |
| Sciborg corpus | All chemical names | [18] | Not available |
| GENIA corpus | Biological entities and some chemical entities | [19] | http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA |
| European Patent Office and ChEBI | All chemical names | [20] | http://chebi.cvs.sourceforge.net/viewvc/chebi/chapati/patentsGoldStandard |
| CHEMDNER corpus | Chemical compounds and drugs | [21] | http://www.biocreative.org/tasks/biocreative-iv/chemdner/ |