Table 2 Chemical text corpora for evaluating and training the NER applications

From: Chemical named entities recognition: a review on approaches and applications

| Corpus | Class of named entities | Reference | Availability |
| --- | --- | --- | --- |
| IUPAC training corpus | IUPAC names | [2] | http://www.scai.fraunhofer.de/chem-corpora.html |
| SCAI | All chemical names | [17] | http://www.scai.fraunhofer.de/chem-corpora.html |
| PubMed corpus | Compounds, reagents, chemical adjectives, enzymes and prefixes | [18] | Not available |
| Sciborg corpus | All chemical names | [18] | Not available |
| GENIA corpus | Biological entities, plus some chemical entities | [19] | http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA |
| European Patent Office and ChEBI | All chemical names | [20] | http://chebi.cvs.sourceforge.net/viewvc/chebi/chapati/patentsGoldStandard |
| CHEMDNER corpus | Chemical compounds and drugs | [21] | http://www.biocreative.org/tasks/biocreative-iv/chemdner/ |