From: LeadMine: a grammar and dictionary driven approach to entity recognition
Dictionary | Number of terms | Example | Construction methodology |
---|---|---|---|
Alloy | 206 | pig iron | Manually constructed |
Allotrope | 72 | red phosphorus | Manually constructed |
Common chemical abbreviation | 224 | TMEDA | Manually constructed |
Common trivial chemical name | 574 | adam's catalyst | Manually constructed |
Drug name | 11397 | vancomycin | Manually constructed |
Element | 227 | protactinium | Manually constructed |
Generic chemical class | 2254 | quaternary amine | Manually constructed |
Generic chemical class from ChEBI | 3917 | keto steroids | Derived from ChEBI [10] terms that are referenced in an "is a" relationship by another term |
Mineral | 5100 | paragonite | International Mineralogical Association names with some manual additions |
Non-structural chemical classes | 18 | bronsted lowry acid | Manually constructed. Terms that do not strictly convey structural information but were nonetheless annotated in the corpus |
One heavy atom substituent | 11 | methyl | Manually constructed |
Polymer | 531 | polyethylene glycol 8000 | Manually constructed. Biochemical polymers are blocked by another dictionary |
Wikipedia | 171 | beefy meaty peptide | Terms from Wikipedia chemboxes not matched by our other dictionaries |
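
Two of the construction methods in the table are mechanical enough to sketch in code: keeping ontology terms that another term references through an "is a" relationship (the ChEBI-derived dictionary), and keeping candidate terms that no other dictionary already matches (the Wikipedia dictionary). The following is a minimal illustration under stated assumptions, not LeadMine's implementation: the `Relation` type, both function names, and the toy data are hypothetical, and the exact-match lowercase comparison is a simplification of whatever matching the authors used.

```python
from typing import Iterable, NamedTuple


class Relation(NamedTuple):
    """One ontology edge (hypothetical representation, not ChEBI's file format)."""
    subject: str  # the more specific term
    kind: str     # relationship type, e.g. "is_a"
    target: str   # the term being referenced


def generic_class_terms(relations: Iterable[Relation]) -> set[str]:
    """Return terms that some other term references via an "is a" relationship."""
    return {
        r.target
        for r in relations
        if r.kind == "is_a" and r.subject != r.target
    }


def unmatched_terms(candidates: Iterable[str],
                    dictionaries: Iterable[set[str]]) -> set[str]:
    """Return candidates that are absent from every existing dictionary.

    Assumes dictionary entries are stored in lowercase and compared exactly.
    """
    known = set().union(*dictionaries)
    return {term for term in candidates if term.lower() not in known}


# Toy data for illustration only; not taken from ChEBI or Wikipedia.
toy_relations = [
    Relation("example steroid", "is_a", "keto steroids"),
    Relation("keto steroids", "is_a", "steroids"),
    Relation("example steroid", "has_role", "hormone"),
]
class_terms = generic_class_terms(toy_relations)

leftover = unmatched_terms(
    ["beefy meaty peptide", "methyl", "vancomycin"],
    [{"methyl"}, {"vancomycin"}],
)
```

With the toy input, `class_terms` contains only the two terms referenced by an "is a" edge, and `leftover` contains only the candidate that neither existing dictionary covers. A term referenced solely through another relationship type (here `has_role`) is not selected.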