
Table 7 Reject and accept rules consecution for n-grams (n ≥ 3)

From: Terminology spectrum analysis of natural-language chemical documents: term-like phrases retrieval routine

Description

Examples

GeneralChemTermRule (accept rule) (the same rule as for 2-grams)

StrictFilteringTagRule (reject rule) (the same rule as for 2-grams)

ShortTokensRule (reject rule) (the same rule as for 2-grams)

IdenticalTokensRule (reject rule) (the same rule as for 2-grams)

UnitsRule (reject rule) (the same rule as for 2-grams)

ManyGramPOSRule (accept rule with exception)

True if the first token is tagged with one of the following POS tags (noun, gerund, adjective, adverb or participle):

NN, NNP, VBG, VBD, VBN, JJ, JJR, RB, RBS, FW

and every middle token, in any position, is tagged with one of the same tags or with a preposition or determiner tag:

NN, NNP, VBG, VBD, VBN, JJ, JJR, RB, RBS, FW + IN, DT

and the last token:

VBG, NN, NNP, NNPS, NNS (gerund or noun)

Exception: the following combinations are not allowed (they describe phrases that look as if they were torn from their context):

«first token: VBG -> second token: NN or IN»,

«first token: VBN -> second token: NN or JJ»

Term-like: X-ray fluorescence spectrometer; Brønsted basic site; Pd(110) surface oscillation; doping CsPW with platinum; catalyzed N2O decomposition; crystalline phase transition; catalyzed oxidation of NO; complete photoreduction of Pd(II); propagating thermosynthesis; reforming of the biomass; drying inside the microscope column

Filtered due to exception: used during steam reforming; catalyzed by metalloporphyrin; investigated by XRD; using atomic absorption
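
The ManyGramPOSRule check described in this table can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `many_gram_pos_rule`, the constant names, and the `forbidden_starts` parameter are hypothetical. The tag sets are copied from the table, and the exception pairs follow a literal reading of the arrow notation (first tag followed by second tag is rejected), which is an assumption; the pairs are therefore passed as a parameter so that a different reading can be substituted. The input is a sequence of POS tags that is assumed to have been produced already by a tagger.

```python
from typing import Sequence, Set, Tuple

# Tag sets copied from the table.
FIRST_TAGS = {"NN", "NNP", "VBG", "VBD", "VBN", "JJ", "JJR", "RB", "RBS", "FW"}
MIDDLE_TAGS = FIRST_TAGS | {"IN", "DT"}  # same tags plus preposition, determiner
LAST_TAGS = {"VBG", "NN", "NNP", "NNPS", "NNS"}  # gerund or noun

# Exception pairs (first tag, second tag), read literally from the table.
# Assumption: "first token: A -> second token: B or C" means that the
# combinations (A, B) and (A, C) are rejected.
FORBIDDEN_STARTS = {("VBG", "NN"), ("VBG", "IN"), ("VBN", "NN"), ("VBN", "JJ")}


def many_gram_pos_rule(
    tags: Sequence[str],
    forbidden_starts: Set[Tuple[str, str]] = FORBIDDEN_STARTS,
) -> bool:
    """Return True if the POS tag sequence of an n-gram (n >= 3) passes the rule."""
    if len(tags) < 3:
        # The table defines this rule only for n-grams with n >= 3.
        return False
    if tags[0] not in FIRST_TAGS:
        return False
    if tags[-1] not in LAST_TAGS:
        return False
    # Every token strictly between the first and the last is a middle token.
    if any(tag not in MIDDLE_TAGS for tag in tags[1:-1]):
        return False
    # Exception: reject the listed (first tag, second tag) combinations.
    return (tags[0], tags[1]) not in forbidden_starts


print(many_gram_pos_rule(["JJ", "NN", "NN"]))        # adjective noun noun
print(many_gram_pos_rule(["NN", "IN", "DT", "NN"]))  # noun prep. det. noun
print(many_gram_pos_rule(["JJ", "NN", "VBD"]))       # last tag not allowed
```

Because the result depends entirely on the tags supplied, the outcome for a real phrase depends on the tagger used and on how the exception pairs are interpreted; passing a different `forbidden_starts` set changes only the final check.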