Table 2 Rules for strict filtering procedure

From: Terminology spectrum analysis of natural-language chemical documents: term-like phrases retrieval routine

| No. | Rule | Examples |
| --- | --- | --- |
| 1 | SpecialSymbolsRule: True, if a token contains at least one of the special symbols different from: . -,/: () [] + = @ ® | SIZE(**), SELECTIVITY%, NIMG_650, H2S↔35SCAT, 1AUDAE_AM, ΔGADS, H0 −8.2 |
| 2 | StopListRule: True, if a token is in the stop list (Table 1) | LITERATURE, VIEWPOINT, PERCENT, PRESENT, IMPORTANCE, FUNDAMENTAL, CONCLUSION, TYPICALLY, EXAMPLE, INTRODUCTION |
| | Rules of regular expressions: True, if a token satisfies at least one of the regular expressions from the following list | |
| 3 | 4DigitRule: True, if a token contains four or more digits in succession | FQM-3994, RYC-2008-03387, 20000H-1, MAT2010-21147, CO(0001)-CARBIDE, CO(111)/CO(0001), RU(0001) ELECTRODE |
| 4 | 3DigitRule: True, if a token contains three digits in succession | 215KMTA, 220ML, 148H-1, CU2O(111), AU{111}-CEO2{100}, MGO/AG(100) |
| | 2DigitRule: True, if a token begins with one or two digits | 12C16O-13C16O, 31P{1H}, 2-PROPANOL, 2-METHYL-1-BUTENE, 3-METHYL-1,3-BUTADIENE, 15 %H3PW12O40/TIO2 |
| 5 | UnitsRule: True, if a token ends with a string from the dictionary of measurement units (Table 1) | KJMOL-1, MMOL.MIN-1, KJ.MOL-1, G.GZEOLITE-1.H-1, CM3.MIN-1.G-1 |
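
The rules in Table 2 can be illustrated with a short Python sketch. This is not the authors' implementation, only one way the stated conditions might be written down. The function names are illustrative, and `STOP_LIST` and `UNIT_SUFFIXES` are small hypothetical stand-ins for the stop list and units dictionary of Table 1, which is not reproduced here. Two further assumptions: tokens are already upper-cased, and any character that is not an ASCII letter, an ASCII digit, whitespace, or one of the symbols listed in rule 1 counts as "special".

```python
import re

# Symbols that rule 1 explicitly permits (taken from Table 2).
ALLOWED_SYMBOLS = set(".-,/:()[]+=@®")

# Hypothetical stand-ins for Table 1; the real lists are much longer.
STOP_LIST = {"LITERATURE", "VIEWPOINT", "PERCENT"}
UNIT_SUFFIXES = ("MOL-1", "MIN-1", "H-1", "G-1")


def special_symbols_rule(token):
    """True if the token has a character outside the permitted set (assumed definition)."""
    for ch in token:
        is_plain = ch.isascii() and ch.isalnum()
        if not (is_plain or ch.isspace() or ch in ALLOWED_SYMBOLS):
            return True
    return False


def stop_list_rule(token):
    """True if the token is in the stop list."""
    return token in STOP_LIST


def four_digit_rule(token):
    """True if the token contains four or more digits in succession."""
    return re.search(r"[0-9]{4,}", token) is not None


def three_digit_rule(token):
    """True if the token contains three digits in succession."""
    return re.search(r"[0-9]{3}", token) is not None


def two_digit_rule(token):
    """True if the token begins with one or two digits."""
    return re.match(r"[0-9]{1,2}", token) is not None


def units_rule(token):
    """True if the token ends with a string from the units dictionary."""
    return token.endswith(UNIT_SUFFIXES)


# Rules listed in the order they appear in Table 2.
RULES = [
    ("SpecialSymbolsRule", special_symbols_rule),
    ("StopListRule", stop_list_rule),
    ("4DigitRule", four_digit_rule),
    ("3DigitRule", three_digit_rule),
    ("2DigitRule", two_digit_rule),
    ("UnitsRule", units_rule),
]


def fired_rules(token):
    """Return the names of all rules that evaluate to True for the token."""
    return [name for name, rule in RULES if rule(token)]


def passes_strict_filter(token):
    """Assumed usage: a token is kept only if no rule fires."""
    return not fired_rules(token)
```

Under these assumptions, `fired_rules("2-PROPANOL")` reports only `2DigitRule`, while a token such as `FQM-3994` triggers both `4DigitRule` and `3DigitRule`, since any run of four digits also contains a run of three. Whether a token that triggers a rule is discarded or handled differently depends on the surrounding procedure; `passes_strict_filter` simply assumes that any match rejects the token.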