Table 1 Nested CEM pairs in the development set of the CHEMDNER corpus.

From: A CRF-based system for recognizing chemical entity mentions (CEMs) in biomedical literature

ID PMID T/A Outer-Start Outer-End Nested-Start Nested-End
1 23064325 A 1138 1152 1138 1149
2 23353756 A 12 65 29 65
3 23425199 T 50 66 61 66
   A 56 72 67 72
4 23298577 A 365 381 378 381
5 23368735 A 83 103 97 103
   A 108 119 118 119
6 23562534 A 944 950 944 946
7 23288867 A 1625 1641 1625 1632
8 23500769 A 410 418 410 414
9 23435367 A 118 133 118 125
10 22401710 A 688 696 688 691
11 23350627 A 96 111 101 111
   A 117 130 122 130
12 23453838 A 467 507 473 475
   A 467 507 482 483
   A 467 507 504 507
   A 632 646 640 641
   A 767 782 773 774
   A 843 847 845 847
13 23401298 A 438 502 438 501
14 23567043 A 436 450 444 450
15 23425199 T 50 66 50 60
   A 56 72 56 66
16 22313530 A 306 364 307 364
17 23368735 A 83 103 87 93
   A 108 119 112 114
18 23229510 A 645 739 646 738
19 23562534 A 944 950 947 950
20 23294378 A 584 604 585 603
21 23295645 T 0 33 10 33
22 23435367 A 963 978 963 970
23 23350627 A 96 111 96 100
   A 117 130 117 121
24 23453838 A 467 507 469 471
   A 467 507 479 480
   A 467 507 495 502
   A 632 646 634 636
   A 767 782 769 771
   A 767 782 779 782
   A 909 913 911 913
  1. For each row, the CEM whose offsets appear in columns 6-7 is nested in the CEM whose offsets appear in columns 4-5. The nested CEMs (columns 6-7) are omitted when training our CRF models.
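
The filtering step described in the note can be sketched as follows. This is a minimal illustration, not the authors' code: the `Mention` record, `is_nested`, and `drop_nested` are hypothetical names, and the sketch assumes that two mentions are comparable only when they share the same PMID and the same text field (T or A). The sample offsets are taken from rows 1, 6, and 19 of the table.

```python
from collections import defaultdict
from typing import NamedTuple


class Mention(NamedTuple):
    """Hypothetical record for one CEM annotation."""
    pmid: str
    section: str  # value of the T/A column
    start: int
    end: int


def is_nested(inner: Mention, outer: Mention) -> bool:
    """True if inner lies inside outer in the same text and is strictly shorter."""
    return (
        inner.pmid == outer.pmid
        and inner.section == outer.section
        and outer.start <= inner.start
        and inner.end <= outer.end
        and (inner.end - inner.start) < (outer.end - outer.start)
    )


def drop_nested(mentions):
    """Return the mentions that are not nested in any other mention."""
    # Group by (PMID, section) so only mentions from the same text are compared.
    by_text = defaultdict(list)
    for m in mentions:
        by_text[(m.pmid, m.section)].append(m)

    kept = []
    for group in by_text.values():
        for m in group:
            if not any(is_nested(m, other) for other in group):
                kept.append(m)
    return kept


# Offsets from rows 1, 6, and 19 of the table (outer span first, then nested spans).
mentions = [
    Mention("23064325", "A", 1138, 1152),
    Mention("23064325", "A", 1138, 1149),
    Mention("23562534", "A", 944, 950),
    Mention("23562534", "A", 944, 946),
    Mention("23562534", "A", 947, 950),
]
kept = drop_nested(mentions)
print(kept)
```

With this input only the two outer mentions (1138-1152 and 944-950) remain. The pairwise comparison within each group is quadratic in the number of mentions per text, which is assumed to be acceptable because a single title or abstract contains few annotations.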