
Table 1 Nested CEM pairs in the development set of the CHEMDNER corpus.

From: A CRF-based system for recognizing chemical entity mentions (CEMs) in biomedical literature

   

                     Offset (containing CEM)   Offset (nested CEM)
ID   PMID       T/A  Start     End             Start     End
1    23064325   A    1138      1152            1138      1149
2    23353756   A    12        65              29        65
3    23425199   T    50        66              61        66
                A    56        72              67        72
4    23298577   A    365       381             378       381
5    23368735   A    83        103             97        103
                A    108       119             118       119
6    23562534   A    944       950             944       946
7    23288867   A    1625      1641            1625      1632
8    23500769   A    410       418             410       414
9    23435367   A    118       133             118       125
10   22401710   A    688       696             688       691
11   23350627   A    96        111             101       111
                A    117       130             122       130
                A    467       507             473       475
                A    467       507             482       483
12   23453838   A    467       507             504       507
                A    632       646             640       641
                A    767       782             773       774
                A    843       847             845       847
13   23401298   A    438       502             438       501
14   23567043   A    436       450             444       450
15   23425199   T    50        66              50        60
                A    56        72              56        66
16   22313530   A    306       364             307       364
17   23368735   A    83        103             87        93
                A    108       119             112       114
18   23229510   A    645       739             646       738
19   23562534   A    944       950             947       950
20   23294378   A    584       604             585       603
21   23295645   T    0         33              10        33
22   23435367   A    963       978             963       970
23   23350627   A    96        111             96        100
                A    117       130             117       121
24   23453838   A    467       507             469       471
                A    467       507             479       480
                A    467       507             495       502
                A    632       646             634       636
                A    767       782             769       771
                A    767       782             779       782
                A    909       913             911       913

  1. For each row, the CEM whose offsets are given in columns 6-7 is nested in the CEM whose offsets are given in columns 4-5. The nested CEMs (columns 6-7) are omitted when training our CRF models.
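
The omission described in the footnote can be illustrated with a minimal sketch. It is not the authors' code: the names `Mention`, `is_nested`, and `drop_nested` are hypothetical, the T/A value is assumed to mark the title or abstract of a record, and containment is assumed to mean that the nested span lies within the containing span's offsets in the same PMID and the same T/A section. The sample mentions are taken from rows 1, 3, and 15 of the table.

```python
from typing import List, NamedTuple


class Mention(NamedTuple):
    """One chemical entity mention (hypothetical record layout)."""
    pmid: str
    section: str  # T/A value from the table (assumed: title or abstract)
    start: int
    end: int


def is_nested(inner: Mention, outer: Mention) -> bool:
    """True if `inner` lies within `outer` in the same PMID and section."""
    return (
        inner != outer
        and inner.pmid == outer.pmid
        and inner.section == outer.section
        and outer.start <= inner.start
        and inner.end <= outer.end
    )


def drop_nested(mentions: List[Mention]) -> List[Mention]:
    """Keep only mentions that are not nested in another mention."""
    return [
        m for m in mentions
        if not any(is_nested(m, other) for other in mentions)
    ]


# Sample mentions from rows 1, 3, and 15 of the table.
mentions = [
    Mention("23064325", "A", 1138, 1152),
    Mention("23064325", "A", 1138, 1149),  # nested in the mention above
    Mention("23425199", "T", 50, 66),
    Mention("23425199", "T", 61, 66),      # nested in T 50-66
    Mention("23425199", "T", 50, 60),      # nested in T 50-66
]

kept = drop_nested(mentions)
for m in kept:
    print(m.pmid, m.section, m.start, m.end)
```

The pairwise check is quadratic in the number of mentions per document, which is acceptable at the scale of a single abstract; sorting spans by start offset would be one way to reduce that cost for larger inputs.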