Journal of Cheminformatics

Table 1 Nested CEM pairs in the development set of the CHEMDNER corpus.

From: A CRF-based system for recognizing chemical entity mentions (CEMs) in biomedical literature

ID	PMID	T/A	Outer start	Outer end	Nested start	Nested end
1	23064325	A	1138	1152	1138	1149
2	23353756	A	12	65	29	65
3	23425199	T	50	66	61	66
		A	56	72	67	72
4	23298577	A	365	381	378	381
5	23368735	A	83	103	97	103
		A	108	119	118	119
6	23562534	A	944	950	944	946
7	23288867	A	1625	1641	1625	1632
8	23500769	A	410	418	410	414
9	23435367	A	118	133	118	125
10	22401710	A	688	696	688	691
11	23350627	A	96	111	101	111
		A	117	130	122	130
		A	467	507	473	475
		A	467	507	482	483
12	23453838	A	467	507	504	507
		A	632	646	640	641
		A	767	782	773	774
		A	843	847	845	847
13	23401298	A	438	502	438	501
14	23567043	A	436	450	444	450
15	23425199	T	50	66	50	60
		A	56	72	56	66
16	22313530	A	306	364	307	364
17	23368735	A	83	103	87	93
		A	108	119	112	114
18	23229510	A	645	739	646	738
19	23562534	A	944	950	947	950
20	23294378	A	584	604	585	603
21	23295645	T	0	33	10	33
22	23435367	A	963	978	963	970
23	23350627	A	96	111	96	100
		A	117	130	117	121
24	23453838	A	467	507	469	471
		A	467	507	479	480
		A	467	507	495	502
		A	632	646	634	636
		A	767	782	769	771
		A	767	782	779	782
		A	909	913	911	913

In each row, the CEM whose offsets are given in columns 6-7 is nested within the CEM whose offsets are given in columns 4-5. T/A indicates whether the mentions occur in the title (T) or the abstract (A). The nested CEMs (columns 6-7) are omitted when training our CRF models.
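
The omission of nested CEMs described in the note above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Mention` type and the `is_nested` and `drop_nested` helpers are hypothetical names introduced here, and the sketch assumes that "nested" means the inner span lies within the outer span in the same document section (same PMID and same T/A value) without being identical to it. The sample offsets are taken from rows 1, 3 and 15 of the table.

```python
from typing import List, NamedTuple


class Mention(NamedTuple):
    """Hypothetical record for one annotated CEM (not from the paper)."""
    pmid: str
    section: str  # "T" (title) or "A" (abstract)
    start: int
    end: int


def is_nested(inner: Mention, outer: Mention) -> bool:
    """True if `inner` lies within `outer` in the same section.

    Assumption: identical spans are not treated as nested.
    """
    same_text = inner.pmid == outer.pmid and inner.section == outer.section
    contained = outer.start <= inner.start and inner.end <= outer.end
    identical = (inner.start, inner.end) == (outer.start, outer.end)
    return same_text and contained and not identical


def drop_nested(mentions: List[Mention]) -> List[Mention]:
    """Keep only mentions that are not nested in any other mention."""
    return [m for m in mentions
            if not any(is_nested(m, other) for other in mentions)]


# Offsets from rows 1, 3 and 15 of the table.
mentions = [
    Mention("23064325", "A", 1138, 1152),  # outer CEM, row 1
    Mention("23064325", "A", 1138, 1149),  # nested CEM, row 1
    Mention("23425199", "T", 50, 66),      # outer CEM, rows 3 and 15
    Mention("23425199", "T", 61, 66),      # nested CEM, row 3
    Mention("23425199", "T", 50, 60),      # nested CEM, row 15
]
kept = drop_nested(mentions)
```

Under these assumptions, `kept` holds only the two outer mentions. The pairwise comparison is quadratic in the number of mentions, which is acceptable for the handful of mentions found in a single title or abstract.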

ISSN: 1758-2946