Structural diversity of biologically interesting datasets: a scaffold analysis approach

Table 4 Scaffolds shared between pairs of clustered datasets.

Datasets	D	M	T	P	L	N	C
D	100%	123 (6%; D: 7%, M: 42%)	192 (7.5%; D: 10%, T: 21%)	347 (2.4%; D: 19%, P: 3%)	310 (1.4%; D: 17%, L: 1%)	840 (2%; D: 45%, N: 2%)	1347 (1.0%; D: 72%, C: 1%)
M		100%	71 (6.3%; M: 24%, T: 8%)	140 (1.1%; M: 47%, P: 1%)	68 (0.3%; M: 23%, L: 0.3%)	230 (0.5%; M: 78%, N: 0.5%)	215 (0.2%; M: 73%, C: 0.2%)
T			100%	174 (1.3%; T: 19%, P: 1%)	144 (0.7%; T: 16%, L: 1%)	534 (1.2%; T: 59%, N: 1%)	532 (0.4%; T: 59%, C: 0.4%)
P				100%	706 (2.1%; P: 5%, L: 3%)	1734 (3.1%; P: 13%, L: 8%)	1947 (1.4%; P: 15%, C: 1.5%)
L					100%	2753 (4.4%; L: 13%, N: 6%)	3470 (2.4%; L: 16%, C: 3%)
N						100%	7600 (5.0%; N: 17%, C: 6%)
C							100%

Each cell gives the number of scaffolds shared by the two datasets. The overall percentage of shared scaffolds is given in parentheses, followed by the percentage of each contributing dataset's scaffolds that are shared.
D: Drugs, M: Metabolites, T: Toxics, P: Natural Products, L: Leads, N: NCI, C: ChEMBL.
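
The three percentages in each cell can be illustrated with a minimal Python sketch. It assumes each dataset is represented as a collection of canonical scaffold strings, and it takes the overall percentage relative to the union of the two scaffold sets; that denominator is an assumption, since the caption does not state it. The function name `shared_scaffold_stats` and the toy scaffold strings are hypothetical and are not taken from the study.

```python
def shared_scaffold_stats(scaffolds_a, scaffolds_b):
    """Count scaffolds shared by two datasets and express the overlap as percentages.

    Assumption: the overall percentage is computed against the union of both
    scaffold sets; the per-dataset percentages are computed against each
    dataset's own number of unique scaffolds.
    """
    a, b = set(scaffolds_a), set(scaffolds_b)
    shared = a & b   # scaffolds present in both datasets
    union = a | b    # all unique scaffolds across the pair

    def pct(part, whole):
        # Guard against empty datasets to avoid division by zero.
        return 100.0 * part / whole if whole else 0.0

    return {
        "shared": len(shared),
        "overall_pct": pct(len(shared), len(union)),
        "pct_of_a": pct(len(shared), len(a)),
        "pct_of_b": pct(len(shared), len(b)),
    }


# Toy example with made-up scaffold strings (not data from Table 4).
dataset_a = ["c1ccccc1", "C1CCCCC1", "c1ccncc1", "c1ccoc1"]
dataset_b = ["c1ccccc1", "c1ccncc1", "C1CCNCC1"]
stats = shared_scaffold_stats(dataset_a, dataset_b)
print(stats)
```

In this toy case two scaffolds are shared out of five unique scaffolds overall, so the overlap is large relative to the smaller dataset and smaller relative to the larger one, which is the same asymmetry seen in the table when a small dataset is paired with a large one.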

ISSN: 1758-2946