Table 4 Nearest neighbor analysis of the human metabolome database

From: One molecular fingerprint to rule them all: drugs, biomolecules, and the metabolome

| HMDB subset | All | OH = 0 | OH = 1 | 1 < OH ≤ 4 | OH > 4 |
|---|---|---|---|---|---|
| All | 96,456 | 33,721 | 10,663 | 41,493 | 10,579 |
| JD (MAP4-1024) = 0 | 0 | 0 | 0 | 0 | 0 |
| JD (AP) = 0 | 1677 | 13 | 35 | 1611 | 18 |
| JD (TT) = 0 | 68,623 | 27,897 | 5782 | 32,909 | 2035 |
| JD (MHFP6-1024) = 0 | 69,972 | 28,502 | 6215 | 33,359 | 1996 |
| JD (ECFP4-1024) = 0 | 70,329 | 28,561 | 6243 | 33,294 | 2231 |

  1. Subsets of the Human Metabolome Database 4.0 (HMDB), defined by the number of hydroxyl groups per molecule, which separates lipids (OH = 0, 1) from carbohydrate derivatives (OH > 4). For each subset (column), the table gives the total number of molecules (row "All") and the number of molecules with an indistinguishable nearest neighbor (Jaccard distance JD = 0) according to the indicated fingerprint (the JD = 0 rows). Stereochemical information was removed before the analysis.
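
The counting described in the footnote can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each fingerprint is represented as a set of feature identifiers (for example, the on-bits of a folded fingerprint) and that the nearest neighbor is searched within the list passed in; the function names `jaccard_distance`, `nearest_neighbor_distance`, `count_indistinguishable`, and `oh_subset` are hypothetical. Under this set representation, a molecule has a nearest neighbor at JD = 0 exactly when at least one other molecule has an identical fingerprint, so the count reduces to counting duplicated fingerprints.

```python
from collections import Counter


def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """Jaccard distance 1 - |A ∩ B| / |A ∪ B| between two feature sets."""
    union = a | b
    if not union:
        # Assumption: two empty fingerprints are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / len(union)


def nearest_neighbor_distance(i: int, fingerprints: list) -> float:
    """Brute-force distance from molecule i to its nearest neighbor."""
    return min(
        jaccard_distance(fingerprints[i], fp)
        for j, fp in enumerate(fingerprints)
        if j != i
    )


def count_indistinguishable(fingerprints: list) -> int:
    """Count molecules whose fingerprint is shared with at least one other molecule.

    This equals the number of molecules with a nearest neighbor at JD = 0.
    """
    counts = Counter(fingerprints)
    return sum(n for n in counts.values() if n > 1)


def oh_subset(n_oh: int) -> str:
    """Assign a molecule to a table column from its hydroxyl group count."""
    if n_oh == 0:
        return "OH = 0"
    if n_oh == 1:
        return "OH = 1"
    if n_oh <= 4:
        return "1 < OH ≤ 4"
    return "OH > 4"


# Toy data (invented for illustration, not taken from HMDB).
toy = [frozenset({1, 2}), frozenset({1, 2}), frozenset({3}), frozenset({1, 2, 3})]
n_zero = count_indistinguishable(toy)
```

In this toy list the first two fingerprints are identical, so two molecules have a nearest neighbor at JD = 0. The duplicate count avoids the quadratic all-pairs search that `nearest_neighbor_distance` performs, which matters for a database of roughly 96,000 molecules.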