Table 5 Databases used in this study

From: Structural diversity of biologically interesting datasets: a scaffold analysis approach

| Datasets | Database | Number of molecules | Clustered dataset | Reference |
|---|---|---|---|---|
| Drugs | DrugBank | 1372 | 3788 | [22] |
| | KEGG drugs | 7057 | | [23] |
| Metabolites | HMDB | 7888 | 6124, 2072* | [24] |
| | HumanCYC | 984 | | [25] |
| | BiGG | 730 | | [26] |
| Toxics | DSSTox | 582 | 2166 | [27] |
| | FDA Carcinogenicity | 125 | | [28] |
| | ITER | 514 | | [30] |
| | SuperToxic | 1097 | | [31] |
| NPs | ZINC NP database | 89425 | 61972 | [32] |
| Leads | BioNET | 42699 | 67983 | [33] |
| | Maybridge | 60550 | | [34] |
| NCI | NCI database | 260071 | 161336 | [39] |
| ChEMBL | ChEMBL dataset | 600625 | 379827 | [36] |

*Metabolite dataset excluding lipids and large molecules (details in the Methods section).