From: Structural diversity of biologically interesting datasets: a scaffold analysis approach
Dataset | Total no. of molecules (clustered dataset) | % of molecules failing Ro5 (clustered dataset) | % of molecules failing Ro5 (randomly selected subset) |
---|---|---|---|
Drugs | 3788 | 25.7 | 23.0 |
Metabolites | 6124 | 68.0 | 20.0* |
Toxics | 2166 | 26.5 | 21.5 |
NPs | 61972 | 16.2 | 15.0 |
Leads | 67983 | 19.8 | 19.5 |
NCI | 161336 | 19.5 | 15.5 |
ChEMBL | 379827 | 36.4 | 36.0 |