Table 2 Fingerprint target data set sizes in FPS format

From: The chemfp project

| Data set | #Bits | Fingerprint type | #Fingerprints (in millions) | Unique (%) | FPS size (in MiB) | FPS.gz size (in MiB) |
|---|---|---|---|---|---|---|
| chemfp benchmark, ChEMBL 23 subset | 166 | OpenEye MACCS | 1.00 | 83.6 | 54 | 17.7 |
| chemfp benchmark, PubChem subset | 881 | PubChem/CACTVS | 1.00 | 98.2 | 222 | 53.1 |
| chemfp benchmark, ChEMBL 23 subset | 1021 | Open Babel FP2 | 1.00 | 96.0 | 258 | 80.5 |
| chemfp benchmark, ChEMBL 23 subset | 2048 | RDKit Morgan | 1.00 | 90.6 | 502 | 59.9 |
| ChEMBL 24 | 2048 | RDKit Morgan | 1.82 | 94.1 | 914 | 99.7 |
| PubChem | 881 | PubChem/CACTVS | 96.9 | 65.3 | 21,500 | 2910 |
  1. “Unique” is the number of distinct fingerprints as a percentage of the total number of fingerprints
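
The "Unique" percentage defined in the note can be illustrated with a minimal Python sketch. It assumes the usual FPS text layout (header lines starting with `#`, then one record per line with the hex-encoded fingerprint, a tab, and the identifier). The function name `unique_percentage` and the sample records are hypothetical and are not taken from the table's data sets.

```python
def unique_percentage(fps_lines):
    """Return distinct fingerprints as a percentage of all fingerprint records.

    Assumes each record is '<hex fingerprint><TAB><identifier>' and that
    header lines start with '#', as in FPS files.
    """
    total = 0
    distinct = set()
    for line in fps_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        hex_fp = line.split("\t", 1)[0]
        total += 1
        distinct.add(hex_fp.lower())  # hex digits compare case-insensitively
    if total == 0:
        raise ValueError("no fingerprint records found")
    return 100.0 * len(distinct) / total


# Hypothetical four-record example: two records share the same fingerprint.
sample = [
    "#FPS1\n",
    "#num_bits=16\n",
    "0a1f\tid1\n",
    "0a1f\tid2\n",
    "ffff\tid3\n",
    "0001\tid4\n",
]
print(unique_percentage(sample))  # 3 distinct out of 4 records -> 75.0
```

For a real file, the same function could be fed an open file handle, since it only iterates over lines. For data sets at the scale of the full PubChem row, holding every distinct fingerprint in a set may need a lot of memory.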