Table 3 Fingerprint data set sizes in FPB format and largest chunk sizes

From: The chemfp project

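| Data set | #Bits | #Fingerprints (in millions) | FPB size (in MiB) | AREN size (in MiB) | FPID size (in MiB) | HASH size (in MiB) |
|---|---|---|---|---|---|---|
| chemfp benchmark | 166 | 1.00 | 54.0 | 22.9 | 15.9 | 15.3 |
| chemfp benchmark | 881 | 1.00 | 134 | 107 | 11.6 | 15.3 |
| chemfp benchmark | 1021 | 1.00 | 153 | 122 | 15.9 | 15.3 |
| chemfp benchmark | 2048 | 1.00 | 275 | 244 | 15.9 | 15.3 |
| ChEMBL 24 | 2048 | 1.82 | 501 | 444 | 29.9 | 27.8 |
| PubChem | 881 | 96.9 | 13,000 | 10,300 | 1130 | 1480 |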
  1. The AREN chunk contains the fingerprints, the FPID chunk contains the record identifiers indexed by position, and the HASH chunk contains a hash table mapping each identifier to its index.
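
The roles of the three chunks can be illustrated with a small in-memory analogue. This is only a sketch, not the FPB byte layout: the class `TinyArena` and the helpers `storage_bytes` and `aren_size_mib` are hypothetical names introduced here. The 8-byte alignment of each fingerprint is an assumption; it is used because it appears consistent with the AREN column (for example, about 22.9 MiB for one million 166-bit fingerprints).

```python
def storage_bytes(num_bits, alignment=8):
    """Bytes used per fingerprint, rounded up to a multiple of `alignment`.

    The 8-byte alignment is an assumption, not taken from the table.
    """
    num_bytes = (num_bits + 7) // 8
    return -(-num_bytes // alignment) * alignment  # ceiling division


def aren_size_mib(num_bits, num_fingerprints, alignment=8):
    """Estimated size in MiB of the fingerprint data alone."""
    return storage_bytes(num_bits, alignment) * num_fingerprints / 2**20


class TinyArena:
    """Toy analogue of the AREN, FPID and HASH chunks (hypothetical)."""

    def __init__(self, num_bits, records):
        fp_len = (num_bits + 7) // 8
        self.stride = storage_bytes(num_bits)
        self.aren = bytearray()  # AREN: fingerprints, stored back to back
        self.fpid = []           # FPID: identifier at each position
        self.hash = {}           # HASH: identifier -> position
        for index, (record_id, fp) in enumerate(records):
            if len(fp) != fp_len:
                raise ValueError("fingerprint has the wrong length")
            # Pad each fingerprint with zero bytes up to the aligned stride.
            self.aren += fp.ljust(self.stride, b"\0")
            self.fpid.append(record_id)
            self.hash[record_id] = index

    def fingerprint_at(self, index):
        """Look up a fingerprint by position."""
        start = index * self.stride
        return bytes(self.aren[start:start + self.stride])

    def fingerprint_for(self, record_id):
        """Look up a fingerprint by identifier, via the hash table."""
        return self.fingerprint_at(self.hash[record_id])


arena = TinyArena(16, [("CHEMBL1", b"\x01\x02"), ("CHEMBL2", b"\xff\x00")])
print(arena.fpid[1])                     # identifier stored at position 1
print(arena.fingerprint_for("CHEMBL2"))  # padded fingerprint for that id
print(round(aren_size_mib(166, 1_000_000), 1))
```

Keeping the identifiers in a separate position-indexed list means a search over the fingerprint data never has to touch identifier strings, and the hash table is only needed when a query starts from an identifier.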