Fig. 2 | Journal of Cheminformatics

Fig. 2

From: A probabilistic molecular fingerprint for big data settings

Results of benchmarking hashing methods across 88 benchmark targets. Hashed molecular shingling with \(r = 2\) (orange, solid) and \(r = 3\) (orange, dashed) both ranked better than ECFP4/6 (green) and ECFP4/6* (purple) in AUC. However, only hashed molecular shingling with \(r = 3\) ranked better than all other fingerprints in every metric (AUC, EF1, EF5, BEDROC20, BEDROC100, RIE20, and RIE100). The control ECFP* (purple), a variant of ECFP that considers only atomic numbers as invariants, performed significantly worse than both hashed molecular shingling and ECFP. Pairwise post hoc Friedman tests of the average rank were performed as part of the benchmark; the resulting p values are shown in Additional file 1: Fig. S5.
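
The average rank that the caption compares can be illustrated with a small sketch. It assumes that each fingerprint receives one score per target for a given metric, that higher scores are better, and that tied fingerprints share the mean of the ranks they span. The function name `average_ranks` and the toy values are hypothetical and are not taken from the benchmark.

```python
from statistics import mean


def average_ranks(scores):
    """Return the mean rank of each fingerprint across targets.

    scores: {target: {fingerprint: metric value}}, higher is better.
    Rank 1 is best; tied fingerprints share the mean of their ranks.
    """
    per_fp = {}
    for by_fp in scores.values():
        # Sort fingerprints on this target from best to worst score.
        ordered = sorted(by_fp.items(), key=lambda kv: kv[1], reverse=True)
        i = 0
        while i < len(ordered):
            # Extend j to the end of the current run of tied scores.
            j = i
            while j + 1 < len(ordered) and ordered[j + 1][1] == ordered[i][1]:
                j += 1
            tied_rank = ((i + 1) + (j + 1)) / 2
            for name, _ in ordered[i:j + 1]:
                per_fp.setdefault(name, []).append(tied_rank)
            i = j + 1
    return {name: mean(ranks) for name, ranks in per_fp.items()}


# Toy values invented purely for illustration (not results from the paper).
toy_auc = {
    "target_a": {"fp_x": 0.9, "fp_y": 0.8, "fp_z": 0.7},
    "target_b": {"fp_x": 0.6, "fp_y": 0.8, "fp_z": 0.5},
}
ranks = average_ranks(toy_auc)
```

A lower average rank means a fingerprint tended to score better across targets. Significance testing of rank differences, such as the post hoc tests mentioned in the caption, is a separate step that this sketch does not cover.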
