Frequencies of the similarity scores (colored boxes, left axis) obtained for each molecule pair with the CDK-Fingerprint and with the MCS solutions of the SMSD algorithm, sorted into bins from 0 to 1 in increments of 0.01. The cumulative percentage of the overall dataset at each bin is also shown (curves, right axis). Data in blue correspond to the SMSD algorithm and data in lilac to the CDK-Fingerprint. The graph shows that the frequency distribution of the SMSD similarity scores differs from that of the fingerprint similarity scores for the same molecules. A good Tanimoto similarity cut-off for reporting significant matches seems to be above 0.77 (the 99.9th percentile of the cumulative curve) for both fingerprint-based and MCS-based searches (indicated by the rightmost set of dashed lines).
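The binning and cut-off reading described in this caption can be sketched as follows. This is a minimal illustration only, not the code used to produce the figure: the function names `bin_scores` and `cutoff_at_percentile` are hypothetical, NumPy is an assumed dependency, and the scores in the usage example are synthetic placeholders rather than the dataset shown here.

```python
import numpy as np


def bin_scores(scores, width=0.01):
    """Count scores per bin on [0, 1]; return (edges, counts, cumulative_pct)."""
    scores = np.asarray(scores, dtype=float)
    if scores.size == 0:
        raise ValueError("scores must not be empty")
    n_bins = int(round(1.0 / width))
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    counts, _ = np.histogram(scores, bins=edges)
    # Cumulative share of the dataset covered up to and including each bin.
    cumulative_pct = 100.0 * np.cumsum(counts) / counts.sum()
    return edges, counts, cumulative_pct


def cutoff_at_percentile(edges, cumulative_pct, pct=99.9):
    """Upper edge of the first bin whose cumulative percentage reaches pct."""
    idx = int(np.searchsorted(cumulative_pct, pct, side="left"))
    return float(edges[idx + 1])


if __name__ == "__main__":
    # Synthetic placeholder scores (assumption): NOT the data behind the figure.
    rng = np.random.default_rng(0)
    placeholder_scores = rng.beta(2, 8, size=10_000)
    edges, counts, cumulative_pct = bin_scores(placeholder_scores)
    print("cut-off at 99.9th percentile:", cutoff_at_percentile(edges, cumulative_pct))
```

Running the same two functions separately on the fingerprint scores and on the MCS scores would give one frequency histogram and one cumulative curve per method, which is the kind of comparison the figure presents.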