From: Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations?

Effect of data pretreatment for the three-way ANOVA (sigma-restricted parameterization). The plots show how SRD values change across different combinations of the factors. The data scaling methods are on the x-axis; the selection method was (A) random draw or (B) diversity picking. With random draw, substructure similarities produce significantly higher SRD values for the ranking of fragment-like compounds than for bigger molecules. With diversity-picked molecules, in contrast, Euclidean (and also Manhattan) similarities tend to produce higher SRD values (i.e. deviate more from the consensus) as the size of the molecules increases. Weighted means were used to create the plot. The vertical bars denote 0.95 confidence intervals. (Manhattan and Soergel similarities were omitted for clarity.)

