Table 3 Mean performance of similarity methods.

From: Large scale study of multiple-molecule queries

| Method | AUC | F1 | BEDROC |
|---|---|---|---|
| MIN-RANK | 0.981265 ± 0.004540 | 0.480749 ± 0.019308 | 0.915781 ± 0.012475 |
| MAX-RANK | 0.633951 ± 0.038076 | 0.030289 ± 0.009587 | 0.209204 ± 0.049180 |
| SUM-RANK | 0.840620 ± 0.032520 | 0.128860 ± 0.032596 | 0.490227 ± 0.066911 |
| MAX-SIM | 0.973312 ± 0.005642 | 0.484504 ± 0.022397 | 0.893180 ± 0.015592 |
| MIN-SIM | 0.717104 ± 0.034874 | 0.053943 ± 0.015827 | 0.284041 ± 0.058423 |
| SUM-SIM | 0.914782 ± 0.018230 | 0.341373 ± 0.032341 | 0.719190 ± 0.041269 |
| NUMDEN-SIM | 0.907632 ± 0.019609 | 0.327810 ± 0.037178 | 0.696596 ± 0.044681 |
| BAYES | 0.910909 ± 0.017386 | 0.149837 ± 0.033785 | 0.581197 ± 0.050633 |
| BKD | 0.980763 ± 0.004517 | 0.501197 ± 0.019569 | 0.890840 ± 0.017745 |
| ETD | 0.987087 ± 0.002653 | 0.508081 ± 0.020886 | 0.922371 ± 0.011330 |
| TPD | 0.986616 ± 0.002649 | 0.451587 ± 0.025795 | 0.906017 ± 0.014098 |
| SUM-EH | 0.935054 ± 0.010774 | 0.296279 ± 0.032160 | 0.699456 ± 0.037583 |
| SUM-ET | 0.974831 ± 0.005798 | 0.491106 ± 0.022314 | 0.897401 ± 0.015760 |
| SUM-TP | 0.974963 ± 0.005751 | 0.490653 ± 0.022311 | 0.897621 ± 0.015771 |

The mean performance of similarity methods across the 24 data sets with the ChemDB background. Each measurement is given with its confidence interval (the ± term). The best performance in each column is listed in bold face, and all performances statistically indistinguishable from the best (a t-test yielding a p-value > 0.05) are listed in italics.
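As an illustration of how a "mean ± confidence interval" entry and the "statistically indistinguishable" marking could be derived from per-data-set scores, here is a minimal Python sketch. Several things in it are assumptions rather than details stated in the source: a 95% confidence level, a t-based interval, a paired t-test over the 24 data sets, and the function names `mean_with_ci` and `indistinguishable`, which are hypothetical. The scores used in the usage lines are made up and are not taken from the table.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 23 degrees of freedom
# (24 data sets - 1). Assumption: the source does not state the confidence
# level, so 95% is used here. This default is only valid for n = 24.
T_CRIT_DF23 = 2.0687


def mean_with_ci(scores, t_crit=T_CRIT_DF23):
    """Return (mean, half_width) of a t-based confidence interval."""
    n = len(scores)
    mean = statistics.fmean(scores)
    sem = statistics.stdev(scores) / math.sqrt(n)  # standard error of the mean
    return mean, t_crit * sem


def indistinguishable(scores_a, scores_b, t_crit=T_CRIT_DF23):
    """Paired t-test decision (assumed design, not confirmed by the source).

    Returns True when |t| <= t_crit, which is equivalent to a two-sided
    p-value > 0.05 for the given degrees of freedom.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    sd = statistics.stdev(diffs)
    if sd == 0.0:
        # All differences identical: only a zero difference is indistinguishable.
        return statistics.fmean(diffs) == 0.0
    t_stat = statistics.fmean(diffs) / (sd / math.sqrt(n))
    return abs(t_stat) <= t_crit


# Hypothetical per-data-set AUC values for one method (24 data sets).
example_scores = [0.98 + 0.001 * (i % 5 - 2) for i in range(24)]
mean, half_width = mean_with_ci(example_scores)
print(f"{mean:.6f} ± {half_width:.6f}")
```

Comparing the test statistic against a fixed critical value avoids needing a t-distribution CDF from an external library; with SciPy available, a p-value could be computed directly instead.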