Table 3 Performance of median-based consensus classifiers. Errors are absolute (unsigned) and are measured in log S units.

From: Can human experts predict solubility better than computers?

| Compound | ML error | Human error | Difference |
| --- | --- | --- | --- |
| 4-Aminobenzoic acid | 0.07 | 0.13 | −0.06 |
| 4-Aminosalicylic acid | 0.23 | 0.76 | −0.53 |
| Antipyrine | 3.73 | 2.98 | 0.75 |
| Chloramphenicol | 0.35 | 0.39 | −0.04 |
| Corticosterone | 0.11 | 0.06 | 0.05 |
| Dapsone | 0.54 | 0.29 | 0.25 |
| Primidone | 0.06 | 0.14 | −0.08 |
| Estrone | 0.87 | 0.82 | 0.05 |
| Alclofenac | 0.30 | 0.12 | 0.18 |
| 5-Fluorouracil | 0.46 | 0.62 | −0.16 |
| Griseofulvin | 0.44 | 0.25 | 0.19 |
| Fluometuron | 0.53 | 0.04 | 0.49 |
| Fluconazole | 1.09 | 0.70 | 0.39 |
| Khellin | 0.17 | 0.98 | −0.81 |
| Clozapine | 1.37 | 0.71 | 0.66 |
| Norethisterone | 0.63 | 0.63 | 0.00 |
| Nicotinic acid | 0.58 | 0.35 | 0.23 |
| Perphenazine | 0.16 | 0.16 | 0.00 |
| Pteridine | 2.22 | 3.02 | −0.80 |
| Salicylamide | 0.23 | 0.49 | −0.26 |
| Sulfanilamide | 0.54 | 0.14 | 0.40 |
| Gliclazide | 1.03 | 0.80 | 0.23 |
| Trihexyphenidyl | 1.98 | 1.45 | 0.53 |
| Triphenylene | 0.15 | 0.27 | −0.12 |
| Mifepristone | 1.57 | 2.00 | −0.43 |
| Average | 0.778 | 0.732 | 0.046 |
  1. The difference is signed (ML error minus human error): a positive value means the human median-based classifier performed better on that compound, and a negative value means the machine learning median-based classifier performed better.
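
The sign convention in the note can be illustrated with a minimal Python sketch. It assumes, consistent with the tabulated rows, that the difference is the ML error minus the human error. The function names `signed_difference` and `better_predictor` are hypothetical and are not taken from the article; the four rows are copied from the table as a small illustrative subset.

```python
# A few rows copied from Table 3: (compound, ML error, human error), in log S units.
rows = [
    ("4-Aminobenzoic acid", 0.07, 0.13),
    ("Antipyrine", 3.73, 2.98),
    ("Norethisterone", 0.63, 0.63),
    ("Khellin", 0.17, 0.98),
]


def signed_difference(ml_error, human_error):
    """Return ML error minus human error, rounded to two decimals.

    Assumption: this matches the 'Difference' column of the table.
    """
    return round(ml_error - human_error, 2)


def better_predictor(ml_error, human_error):
    """Name the classifier with the smaller absolute error on a compound."""
    diff = signed_difference(ml_error, human_error)
    if diff > 0:
        return "human"  # positive difference: human median was closer
    if diff < 0:
        return "ML"  # negative difference: ML median was closer
    return "tie"


for compound, ml_error, human_error in rows:
    diff = signed_difference(ml_error, human_error)
    winner = better_predictor(ml_error, human_error)
    print(f"{compound}: difference = {diff:+.2f}, better = {winner}")
```

Because the published errors are already rounded to two decimals, averages recomputed from them may differ slightly in the last digit from the tabulated averages.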