Table 3 Nested Cross-validation MSE Performance on the Sutherland Data Sets

From: jCompoundMapper: An open source Java library and command-line tool for chemical fingerprints

| Encoding | ACE | ACHE | BZR | COX2 | DHFR | GPB | THERM | THR |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DFS | 1.73 ± 0.74 | 0.66 ± 0.28 | 0.61 ± 0.31 | 1.13 ± 0.30 | 0.57 ± 0.21 | 0.66 ± 0.54 | 2.10 ± 1.31 | 0.60 ± 0.38 |
| ASP | 1.70 ± 0.72 | 0.62 ± 0.26 | 0.53 ± 0.27 | 1.11 ± 0.30 | 0.58 ± 0.21 | 0.63 ± 0.48 | 2.09 ± 1.32 | 0.59 ± 0.38 |
| AP2D | 1.50 ± 0.70 | 0.85 ± 0.37 | 0.70 ± 0.37 | 1.03 ± 0.30 | 0.73 ± 0.29 | 0.61 ± 0.45 | 2.19 ± 1.20 | 0.50 ± 0.31 |
| AT2D | 1.57 ± 0.69 | 0.74 ± 0.34 | 0.69 ± 0.35 | 0.97 ± 0.27 | 0.66 ± 0.30 | 0.60 ± 0.47 | 1.97 ± 1.20 | 0.49 ± 0.32 |
| CATS2D | 1.76 ± 0.72 | 0.93 ± 0.33 | 0.89 ± 0.45 | 1.35 ± 0.43 | 0.69 ± 0.19 | 0.64 ± 0.45 | 2.28 ± 1.14 | 0.52 ± 0.32 |
| PHAP2PT2D | 1.77 ± 0.71 | 0.96 ± 0.33 | 0.91 ± 0.45 | 1.38 ± 0.44 | 0.72 ± 0.20 | 0.65 ± 0.48 | 2.18 ± 1.10 | 0.53 ± 0.31 |
| PHAP3PT2D | 1.81 ± 0.69 | 0.96 ± 0.33 | 0.82 ± 0.39 | 1.23 ± 0.41 | 0.67 ± 0.21 | 0.56 ± 0.49 | 1.89 ± 1.16 | 0.57 ± 0.37 |
| SHED | 2.08 ± 0.76 | 1.05 ± 0.50 | 1.09 ± 0.46 | 1.64 ± 0.48 | 1.49 ± 0.35 | 0.70 ± 0.33 | 2.71 ± 1.54 | 0.49 ± 0.28 |
| ECFP | 1.80 ± 0.77 | 0.72 ± 0.29 | 0.66 ± 0.32 | 1.01 ± 0.28 | 0.57 ± 0.20 | 0.68 ± 0.55 | 2.19 ± 1.36 | 0.51 ± 0.33 |
| RAD2D | 1.87 ± 0.75 | 0.77 ± 0.33 | 0.79 ± 0.37 | 1.08 ± 0.30 | 0.71 ± 0.27 | 0.72 ± 0.59 | 2.20 ± 1.33 | 0.50 ± 0.35 |
| LSTAR | 1.97 ± 0.79 | 0.72 ± 0.29 | 0.69 ± 0.30 | 1.04 ± 0.27 | 0.62 ± 0.19 | 0.76 ± 0.61 | 2.31 ± 1.39 | 0.50 ± 0.31 |
| AP3D | 1.60 ± 0.69 | 0.69 ± 0.32 | 0.59 ± 0.32 | 0.93 ± 0.27 | 0.67 ± 0.23 | 0.67 ± 0.51 | 2.73 ± 1.35 | 0.57 ± 0.33 |
| AT3D | 1.77 ± 0.68 | 0.64 ± 0.28 | 0.67 ± 0.36 | 0.99 ± 0.28 | 0.57 ± 0.18 | 0.74 ± 0.60 | 2.75 ± 1.37 | 0.60 ± 0.28 |
| CATS3D | 1.75 ± 0.70 | 0.90 ± 0.38 | 0.81 ± 0.36 | 1.31 ± 0.41 | 0.73 ± 0.20 | 0.79 ± 0.49 | 2.47 ± 1.32 | 0.62 ± 0.32 |
| PHAP2PT3D | 1.75 ± 0.70 | 0.87 ± 0.36 | 0.81 ± 0.40 | 1.32 ± 0.41 | 0.73 ± 0.20 | 0.83 ± 0.54 | 2.53 ± 1.30 | 0.65 ± 0.33 |
| PHAP3PT3D | 1.99 ± 0.77 | 0.82 ± 0.29 | 0.81 ± 0.36 | 1.14 ± 0.37 | 0.59 ± 0.17 | 0.86 ± 0.69 | 2.84 ± 1.46 | 0.69 ± 0.30 |
| RAD3D | 2.17 ± 0.78 | 0.78 ± 0.31 | 0.73 ± 0.35 | 1.10 ± 0.30 | 0.57 ± 0.17 | 0.82 ± 0.69 | 2.79 ± 1.37 | 0.58 ± 0.31 |
  1. Overview of the MSE performance of ϵ-support vector regression on the Sutherland data sets using the MinMax kernel function. The performance was evaluated using a nested 10-fold cross-validation repeated 20 times. Bold values indicate the best encodings according to the number of significantly better comparisons (p ≤ 0.05) against the other encodings. The p-values were determined by the corrected resampled t-test [32].
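
The two ingredients named in the note above, the MinMax kernel and the corrected resampled t-test, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `minmax_kernel` and `corrected_resampled_t` are hypothetical, fingerprints are assumed to be sparse `{feature: count}` dictionaries, and the test statistic follows the commonly cited form of the correction (variance inflated by the test/train size ratio). The sketch stops at the t statistic and its degrees of freedom; the p-value would then be read from a Student's t distribution.

```python
import math
from statistics import mean, variance


def minmax_kernel(x, y):
    """MinMax similarity of two sparse count vectors.

    x and y map feature identifiers to non-negative counts (an assumed
    representation). The value is sum(min) / sum(max) over all features.
    """
    keys = set(x) | set(y)
    numerator = sum(min(x.get(k, 0), y.get(k, 0)) for k in keys)
    denominator = sum(max(x.get(k, 0), y.get(k, 0)) for k in keys)
    # Convention chosen for this sketch: two empty vectors have similarity 0.
    return numerator / denominator if denominator else 0.0


def corrected_resampled_t(diffs, test_train_ratio):
    """Corrected resampled t statistic for paired per-run score differences.

    diffs holds one difference per train/test run (e.g. MSE of encoding A
    minus MSE of encoding B); test_train_ratio is n_test / n_train.
    Returns (t, degrees_of_freedom).
    """
    k = len(diffs)
    if k < 2:
        raise ValueError("need at least two runs")
    var = variance(diffs)  # sample variance, divisor k - 1
    if var == 0:
        raise ValueError("differences have zero variance")
    # The 1/k term alone gives the ordinary paired t-test; adding the
    # test/train ratio accounts for overlapping training sets.
    t = mean(diffs) / math.sqrt((1.0 / k + test_train_ratio) * var)
    return t, k - 1
```

Under the setup described in the note, a 10-fold cross-validation repeated 20 times would give 200 paired differences per pair of encodings and a test/train ratio of 1/9, assuming every outer fold is treated as one run.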