Table 1 Summary of 2D model performance

From: Use of structure-activity landscape index curves and curve integrals to evaluate the performance of multiple machine learning prediction models

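| Model | ANNE | SVM | MLR | KPLS | RF | PLS | ANNE AZ | ANNE Random |
|---|---|---|---|---|---|---|---|---|
| Training set | | | | | | | | |
| MAE | 0.19 | 0.22 | 0.20 | 0.19 | 0.08 | 0.22 | 0.20 | 0.19 |
| Kendall τ | 0.63 | 0.58 | 0.61 | 0.63 | 0.86 | 0.60 | 0.60 | 0.62 |
| SCI | 0.12 | 0.20 | 0.90 | 0.48 | 0.94 | 0.12 | 0.17 | -0.13 |
| S(0) | 0.63 | 0.57 | 0.60 | 0.62 | 0.86 | 0.58 | 0.59 | 0.62 |
| S(1) | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | -1.00 |
| Test set | | | | | | | | |
| MAE | 0.22 | 0.25 | 0.23 | 0.24 | 0.22 | 0.25 | 0.21 | 0.22 |
| Kendall τ | 0.51 | 0.45 | 0.48 | 0.48 | 0.51 | 0.45 | 0.53 | 0.56 |
| SCI | 0.83 | 0.93 | 0.93 | 0.94 | 0.94 | 0.75 | 0.96 | -0.67 |
| S(0) | 0.52 | 0.46 | 0.50 | 0.50 | 0.52 | 0.46 | 0.54 | 0.56 |
| S(1) | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | -1.00 |
| Prospective set | | | | | | | | |
| MAE | 0.36 | 0.32 | 0.52 | 0.19 | * | 0.33 | 0.36 | 0.35 |
| Kendall τ | 0.34 | 0.36 | 0.14 | 0.37 | * | 0.32 | 0.36 | 0.33 |
| SCI | 0.98 | 0.98 | 0.72 | 0.78 | * | 0.97 | 0.77 | 0.98 |
| S(0) | 0.35 | 0.37 | 0.15 | 0.38 | * | 0.33 | 0.37 | 0.35 |
| S(1) | 1.00 | 1.00 | 1.00 | 1.00 | * | 1.00 | 1.00 | 1.00 |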
Models using ADMET 2D Predictor descriptors and Kohonen map: ANNE, ADMET Predictor neural net; SVM, ADMET Predictor support vector machine; MLR, ADMET Predictor multiple linear regression; KPLS, ADMET Predictor kernel partial least squares; RF, Pipeline Pilot random forest; PLS, SIMCA-P+ partial least squares; ANNE AZ, ADMET Predictor neural net with AZ descriptors; ANNE Random, ADMET Predictor neural net with randomized choice of training/test sets. The performance properties of the models were calculated as described in CALCULATIONS AND STATISTICS. Entries marked * (RF, prospective set) were not calculated because prediction outliers could not be identified.
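For readers unfamiliar with the first two rows of each block, the following is a minimal Python sketch of how a mean absolute error (MAE) and a Kendall τ rank correlation can be computed from paired observed and predicted values. It is an illustration only: the function names are hypothetical, the toy values are invented and are not the paper's data, and the tau-a variant (ties counted as neither concordant nor discordant) is an assumption, since the table does not state which variant was used. The SALI-based quantities (SCI, S(0), S(1)) are defined in the paper's CALCULATIONS AND STATISTICS section and are not reproduced here.

```python
from itertools import combinations


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def kendall_tau_a(observed, predicted):
    """Kendall tau-a: (concordant - discordant) pairs divided by all pairs.

    Assumption: tied pairs count as neither concordant nor discordant.
    """
    concordant = discordant = 0
    pairs = list(combinations(range(len(observed)), 2))
    for i, j in pairs:
        # A pair is concordant when both rankings order it the same way.
        product = (observed[i] - observed[j]) * (predicted[i] - predicted[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Toy values invented for illustration; not taken from the paper's data sets.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.0, 3.5, 4.0]

mae = mean_absolute_error(observed, predicted)
tau = kendall_tau_a(observed, predicted)
print(f"MAE = {mae:.2f}, Kendall tau = {tau:.2f}")
```

A lower MAE indicates smaller prediction errors, while a Kendall τ closer to 1 indicates that the predicted ranking agrees more closely with the observed ranking.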