
Table 3 Performance of machine learning algorithms for prediction of the solubility of compounds in organic solvents

From: Prediction of small-molecule compound solubility in organic solvents by machine learning algorithms

| Machine learning algorithm | MAE: hold-out, training set | MAE: hold-out, validation set | MAE: five-fold cross-validation (mean ± std.) | MSE: hold-out, training set | MSE: hold-out, validation set | MSE: five-fold cross-validation (mean ± std.) | R²: hold-out, training set | R²: hold-out, validation set | R²: five-fold cross-validation (mean ± std.) |
|---|---|---|---|---|---|---|---|---|---|
| PLS | 0.4508 | 0.4808 | 0.4908 ± 0.0125 | 0.3889 | 0.4541 | 0.4636 ± 0.0171 | 0.7806 | 0.7444 | 0.7382 ± 0.0122 |
| Ridge regression | 0.4459 | 0.4748 | 0.4845 ± 0.0122 | 0.3843 | 0.4470 | 0.4511 ± 0.0165 | 0.7832 | 0.7484 | 0.7454 ± 0.0095 |
| kNN | 0.2185 | 0.3382 | 0.3300 ± 0.0080 | 0.2089 | 0.4260 | 0.4142 ± 0.0232 | 0.8822 | 0.7602 | 0.7662 ± 0.0145 |
| DT | 0.2098 | 0.3343 | 0.3391 ± 0.0117 | 0.1368 | 0.3153 | 0.3443 ± 0.0310 | 0.9229 | 0.8225 | 0.8059 ± 0.0147 |
| ET | 0.1916 | 0.2745 | 0.2757 ± 0.0066 | 0.1085 | 0.2210 | 0.2286 ± 0.0134 | 0.9388 | 0.8756 | 0.8710 ± 0.0063 |
| RF | 0.1541 | 0.2443 | 0.2466 ± 0.0083 | 0.0900 | 0.2008 | 0.2147 ± 0.0126 | 0.9492 | 0.8870 | 0.8789 ± 0.0060 |
| SVM | 0.0954 | 0.2168 | 0.2173 ± 0.0112 | 0.0364 | 0.2086 | 0.1996 ± 0.0190 | 0.9795 | 0.8826 | 0.8873 ± 0.0106 |
| DNN | 0.0960 | 0.2102 | 0.2124 ± 0.0294 | 0.0335 | 0.1766 | 0.1548 ± 0.0215 | 0.9811 | 0.9006 | 0.9192 ± 0.0115 |
| lightGBM | 0.1085 | 0.2027 | 0.2029 ± 0.0074 | 0.0389 | 0.1554 | 0.1522 ± 0.0110 | 0.9781 | 0.9125 | 0.9142 ± 0.0053 |
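
For readers unfamiliar with the three metrics and the "Mean ± Std." columns, the following is a minimal sketch in plain Python of how such values are commonly computed. It is not the authors' code. The helper names (`mae`, `mse`, `r2`, `k_fold_indices`, `cross_validate`, `fit_line`, `predict_line`), the toy one-variable linear model, and the synthetic data are all hypothetical. The use of shuffled folds and of the sample standard deviation across folds are assumptions, since the table does not state either choice.

```python
import random
import statistics


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle the sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(x, y, fit, predict, k=5, seed=0):
    """Return {metric: (mean, std)} over k held-out folds.

    Assumption: std is the sample standard deviation across folds.
    """
    scores = {"MAE": [], "MSE": [], "R2": []}
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([x[i] for i in train], [y[i] for i in train])
        y_true = [y[i] for i in held_out]
        y_pred = [predict(model, x[i]) for i in held_out]
        scores["MAE"].append(mae(y_true, y_pred))
        scores["MSE"].append(mse(y_true, y_pred))
        scores["R2"].append(r2(y_true, y_pred))
    return {
        name: (statistics.mean(vals), statistics.stdev(vals))
        for name, vals in scores.items()
    }


def fit_line(xs, ys):
    """Hypothetical stand-in model: one-variable least-squares line."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum(
        (a - mx) ** 2 for a in xs
    )
    return slope, my - slope * mx


def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept


# Synthetic data (not from the paper): a noisy straight line.
rng = random.Random(42)
x_toy = [float(i) for i in range(50)]
y_toy = [2.0 * xi + 1.0 + rng.gauss(0.0, 0.5) for xi in x_toy]

for name, (mean, std) in cross_validate(x_toy, y_toy, fit_line, predict_line).items():
    print(f"{name}: {mean:.4f} ± {std:.4f}")
```

Comparing the hold-out training and validation columns with the cross-validation column is what makes the table informative: a large gap between training and validation scores (for example, SVM's training MSE of 0.0364 against a validation MSE of 0.2086) points to overfitting, while a small standard deviation across folds suggests the estimate is stable with respect to the data split.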