From: Prediction of small-molecule compound solubility in organic solvents by machine learning algorithms
Machine learning algorithm | MAE, hold-out: training set | MAE, hold-out: validation set | MAE, five-fold cross-validation: mean ± std. | MSE, hold-out: training set | MSE, hold-out: validation set | MSE, five-fold cross-validation: mean ± std. | R², hold-out: training set | R², hold-out: validation set | R², five-fold cross-validation: mean ± std. |
---|---|---|---|---|---|---|---|---|---|
PLS | 0.4508 | 0.4808 | 0.4908 ± 0.0125 | 0.3889 | 0.4541 | 0.4636 ± 0.0171 | 0.7806 | 0.7444 | 0.7382 ± 0.0122 |
Ridge regression | 0.4459 | 0.4748 | 0.4845 ± 0.0122 | 0.3843 | 0.4470 | 0.4511 ± 0.0165 | 0.7832 | 0.7484 | 0.7454 ± 0.0095 |
kNN | 0.2185 | 0.3382 | 0.3300 ± 0.0080 | 0.2089 | 0.4260 | 0.4142 ± 0.0232 | 0.8822 | 0.7602 | 0.7662 ± 0.0145 |
DT | 0.2098 | 0.3343 | 0.3391 ± 0.0117 | 0.1368 | 0.3153 | 0.3443 ± 0.0310 | 0.9229 | 0.8225 | 0.8059 ± 0.0147 |
ET | 0.1916 | 0.2745 | 0.2757 ± 0.0066 | 0.1085 | 0.2210 | 0.2286 ± 0.0134 | 0.9388 | 0.8756 | 0.8710 ± 0.0063 |
RF | 0.1541 | 0.2443 | 0.2466 ± 0.0083 | 0.0900 | 0.2008 | 0.2147 ± 0.0126 | 0.9492 | 0.8870 | 0.8789 ± 0.0060 |
SVM | 0.0954 | 0.2168 | 0.2173 ± 0.0112 | 0.0364 | 0.2086 | 0.1996 ± 0.0190 | 0.9795 | 0.8826 | 0.8873 ± 0.0106 |
DNN | 0.0960 | 0.2102 | 0.2124 ± 0.0294 | 0.0335 | 0.1766 | 0.1548 ± 0.0215 | 0.9811 | 0.9006 | 0.9192 ± 0.0115 |
lightGBM | 0.1085 | 0.2027 | 0.2029 ± 0.0074 | 0.0389 | 0.1554 | 0.1522 ± 0.0110 | 0.9781 | 0.9125 | 0.9142 ± 0.0053 |
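
For readers who want to reproduce columns of this kind, the three metrics and the "mean ± std." summary can be sketched as follows. This is a minimal illustration using the standard definitions of MAE, MSE, and R², not the authors' code. The function names are hypothetical, and the use of the population standard deviation across folds is an assumption, since the table does not say which estimator was used.

```python
from statistics import mean, pstdev


def mae(y_true, y_pred):
    """Mean absolute error: average of |y_i - yhat_i|."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def mse(y_true, y_pred):
    """Mean squared error: average of (y_i - yhat_i)^2."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def summarize_folds(fold_scores):
    """Return (mean, std) of per-fold scores.

    Assumption: population standard deviation (pstdev); the source table
    does not state whether a sample or population estimate was reported.
    """
    return mean(fold_scores), pstdev(fold_scores)


if __name__ == "__main__":
    # Toy values for illustration only; they are not taken from the table.
    y_true = [1.0, 2.0, 3.0, 4.0]
    y_pred = [1.5, 2.0, 2.5, 4.0]
    print(mae(y_true, y_pred), mse(y_true, y_pred), r2(y_true, y_pred))

    # Hypothetical validation scores from five folds.
    fold_mae = [0.20, 0.21, 0.19, 0.22, 0.20]
    m, s = summarize_folds(fold_mae)
    print(f"{m:.4f} ± {s:.4f}")
```

In a hold-out evaluation, each metric is computed once on the training set and once on the validation set. In five-fold cross-validation, it is computed on the held-out fold of each of the five splits and then summarized with `summarize_folds`.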