
Table 3 Performance of machine learning algorithms for prediction of the solubility of compounds in organic solvents

From: Prediction of small-molecule compound solubility in organic solvents by machine learning algorithms

| Machine learning algorithm | MAE, hold-out (training set) | MAE, hold-out (validation set) | MAE, five-fold cross-validation (mean ± std.) | MSE, hold-out (training set) | MSE, hold-out (validation set) | MSE, five-fold cross-validation (mean ± std.) | R², hold-out (training set) | R², hold-out (validation set) | R², five-fold cross-validation (mean ± std.) |
|---|---|---|---|---|---|---|---|---|---|
| PLS | 0.4508 | 0.4808 | 0.4908 ± 0.0125 | 0.3889 | 0.4541 | 0.4636 ± 0.0171 | 0.7806 | 0.7444 | 0.7382 ± 0.0122 |
| Ridge regression | 0.4459 | 0.4748 | 0.4845 ± 0.0122 | 0.3843 | 0.4470 | 0.4511 ± 0.0165 | 0.7832 | 0.7484 | 0.7454 ± 0.0095 |
| kNN | 0.2185 | 0.3382 | 0.3300 ± 0.0080 | 0.2089 | 0.4260 | 0.4142 ± 0.0232 | 0.8822 | 0.7602 | 0.7662 ± 0.0145 |
| DT | 0.2098 | 0.3343 | 0.3391 ± 0.0117 | 0.1368 | 0.3153 | 0.3443 ± 0.0310 | 0.9229 | 0.8225 | 0.8059 ± 0.0147 |
| ET | 0.1916 | 0.2745 | 0.2757 ± 0.0066 | 0.1085 | 0.2210 | 0.2286 ± 0.0134 | 0.9388 | 0.8756 | 0.8710 ± 0.0063 |
| RF | 0.1541 | 0.2443 | 0.2466 ± 0.0083 | 0.0900 | 0.2008 | 0.2147 ± 0.0126 | 0.9492 | 0.8870 | 0.8789 ± 0.0060 |
| SVM | 0.0954 | 0.2168 | 0.2173 ± 0.0112 | 0.0364 | 0.2086 | 0.1996 ± 0.0190 | 0.9795 | 0.8826 | 0.8873 ± 0.0106 |
| DNN | 0.0960 | 0.2102 | 0.2124 ± 0.0294 | 0.0335 | 0.1766 | 0.1548 ± 0.0215 | 0.9811 | 0.9006 | 0.9192 ± 0.0115 |
| lightGBM | 0.1085 | 0.2027 | 0.2029 ± 0.0074 | 0.0389 | 0.1554 | 0.1522 ± 0.0110 | 0.9781 | 0.9125 | 0.9142 ± 0.0053 |
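For readers who want to see how the three metrics in Table 3 are defined under the two evaluation schemes, the following is a minimal sketch in plain Python. It is not the authors' pipeline. The data are synthetic, the one-variable least-squares line is only a stand-in for the algorithms in the table, and the 80/20 hold-out ratio, the random seeds, the use of the sample standard deviation for "Mean ± Std.", and all function names (`mae`, `mse`, `r2`, `fit_line`, `k_fold_scores`) are assumptions made for illustration.

```python
import random
from statistics import mean, stdev


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def mse(y_true, y_pred):
    """Mean squared error."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (hypothetical stand-in model)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, y_bar - a * x_bar


def predict(model, xs):
    a, b = model
    return [a * x + b for x in xs]


def k_fold_scores(xs, ys, k=5, seed=0):
    """Return one (MAE, MSE, R2) tuple per fold of a k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint index subsets
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        y_val = [ys[i] for i in held_out]
        y_hat = predict(model, [xs[i] for i in held_out])
        scores.append((mae(y_val, y_hat), mse(y_val, y_hat), r2(y_val, y_hat)))
    return scores


# Synthetic data (assumption): a noisy linear relationship.
rng = random.Random(42)
xs = [rng.uniform(-3, 3) for _ in range(200)]
ys = [0.8 * x - 1.0 + rng.gauss(0, 0.3) for x in xs]

# Hold-out: fit on the first 80 %, score the training and validation parts.
cut = int(0.8 * len(xs))
model = fit_line(xs[:cut], ys[:cut])
for name, x_part, y_part in (("training", xs[:cut], ys[:cut]),
                             ("validation", xs[cut:], ys[cut:])):
    y_hat = predict(model, x_part)
    print(f"hold-out {name}: MAE={mae(y_part, y_hat):.4f} "
          f"MSE={mse(y_part, y_hat):.4f} R2={r2(y_part, y_hat):.4f}")

# Five-fold cross-validation: report mean ± std. over the five folds.
cv = k_fold_scores(xs, ys, k=5)
for label, column in zip(("MAE", "MSE", "R2"), zip(*cv)):
    print(f"5-fold {label}: {mean(column):.4f} ± {stdev(column):.4f}")
```

A gap between the training-set and validation-set columns of the kind this sketch prints is what the hold-out columns of Table 3 expose: for example, SVM reaches an R² of 0.9795 on the training set but 0.8826 on the validation set.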