Table 4 Performances of machine learning algorithms in the three classes of samples in the validation subset

From: Prediction of small-molecule compound solubility in organic solvents by machine learning algorithms

| Machine learning algorithm | Unseen solutes | | | Seen solutes, but unseen solute-solvent combinations | | | Seen solute-solvent combinations, but at different temperatures | | |
|---|---|---|---|---|---|---|---|---|---|
| | MAE | MSE | R² | MAE | MSE | R² | MAE | MSE | R² |
| PLS | 0.8467 | 1.2267 | 0.1707 | 0.6155 | 0.7112 | 0.6538 | 0.4408 | 0.3758 | 0.7781 |
| Ridge regression | 0.5909 | 0.8410 | 0.4314 | 0.6128 | 0.7123 | 0.6532 | 0.4403 | 0.3762 | 0.7779 |
| kNN | 1.2722 | 2.6338 | -0.7806 | 0.8734 | 1.4487 | 0.2947 | 0.1919 | 0.1360 | 0.9197 |
| DT | 0.9560 | 1.5480 | -0.0466 | 0.6946 | 0.9234 | 0.5505 | 0.2360 | 0.1446 | 0.9146 |
| ET | 0.7860 | 1.0640 | 0.2807 | 0.5514 | 0.6021 | 0.7069 | 0.1982 | 0.1124 | 0.9337 |
| RF | 0.7655 | 1.0518 | 0.2889 | 0.5436 | 0.5724 | 0.7213 | 0.1624 | 0.0939 | 0.9446 |
| SVM | 0.7959 | 1.3027 | 0.1193 | 0.5677 | 0.6375 | 0.6896 | 0.1216 | 0.0827 | 0.9512 |
| DNN | 0.8494 | 1.1523 | 0.2210 | 0.4897 | 0.5073 | 0.7530 | 0.1300 | 0.0762 | 0.9550 |
| lightGBM | 0.5968 | 0.7511 | 0.4922 | 0.4729 | 0.4243 | 0.7934 | 0.1307 | 0.0788 | 0.9535 |
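
The three metrics reported in Table 4 can be illustrated with a minimal sketch. It assumes the standard textbook definitions of MAE, MSE and R² (the same ones scikit-learn uses); this page does not state how the authors computed them. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the paper's data. The second call shows how R² becomes negative, as in the kNN and DT rows for unseen solutes: the model's squared error exceeds that of always predicting the mean of the true values.

```python
def regression_metrics(y_true, y_pred):
    """Return (MAE, MSE, R2) using the standard definitions (an assumption)."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error

    # R2 compares the model's squared error with that of a baseline
    # that always predicts the mean of the true values.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all true values are equal")
    r2 = 1.0 - (mse * n) / ss_tot
    return mae, mse, r2


# Hypothetical values, for illustration only.
y_true = [-1.0, -2.0, -3.0]
print(regression_metrics(y_true, [-1.1, -2.1, -2.9]))  # close predictions
print(regression_metrics(y_true, [-3.0, -2.0, -1.0]))  # reversed: R2 is negative
```

Under these definitions MAE and MSE are never negative, whereas R² is at most 1 and has no lower bound, so a negative entry in the table is a valid result and not a typo.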