Fig. 2 | Journal of Cheminformatics

Fig. 2

From: Advancing material property prediction: using physics-informed machine learning models for viscosity

Descriptor-based QSPR approaches for predicting viscosity. A Workflow of the descriptor-based approaches, using methyl acetate as an example. Methyl acetate is featurized with RDKit, Morgan fingerprint, and Matminer descriptors. A total of 1341 + \(N_{ext}\) features, where \(N_{ext}\) is the number of external features, were passed into machine learning model development. The inverse temperature is included in model development to incorporate temperature effects. B Five-fold cross-validation (5-CV) and test set RMSE for the QSPR models. The average RMSE is reported across five out-of-sample train-test splits, and the RMSE uncertainty is estimated as the standard deviation across the splits. C Parity plot between predicted and actual log viscosity showing the validation set predictions across 5-CV on the training set for a single train/test split when using the LGBM model, which had the highest model score based on Eq. 1. Each color indicates the validation set of one of the five folds. The number of examples used (N), \(R^2\), and RMSE for 5-CV are reported within the plot. D Parity plot between predicted and actual log viscosity for a single 80:20 train:test split for the LGBM model. The total number of examples used (N) and statistics (i.e., \(R^2\) and RMSE) for the train and test sets are reported within the plot. For all parity plots, a dashed diagonal \(y = x\) line is drawn as a guide to indicate which predictions agree with the actual values.
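The two numerical steps named in the caption, including the inverse temperature and summarizing RMSE across splits, can be illustrated with a minimal standard-library sketch. This is not the authors' code. The function names are hypothetical, appending \(1/T\) as one extra feature column is an assumption about how the temperature term enters the model, and the use of the sample standard deviation is also an assumption, since the caption does not say which estimator was used.

```python
import math
import statistics


def add_inverse_temperature(features, temperature_kelvin):
    """Return a copy of the descriptor vector with 1/T appended.

    Assumption: the temperature effect enters as one extra feature column.
    """
    if temperature_kelvin <= 0:
        raise ValueError("temperature must be a positive value in kelvin")
    return list(features) + [1.0 / temperature_kelvin]


def rmse(actual, predicted):
    """Root-mean-square error between actual and predicted log viscosity."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def summarize_rmse(per_split_rmse):
    """Return (mean, standard deviation) of RMSE values across splits.

    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    return statistics.mean(per_split_rmse), statistics.stdev(per_split_rmse)
```

In use, `rmse` would be evaluated once per out-of-sample train-test split, and the five resulting values would be passed to `summarize_rmse` to obtain the reported average and its uncertainty.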
