Fig. 3 | Journal of Cheminformatics

Fig. 3

From: The influence of solid state information and descriptor selection on statistical models of temperature dependent aqueous solubility

The application of a standard, or “vanilla”, cross-validation protocol (fivefold CV) to temperature dependent endpoint data, where each instance ID takes the form [MATERIAL IDENTITY]_[TEMPERATURE]. As shown here, instances corresponding to the same material, but with endpoint values measured at different temperatures, might be assigned to different folds. (For this hypothetical dataset, [M1]_[T = 25] and [M1]_[T = 30] were assigned to folds F1 and F2 respectively.) Since each fold is used, in turn, as the test set, with the remaining data used as the training set, the same material can appear in both the training set and the corresponding test set whenever its endpoint values were measured at different temperatures.
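The overlap described in the caption can be reproduced with a small sketch. This is an illustration only, not the authors' implementation: the function names (`vanilla_kfold`, `material_of`, `leaked_materials`), the ID format `M1_T=25`, and the toy dataset of four materials measured at five temperatures are all assumptions made for the example. It uses only the Python standard library.

```python
import random


def vanilla_kfold(instance_ids, k=5, seed=0):
    """Shuffle instance IDs and deal them into k folds, ignoring material identity."""
    ids = list(instance_ids)
    random.Random(seed).shuffle(ids)
    return [ids[i::k] for i in range(k)]


def material_of(instance_id):
    # Assumed ID format "M1_T=25": the material identity precedes the underscore.
    return instance_id.split("_", 1)[0]


def leaked_materials(folds):
    """For each fold used as the test set, list materials also present in training."""
    leaks = {}
    for i, test in enumerate(folds):
        # Training set = all instances from the other folds.
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        train_materials = {material_of(x) for x in train}
        test_materials = {material_of(x) for x in test}
        leaks[i] = sorted(test_materials & train_materials)
    return leaks


# Hypothetical dataset: 4 materials, each measured at 5 temperatures (20 instances).
instance_ids = [f"M{m}_T={t}" for m in range(1, 5) for t in (25, 30, 35, 40, 45)]
folds = vanilla_kfold(instance_ids, k=5, seed=0)
leaks = leaked_materials(folds)

for i, fold in enumerate(folds):
    print(f"F{i + 1}: test={fold} materials also in training={leaks[i]}")
```

In this toy dataset each fold holds four instances while each material has five, so no material can fit inside a single fold, and every test fold necessarily shares materials with its training set. In a real dataset the overlap depends on the shuffle, but it becomes likely whenever materials have measurements at more than one temperature.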
