Fig. 10 | Journal of Cheminformatics

Fig. 10

From: SIMPD: an algorithm for generating simulated time splits for validating machine learning approaches

A: Spatial statistics summary plot of \(\sum G\) against \(\sum F'\) for the 99 ChEMBL32 data sets with both random (gray triangles) and SIMPD (orange circles) training/test splits. The objectives used in the GA were \(10< \sum G - \sum F' < 30\) and \(\sum G > 70\). B: Histograms of the deviations of the observed training-test descriptor differences from their target values for the SIMPD splits of the 99 ChEMBL32 data sets. The objective used by the MOGA for each of these descriptors was 0.
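
The two panel A objectives define a target region in the \(\sum G\) versus \(\sum F'\) plane. As a minimal sketch, assuming both sums have already been computed for a given split, the region could be checked as follows. The function and parameter names are hypothetical and are not taken from the SIMPD code.

```python
def in_target_region(sum_g: float, sum_f_prime: float) -> bool:
    """Return True if a split satisfies both panel A objectives.

    Hypothetical helper: sum_g stands for sum(G) and sum_f_prime for sum(F'),
    each assumed to be precomputed for one training/test split.
    """
    gap = sum_g - sum_f_prime
    # Objective 1: 10 < sum(G) - sum(F') < 30
    # Objective 2: sum(G) > 70
    return 10 < gap < 30 and sum_g > 70


# Illustrative values only, not data from the figure.
print(in_target_region(80.0, 60.0))  # gap of 20 with sum(G) above 70
print(in_target_region(80.0, 75.0))  # gap of 5 falls below the lower bound
```

Both bounds are strict inequalities, as written in the caption, so a split lying exactly on a boundary is treated as outside the region.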
