Fig. 10
From: SIMPD: an algorithm for generating simulated time splits for validating machine learning approaches
A: Spatial statistics summary plot of \(\sum G\) against \(\sum F'\) for the 99 ChEMBL32 data sets with both random (gray triangles) and SIMPD (orange circles) training/test splits. The objectives used in the GA were \(10 < \sum G - \sum F' < 30\) and \(\sum G > 70\). B: Histograms of the deviations of the observed training-test descriptor differences from their target values for the SIMPD splits of the 99 ChEMBL32 data sets. The objective used by the MOGA for each of these descriptors was 0.