Fig. 4
From: Splitting chemical structure data sets for federated privacy-preserving machine learning

Distribution of compounds over different folds depending on the similarity of these compounds. Fraction of intra-fold pairs as a function of the pair's Tc ECFP6 similarity, (a) for the public data set and (b) averaged over 4 pharma data sets (confidence intervals indicated as bars). Panel (a) additionally shows the decadic logarithm of the number of pairs (bold black line) as a function of the pair's Tc ECFP6 similarity.
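
The quantity plotted in this figure, the fraction of compound pairs that share a fold at a given similarity, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the bin count, and the toy fingerprints (plain Python sets of on-bits standing in for ECFP6 fingerprints) are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient (Tc) of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def intra_fold_fraction(fingerprints, folds, n_bins=10):
    """For each similarity bin, return (fraction of intra-fold pairs, pair count).

    Hypothetical helper: `fingerprints` is a list of sets, `folds` a list of
    fold labels in the same order. Bins are equal-width on [0, 1], with
    Tc == 1.0 placed in the last bin.
    """
    same = defaultdict(int)
    total = defaultdict(int)
    for i, j in combinations(range(len(fingerprints)), 2):
        tc = tanimoto(fingerprints[i], fingerprints[j])
        b = min(int(tc * n_bins), n_bins - 1)
        total[b] += 1
        same[b] += int(folds[i] == folds[j])
    return {b: (same[b] / total[b], total[b]) for b in sorted(total)}


# Toy data (assumed): three similar compounds in fold 0, one dissimilar in fold 1.
fps = [{1, 2, 3}, {1, 2, 4}, {1, 2, 3}, {7, 8, 9}]
folds = [0, 0, 0, 1]
result = intra_fold_fraction(fps, folds)
```

In this toy case all similar pairs land in the same fold and all dissimilar pairs are split across folds. The pair count returned per bin corresponds to the quantity whose decadic logarithm is shown in panel a.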