Fig. 7

a Distribution of the y-ideal label versus the binary y-labels for values close to the bioactivity threshold. b Experimental error in ChEMBL for [Pyruvate dehydrogenase (acetyl-transferring)] kinase isozyme. The error is high when data are derived from different assay IDs and IC50 measurements. c–e Performance of the PRF versus the RF classifier using different evaluation metrics and different thresholds on the algorithms' probabilities and the y-ideal labels.