Fig. 6

Performance of the four classes of Random Forest classifiers trained on the dataset, quantified by ROC AUC, average precision, and the sensitivity, selectivity and CCR achieved at the optimal prediction threshold, across the three training-test set splits. The best-performing models were those using molecular descriptors, either alone or in combination with protein target descriptors. The random training-test split gave the best-performing models, while performance was lower on the rare-scaffold and single-source test sets.
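
The threshold-dependent metrics named in the caption can be illustrated with a minimal sketch. It rests on several assumptions that the caption does not confirm: selectivity is taken to mean the true negative rate, CCR is taken to be the mean of sensitivity and selectivity, and the "optimal" threshold is taken to be the one that maximises CCR. The function names and the toy labels and scores are hypothetical and are not data from the study.

```python
def confusion_rates(y_true, y_score, threshold):
    """Return (sensitivity, selectivity), calling a sample positive if score >= threshold."""
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, y_score):
        predicted_positive = score >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    # Assumption: selectivity is the true negative rate.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    selectivity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, selectivity


def best_threshold_by_ccr(y_true, y_score):
    """Scan each observed score as a candidate threshold and keep the first one with the highest CCR."""
    best = None
    for threshold in sorted(set(y_score)):
        sensitivity, selectivity = confusion_rates(y_true, y_score, threshold)
        # Assumption: CCR is the mean of sensitivity and selectivity.
        ccr = (sensitivity + selectivity) / 2
        if best is None or ccr > best[1]:
            best = (threshold, ccr, sensitivity, selectivity)
    return best


# Hypothetical toy data: 1 = toxic, 0 = non-toxic, scores = predicted probabilities.
y_true = [0, 0, 1, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

threshold, ccr, sensitivity, selectivity = best_threshold_by_ccr(y_true, y_score)
print(f"threshold={threshold}, CCR={ccr:.3f}, "
      f"sensitivity={sensitivity:.3f}, selectivity={selectivity:.3f}")
```

In practice the threshold would normally be chosen on training or validation data and then applied unchanged to each test split, so that the reported test-set metrics are not optimistically biased.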