Fig. 10 | Journal of Cheminformatics

Fig. 10

From: Combatting over-specialization bias in growing chemical databases

We divide the Tox21 dataset into a training set, a pool, and a test set, and train a classifier on one of four inputs: the training set only; the training set together with the entire pool; the training set plus a cancels-based compound selection; or the training set plus a selection that feeds the bias instead of mitigating it. The box plot (left) shows the accuracy of the trained models on the test set. The confidence interval plot (right) indicates that compound selection using cancels is significantly better than all other options.
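
The four training configurations compared in the figure can be sketched as follows. This is a minimal illustration, not the authors' code: the split fractions, the function names, and the two selector callables are assumptions, and the selectors are placeholders for the bias-mitigating and bias-feeding selections, which the caption does not specify.

```python
import random


def split_indices(n, train_frac=0.2, pool_frac=0.6, seed=0):
    """Shuffle indices 0..n-1 and cut them into training, pool, and test parts.

    The fractions are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = round(n * train_frac)
    n_pool = round(n * pool_frac)
    train = idx[:n_train]
    pool = idx[n_train:n_train + n_pool]
    test = idx[n_train + n_pool:]
    return train, pool, test


def build_training_configs(train, pool, mitigating_select, reinforcing_select):
    """Return the four training-index sets described in the caption.

    mitigating_select and reinforcing_select are hypothetical callables
    (train, pool) -> subset of pool. They only stand in for the actual
    selection strategies, which are not reproduced here.
    """
    return {
        "train_only": list(train),
        "train_plus_pool": list(train) + list(pool),
        "train_plus_mitigating": list(train) + list(mitigating_select(train, pool)),
        "train_plus_reinforcing": list(train) + list(reinforcing_select(train, pool)),
    }


# Example with dummy selectors that pick ten pool items each.
train, pool, test = split_indices(100)
configs = build_training_configs(
    train,
    pool,
    mitigating_select=lambda t, p: p[:10],
    reinforcing_select=lambda t, p: p[-10:],
)
```

Each configuration would then be used to fit the same classifier, with accuracy measured on the shared test indices, so that differences in accuracy reflect only the choice of additional training compounds.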
