Table 6 Performance measures for binary classification of mixin models. Q refers to the relative quantity of NA compounds added to the training data

From: Nonadditivity in public and inhouse data: implications for drug design

                        RF (MCC for test)
ChEMBL data  Split      Q0 (0.0%)*  Q1 (0.6%)*  Median (1.3%)*  Q3 (2.6%)*
1613777      DTC-split   0.02        0.04        −0.04            0.02
             All-split   0.00        0.04        −0.05           −0.02
1613797      DTC-split  −0.03        0.07         0.20           −0.12
             All-split   0.00        0.00         0.00            0.00
1614027      DTC-split   0.28        0.28         0.28            0.20
             All-split   0.22        0.04        −0.05            0.11
* Test set size for Q0 differs from Q1/Median/Q3.
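
For readers unfamiliar with the metric, the MCC (Matthews correlation coefficient) values above can be read against a minimal sketch of how the coefficient is computed from a binary confusion matrix. The function name `mcc` and the example counts are hypothetical and are not taken from the paper. Returning 0.0 when the denominator is zero is a common convention assumed here, not something the source states. A value near 0 means the predictions are no better than chance, and one way an exact 0.00 can arise is a model that predicts a single class for every test compound.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Assumption: when any marginal sum is zero the denominator vanishes,
    and 0.0 is returned by convention.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Hypothetical counts for illustration only.
perfect = mcc(tp=50, tn=50, fp=0, fn=0)        # every prediction correct
inverted = mcc(tp=0, tn=0, fp=50, fn=50)       # every prediction wrong
single_class = mcc(tp=0, tn=90, fp=0, fn=10)   # model predicts only the negative class
chance = mcc(tp=25, tn=25, fp=25, fn=25)       # predictions unrelated to labels
```

The coefficient ranges from −1 (complete disagreement) through 0 (chance level) to 1 (perfect agreement), which is the scale on which the table entries should be read.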