Dataset | RF_PLEC_4 | RF_PLEC_4.5 | RF_PLEC_5 | PointVS |
---|---|---|---|---|
ZINC | **0.89** | **0.86** | **0.78** | 0.85 |
DUDE-AA2AR | 0.86 | 0.76 | 0.73 | 0.84 |
DUDE-DRD3 | 0.83 | 0.68 | 0.63 | 0.69 |
DUDE-FA10 | 0.86 | 0.67 | 0.56 | 0.82 |
DUDE-MK14 | 0.82 | 0.63 | 0.61 | 0.73 |
DUDE-VGFR2 | 0.77 | 0.62 | 0.56 | 0.61 |
LIT-ALDH1 | 0.86 | 0.81 | 0.73 | **0.93** |
LIT-FEN1 | 0.83 | 0.69 | 0.62 | 0.87 |
LIT-MAPK1 | 0.81 | 0.61 | 0.59 | 0.70 |
LIT-PKM2 | 0.81 | 0.73 | 0.67 | 0.82 |
LIT-VDR | 0.85 | 0.71 | 0.65 | 0.88 |
Random | 0.503 | 0.503 | 0.503 | 0.503 |
- The best-performing dataset for each model is highlighted in bold.
- The RF_PLEC models trained on the unbiased ZINC_Polar dataset consistently attained a higher Attribution AUC than those trained on datasets susceptible to ligand-specific bias. This suggests that when a model learns to classify examples from ligand-specific features, its ability to learn the true binding rule is impaired.
- By contrast, PointVS attained its highest Attribution AUC (0.93) when trained on the LIT-ALDH1 dataset, potentially illustrating that in some instances it was able to learn the deterministic binding rule in the presence of ligand-specific bias.
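
The comparisons in these notes can be checked directly against the table. The following is a minimal sketch, not part of the original analysis: the values are transcribed from the table above, the names `MODELS`, `AUC`, and `best_dataset_per_model` are hypothetical, and it assumes that the `ZINC` row corresponds to the ZINC_Polar dataset named in the notes.

```python
# Attribution AUC values transcribed from the table above.
# The "Random" baseline row is omitted because it is not a training dataset.
MODELS = ["RF_PLEC_4", "RF_PLEC_4.5", "RF_PLEC_5", "PointVS"]
AUC = {
    "ZINC":       [0.89, 0.86, 0.78, 0.85],  # assumed to be ZINC_Polar
    "DUDE-AA2AR": [0.86, 0.76, 0.73, 0.84],
    "DUDE-DRD3":  [0.83, 0.68, 0.63, 0.69],
    "DUDE-FA10":  [0.86, 0.67, 0.56, 0.82],
    "DUDE-MK14":  [0.82, 0.63, 0.61, 0.73],
    "DUDE-VGFR2": [0.77, 0.62, 0.56, 0.61],
    "LIT-ALDH1":  [0.86, 0.81, 0.73, 0.93],
    "LIT-FEN1":   [0.83, 0.69, 0.62, 0.87],
    "LIT-MAPK1":  [0.81, 0.61, 0.59, 0.70],
    "LIT-PKM2":   [0.81, 0.73, 0.67, 0.82],
    "LIT-VDR":    [0.85, 0.71, 0.65, 0.88],
}


def best_dataset_per_model(auc, models):
    """Return {model: (dataset, score)} for the highest-scoring training set."""
    best = {}
    for i, model in enumerate(models):
        # Pick the dataset whose i-th column value is largest.
        dataset = max(auc, key=lambda name: auc[name][i])
        best[model] = (dataset, auc[dataset][i])
    return best


best = best_dataset_per_model(AUC, MODELS)
for model, (dataset, score) in best.items():
    print(f"{model}: {dataset} ({score:.2f})")
```

Under these assumptions, the three RF_PLEC models all score highest on `ZINC`, while PointVS scores highest on `LIT-ALDH1`, matching the two observations above.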