Exploring the ability of machine learning-based virtual screening models to identify the functional groups responsible for binding

Table 4 Attribution AUC obtained by the different RF_PLEC models and PointVS on the different Polar datasets

Dataset	RF_PLEC_4	RF_PLEC_4.5	RF_PLEC_5	PointVS
ZINC	0.89	0.86	0.78	0.85
DUDE-AA2AR	0.86	0.76	0.73	0.84
DUDE-DRD3	0.83	0.68	0.63	0.69
DUDE-FA10	0.86	0.67	0.56	0.82
DUDE-MK14	0.82	0.63	0.61	0.73
DUDE-VGFR2	0.77	0.62	0.56	0.61
LIT-ALDH1	0.86	0.81	0.73	0.93
LIT-FEN1	0.83	0.69	0.62	0.87
LIT-MAPK1	0.81	0.61	0.59	0.70
LIT-PKM2	0.81	0.73	0.67	0.82
LIT-VDR	0.85	0.71	0.65	0.88
Random	0.503	0.503	0.503	0.503

The best-performing dataset for each model is highlighted in bold in the original table: ZINC for all three RF_PLEC models and LIT-ALDH1 for PointVS.
The RF_PLEC models trained on the unbiased ZINC_Polar dataset consistently attained a higher Attribution AUC than those trained on datasets susceptible to ligand-specific bias. This suggests that when a model learns to classify examples from ligand-specific features, its ability to learn the true binding rule is impaired.
By contrast, PointVS attained its highest Attribution AUC when trained on the LIT-ALDH1 dataset, which may indicate that in some instances it was able to learn the deterministic binding rule despite the presence of ligand-specific bias.
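The two observations above can be checked directly against the values in Table 4. The following is a minimal sketch that transcribes the table and reports the highest-scoring training dataset for each model. The names `MODELS`, `TABLE_4` and `best_dataset_per_model` are hypothetical helpers introduced here for illustration and are not part of the original work; the Random baseline row is omitted.

```python
# Attribution AUC values transcribed from Table 4.
# Column order follows the table header; the "Random" baseline row is left out.
MODELS = ["RF_PLEC_4", "RF_PLEC_4.5", "RF_PLEC_5", "PointVS"]

TABLE_4 = {
    "ZINC":       [0.89, 0.86, 0.78, 0.85],
    "DUDE-AA2AR": [0.86, 0.76, 0.73, 0.84],
    "DUDE-DRD3":  [0.83, 0.68, 0.63, 0.69],
    "DUDE-FA10":  [0.86, 0.67, 0.56, 0.82],
    "DUDE-MK14":  [0.82, 0.63, 0.61, 0.73],
    "DUDE-VGFR2": [0.77, 0.62, 0.56, 0.61],
    "LIT-ALDH1":  [0.86, 0.81, 0.73, 0.93],
    "LIT-FEN1":   [0.83, 0.69, 0.62, 0.87],
    "LIT-MAPK1":  [0.81, 0.61, 0.59, 0.70],
    "LIT-PKM2":   [0.81, 0.73, 0.67, 0.82],
    "LIT-VDR":    [0.85, 0.71, 0.65, 0.88],
}


def best_dataset_per_model(table, models):
    """Return {model: (dataset, auc)} for the highest AUC in each column."""
    best = {}
    for col, model in enumerate(models):
        # Pick the dataset (row) with the largest value in this model's column.
        dataset = max(table, key=lambda name: table[name][col])
        best[model] = (dataset, table[dataset][col])
    return best


best = best_dataset_per_model(TABLE_4, MODELS)
for model, (dataset, auc) in best.items():
    print(f"{model}: {dataset} ({auc:.2f})")
```

Under this transcription, each RF_PLEC column peaks at the ZINC row, while the PointVS column peaks at LIT-ALDH1, matching the statements above.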

ISSN: 1758-2946