Table 4 Attribution AUC obtained by the different RF_PLEC models and PointVS on the different Polar datasets

From: Exploring the ability of machine learning-based virtual screening models to identify the functional groups responsible for binding

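| Dataset | RF_PLEC_4 | RF_PLEC_4.5 | RF_PLEC_5 | PointVS |
| --- | --- | --- | --- | --- |
| ZINC | **0.89** | **0.86** | **0.78** | 0.85 |
| DUDE-AA2AR | 0.86 | 0.76 | 0.73 | 0.84 |
| DUDE-DRD3 | 0.83 | 0.68 | 0.63 | 0.69 |
| DUDE-FA10 | 0.86 | 0.67 | 0.56 | 0.82 |
| DUDE-MK14 | 0.82 | 0.63 | 0.61 | 0.73 |
| DUDE-VGFR2 | 0.77 | 0.62 | 0.56 | 0.61 |
| LIT-ALDH1 | 0.86 | 0.81 | 0.73 | **0.93** |
| LIT-FEN1 | 0.83 | 0.69 | 0.62 | 0.87 |
| LIT-MAPK1 | 0.81 | 0.61 | 0.59 | 0.70 |
| LIT-PKM2 | 0.81 | 0.73 | 0.67 | 0.82 |
| LIT-VDR | 0.85 | 0.71 | 0.65 | 0.88 |
| Random | 0.503 | 0.503 | 0.503 | 0.503 |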

  1. The best performing dataset for each model is highlighted in bold
  2. The RF_PLEC models trained on the unbiased ZINC_Polar dataset consistently attained a higher Attribution AUC than those trained on datasets susceptible to ligand-specific bias, suggesting that when a model learns to classify examples based on ligand-specific features, its ability to learn the true binding rule is impaired
  3. By contrast, PointVS attained its highest Attribution AUC when trained on the LIT-ALDH1 dataset, which may indicate that in some instances it was able to learn the deterministic binding rule despite the presence of ligand-specific bias
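
The comparisons in the notes above can be checked directly against the tabulated values. The following is a minimal sketch, not part of the original analysis: the names `MODELS`, `TABLE_4` and `best_dataset_per_model` are hypothetical, and the numbers are transcribed by hand from Table 4 (the Random baseline row is left out because it is not a training dataset).

```python
# Attribution AUC values transcribed from Table 4.
# Column order follows MODELS; the Random baseline row is intentionally omitted.
MODELS = ["RF_PLEC_4", "RF_PLEC_4.5", "RF_PLEC_5", "PointVS"]

TABLE_4 = {
    "ZINC":       [0.89, 0.86, 0.78, 0.85],
    "DUDE-AA2AR": [0.86, 0.76, 0.73, 0.84],
    "DUDE-DRD3":  [0.83, 0.68, 0.63, 0.69],
    "DUDE-FA10":  [0.86, 0.67, 0.56, 0.82],
    "DUDE-MK14":  [0.82, 0.63, 0.61, 0.73],
    "DUDE-VGFR2": [0.77, 0.62, 0.56, 0.61],
    "LIT-ALDH1":  [0.86, 0.81, 0.73, 0.93],
    "LIT-FEN1":   [0.83, 0.69, 0.62, 0.87],
    "LIT-MAPK1":  [0.81, 0.61, 0.59, 0.70],
    "LIT-PKM2":   [0.81, 0.73, 0.67, 0.82],
    "LIT-VDR":    [0.85, 0.71, 0.65, 0.88],
}


def best_dataset_per_model(table, models):
    """Return {model: (dataset, auc)} for the dataset with the highest AUC."""
    best = {}
    for col, model in enumerate(models):
        # Pick the dataset whose value in this model's column is largest.
        dataset = max(table, key=lambda name: table[name][col])
        best[model] = (dataset, table[dataset][col])
    return best


best = best_dataset_per_model(TABLE_4, MODELS)
for model, (dataset, auc) in best.items():
    print(f"{model}: {dataset} ({auc:.2f})")
```

Under these assumptions, the three RF_PLEC models should each report ZINC as their best dataset and PointVS should report LIT-ALDH1, matching the bold entries described in the notes.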