Table 1 Model performance depends on datasets that are used in training

From: Identification of novel small molecule inhibitors for solute carrier SGLT1 using proteochemometric modeling

| Model and validation | Training data | Sensitivity | Specificity | PPV | NPV | MCC |
|---|---|---|---|---|---|---|
| QSAR (EV) | PD + IH | 0.76 | 0.86 | 0.42 | 0.96 | 0.48 |
| Public PCM (CV) | PD | 0.01 ± 0.01 | 0.98 ± 0.00 | 0.03 ± 0.06 | 0.91 ± 0.01 | −0.03 ± 0.03 |
| In-house PCM (CV) | IH | 0.69 ± 0.07 | 0.89 ± 0.02 | 0.38 ± 0.06 | 0.97 ± 0.01 | 0.45 ± 0.05 |
| Combined PCM (CV) | PD + IH | 0.64 ± 0.06 | 0.93 ± 0.01 | 0.47 ± 0.07 | 0.96 ± 0.01 | 0.49 ± 0.05 |

  1. PD: public data; IH: in-house data; EV: external validation on 30% of the data; CV: fivefold cross-validation, with 20% of the data held out per iteration
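
The five metrics reported in the table can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch of the standard definitions, not the authors' own code; the function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp, fp, tn, fn: true positives, false positives, true negatives,
    false negatives. Any ratio with a zero denominator is returned as NaN.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    # Square root of the product of the four marginal totals (MCC denominator).
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": ratio(tp, tp + fn),  # fraction of actives recovered
        "specificity": ratio(tn, tn + fp),  # fraction of inactives recovered
        "ppv": ratio(tp, tp + fp),          # precision of "active" predictions
        "npv": ratio(tn, tn + fn),          # precision of "inactive" predictions
        "mcc": ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical counts, not taken from the study.
example = binary_metrics(tp=8, fp=2, tn=18, fn=2)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

Because MCC uses all four counts, a model that labels nearly every compound inactive can keep a high specificity and NPV on an imbalanced dataset while its sensitivity and MCC stay close to zero.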