Table 3 Summary of classification validation statistics across different methods and validation sets

From: A document classifier for medicinal chemistry publications trained on the ChEMBL corpus

| Method/validation set | AUC | MCC | Sensitivity | Specificity |
|---|---|---|---|---|
| NB EV | 0.98 | 0.88 | 0.90 | 0.97 |
| NB n-grams EV | 1.00 | 0.91 | 0.95 | 0.96 |
| NB ChEMBL_17 | 0.96 | 0.90 | 0.92 | 0.98 |
| NB BindingDB | 0.97 | 0.79 | 0.80 | 0.97 |
| RF EV | 0.99 | 0.92 | 0.95 | 0.97 |
| RF CV Out-of-Bag | 0.99 | 0.92 | 0.94 | 0.97 |

Abbreviations: AUC, Area Under the Curve; CV, cross validation; EV, external validation; MCC, Matthews Correlation Coefficient; NB, Naive Bayesian; RF, Random Forest.
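
For readers unfamiliar with the statistics reported in the table, the sketch below shows how sensitivity, specificity, and MCC are conventionally computed from the four cells of a binary confusion matrix. The function name `classification_stats` and the example counts are hypothetical and chosen only for illustration; they are not taken from the article, and the authors' own evaluation code may differ. AUC is omitted because it requires ranked classifier scores rather than counts.

```python
import math


def classification_stats(tp, fp, tn, fn):
    """Return sensitivity, specificity and MCC for a binary confusion matrix.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives and false negatives (hypothetical inputs).
    """
    # Sensitivity: fraction of actual positives that were predicted positive.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of actual negatives that were predicted negative.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews Correlation Coefficient; conventionally set to 0 when any
    # marginal total is zero and the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "mcc": mcc}


# Illustrative counts only (not from the article).
stats = classification_stats(tp=90, fp=3, tn=97, fn=10)
print({name: round(value, 2) for name, value in stats.items()})
```

Because MCC uses all four cells, it can fall well below both sensitivity and specificity when the classes are imbalanced, which is one reason it is often reported alongside them.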