Table 5 PubChem descriptor model performance for split chemical space validation sets

From: Feature combination networks for the interpretation of statistical machine learning models: application to Ames mutagenicity

| Algorithm | Measure | Aliphatic halogen | Aromatic nitro | Aziridine | Bay region PAH | Carboxylic acid | Epoxide | Aromatic amine (primary) | Aromatic amine (secondary) | Aromatic amine (tertiary) |
|---|---|---|---|---|---|---|---|---|---|---|
| SVM | AUC | 0.83 | 0.78 | --- | 0.73 | 0.96 | 0.85 | 0.82 | 0.91 | 0.77 |
| | BAC | 0.77 | 0.64 | NaN | 0.58 | 0.91 | 0.77 | 0.75 | 0.82 | 0.73 |
| | SEN | 0.83 | 0.96 | 1.00 | 0.97 | 0.88 | 0.87 | 0.86 | 0.82 | 0.83 |
| | SPEC | 0.71 | 0.32 | NaN | 0.20 | 0.94 | 0.67 | 0.64 | 0.82 | 0.64 |
| RF | AUC | 0.80 | 0.80 | --- | 0.72 | 0.94 | 0.87 | 0.84 | 0.95 | 0.82 |
| | BAC | 0.75 | 0.64 | NaN | 0.70 | 0.90 | 0.73 | 0.74 | 0.85 | 0.68 |
| | SEN | 0.87 | 0.96 | 1.00 | 1.00 | 0.88 | 0.92 | 0.90 | 0.88 | 0.78 |
| | SPEC | 0.63 | 0.32 | NaN | 0.40 | 0.92 | 0.54 | 0.58 | 0.82 | 0.57 |
| DT | AUC | 0.66 | 0.58 | --- | 0.65 | 0.87 | 0.56 | 0.71 | 0.82 | 0.57 |
| | BAC | 0.66 | 0.55 | NaN | 0.50 | 0.82 | 0.54 | 0.63 | 0.82 | 0.70 |
| | SEN | 0.70 | 0.96 | 0.92 | 1.00 | 0.74 | 1.00 | 0.76 | 0.65 | 0.83 |
| | SPEC | 0.63 | 0.13 | NaN | 0.00 | 0.90 | 0.08 | 0.50 | 1.00 | 0.57 |
| kNN | AUC | 0.77 | 0.79 | --- | 0.74 | 0.91 | 0.70 | 0.78 | 0.88 | 0.78 |
| | BAC | 0.73 | 0.64 | NaN | 0.50 | 0.82 | 0.65 | 0.69 | 0.80 | 0.70 |
| | SEN | 0.69 | 0.96 | 1.00 | 1.00 | 0.79 | 0.71 | 0.85 | 0.88 | 0.83 |
| | SPEC | 0.77 | 0.32 | NaN | 0.00 | 0.85 | 0.58 | 0.53 | 0.73 | 0.57 |
  1. NaN = not a number (the measure could not be computed because all predictions were true positives); AUC = area under the curve; BAC = balanced accuracy; SEN = sensitivity; SPEC = specificity.
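
The measures in the table can be illustrated with a minimal sketch, assuming the standard definitions: sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), and balanced accuracy as the mean of the two. The function name `classification_measures` and the counts below are hypothetical and are not taken from the paper. The second call shows how a validation set containing only correctly predicted positives leaves specificity, and therefore balanced accuracy, undefined (NaN).

```python
import math


def classification_measures(tp, fn, tn, fp):
    """Return SEN, SPEC and BAC from confusion-matrix counts.

    A ratio whose denominator is zero is reported as NaN instead of
    raising ZeroDivisionError.
    """
    sen = tp / (tp + fn) if (tp + fn) > 0 else math.nan
    spec = tn / (tn + fp) if (tn + fp) > 0 else math.nan
    # NaN propagates: if either component is undefined, so is BAC.
    bac = (sen + spec) / 2
    return {"SEN": sen, "SPEC": spec, "BAC": bac}


# Illustrative counts only (assumed, not from the paper).
mixed = classification_measures(tp=83, fn=17, tn=71, fp=29)

# Every compound is a true positive: no negatives exist, so SPEC is undefined.
only_true_positives = classification_measures(tp=12, fn=0, tn=0, fp=0)

print(mixed)
print(only_true_positives)
```

Under these assumed definitions, the BAC rows of the table are consistent with the mean of the corresponding SEN and SPEC entries, for example (0.83 + 0.71) / 2 = 0.77 for SVM on the aliphatic halogen set.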