From: Structure–activity relationship-based chemical classification of highly imbalanced Tox21 datasets
Metrics | Classifier | NR-AR | NR-AR-LBD | NR-AhR | NR-Aromatase | NR-ER | NR-ER-LBD | NR-PPAR-γ | SR-ARE | SR-ATAD5 | SR-HSE | SR-MMP | SR-p53 | Mean | CVᵃ (%) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
F1 score | RF | 0.1538 | 0.0000 | 0.4340 | 0.2326 | 0.2727 | 0.2400 | 0.0606 | 0.3359 | 0.2500 | 0.2500 | 0.5106 | 0.1364 | 0.2397 | 60 |
F1 score | RUS | 0.1176 | 0.1667 | 0.4507 | 0.2222 | 0.2605 | 0.1849 | 0.4051 | 0.4185 | 0.2063 | 0.1058 | 0.5867 | 0.2527 | 0.2815 | 53 |
F1 score | SMO | 0.2500 | 0.0000 | 0.3883 | 0.1905 | 0.3692 | 0.2857 | 0.1765 | 0.2927 | 0.2439 | 0.1905 | 0.3902 | 0.1395 | 0.2431 | 47 |
F1 score | SMN | 0.1951 | 0.1111 | 0.5856 | 0.5070 | 0.6078 | 0.3636 | 0.3929 | 0.6791 | 0.3636 | 0.2400 | 0.5850 | 0.4225 | 0.4211 | 42 |
MCC | RF | 0.2859 | −0.0050 | 0.4101 | 0.3202 | 0.2726 | 0.2891 | 0.0767 | 0.2770 | 0.3377 | 0.2619 | 0.4701 | 0.1801 | 0.2647 | 49 |
MCC | RUS | 0.1056 | 0.1602 | 0.4209 | 0.1914 | 0.1816 | 0.1908 | 0.3810 | 0.2950 | 0.2049 | 0.1190 | 0.5537 | 0.2769 | 0.2568 | 53 |
MCC | SMO | 0.2805 | −0.0071 | 0.3669 | 0.2792 | 0.3990 | 0.3018 | 0.2355 | 0.2498 | 0.3091 | 0.2327 | 0.3662 | 0.2019 | 0.2679 | 39 |
MCC | SMN | 0.1886 | 0.0975 | 0.5342 | 0.4711 | 0.5643 | 0.3404 | 0.3627 | 0.6177 | 0.3261 | 0.2226 | 0.5492 | 0.3872 | 0.3885 | 42 |
AUROC | RF | 0.8232 | 0.7963 | 0.9063 | 0.7356 | 0.7601 | 0.6963 | 0.6640 | 0.7867 | 0.7827 | 0.7610 | 0.9194 | 0.7443 | 0.7813 | 10 |
AUROC | RUS | 0.6785 | 0.9133 | 0.8852 | 0.7627 | 0.7174 | 0.7619 | 0.7937 | 0.7698 | 0.7791 | 0.7065 | 0.9295 | 0.8168 | 0.7929 | 10 |
AUROC | SMO | 0.7780 | 0.7509 | 0.8936 | 0.8112 | 0.7296 | 0.8072 | 0.7872 | 0.7714 | 0.8151 | 0.7983 | 0.8893 | 0.8510 | 0.8069 | 6 |
AUROC | SMN | 0.6810 | 0.7969 | 0.9196 | 0.8500 | 0.8628 | 0.8233 | 0.7713 | 0.8910 | 0.8093 | 0.8483 | 0.9294 | 0.8785 | 0.8384 | 8 |
AUPRC | RF | 0.3521 | 0.0565 | 0.5846 | 0.2825 | 0.3203 | 0.1887 | 0.1120 | 0.4224 | 0.2881 | 0.1608 | 0.5632 | 0.1881 | 0.2933 | 57 |
AUPRC | RUS | 0.1444 | 0.1068 | 0.4836 | 0.2043 | 0.2420 | 0.1545 | 0.5067 | 0.4140 | 0.2423 | 0.0622 | 0.5237 | 0.2295 | 0.2762 | 59 |
AUPRC | SMO | 0.3290 | 0.0821 | 0.5065 | 0.3504 | 0.3895 | 0.2658 | 0.2806 | 0.4052 | 0.3350 | 0.1993 | 0.4928 | 0.2913 | 0.3273 | 36 |
AUPRC | SMN | 0.0685 | 0.0639 | 0.5660 | 0.3845 | 0.5688 | 0.2018 | 0.3736 | 0.6443 | 0.2422 | 0.1134 | 0.5234 | 0.3254 | 0.3396 | 60 |
Balanced accuracy (BA) | RF | 0.5417 | 0.4991 | 0.6518 | 0.5665 | 0.5830 | 0.5732 | 0.5146 | 0.6016 | 0.5726 | 0.5847 | 0.7053 | 0.5368 | 0.5776 | 10 |
Balanced accuracy (BA) | RUS | 0.5929 | 0.6124 | 0.8129 | 0.6828 | 0.6513 | 0.6968 | 0.7454 | 0.6977 | 0.7133 | 0.6665 | 0.8523 | 0.7777 | 0.7085 | 11 |
Balanced accuracy (BA) | SMO | 0.5815 | 0.4982 | 0.6304 | 0.5530 | 0.6181 | 0.5964 | 0.5499 | 0.5833 | 0.5718 | 0.5571 | 0.6354 | 0.5377 | 0.5761 | 7 |
Balanced accuracy (BA) | SMN | 0.6443 | 0.5544 | 0.8228 | 0.7265 | 0.7922 | 0.6858 | 0.6753 | 0.8545 | 0.7018 | 0.6529 | 0.8452 | 0.6812 | 0.7198 | 13 |
Precision | RF | 1.0000 | 0.0000 | 0.6389 | 0.8333 | 0.5294 | 0.6000 | 0.2500 | 0.5116 | 0.8333 | 0.4286 | 0.6000 | 0.5000 | 0.5604 | 48 |
Precision | RUS | 0.0769 | 0.1250 | 0.2991 | 0.1302 | 0.1604 | 0.1111 | 0.3200 | 0.2869 | 0.1193 | 0.0576 | 0.4583 | 0.1464 | 0.1909 | 64 |
Precision | SMO | 0.5000 | 0.0000 | 0.6061 | 0.8000 | 0.7500 | 0.5000 | 0.6000 | 0.5143 | 0.7143 | 0.5000 | 0.5714 | 0.6000 | 0.5547 | 36 |
Precision | SMN | 0.1379 | 0.1000 | 0.4775 | 0.5294 | 0.5849 | 0.3333 | 0.4074 | 0.5748 | 0.2963 | 0.1818 | 0.4624 | 0.4545 | 0.3784 | 44 |
Recall or Sensitivity | RF | 0.0833 | 0.0000 | 0.3286 | 0.1351 | 0.1837 | 0.1500 | 0.0345 | 0.2500 | 0.1471 | 0.1765 | 0.4444 | 0.0789 | 0.1677 | 75 |
Recall or Sensitivity | RUS | 0.2500 | 0.2500 | 0.9143 | 0.7568 | 0.6939 | 0.5500 | 0.5517 | 0.7727 | 0.7647 | 0.6471 | 0.8148 | 0.9211 | 0.6573 | 34 |
Recall or Sensitivity | SMO | 0.1667 | 0.0000 | 0.2857 | 0.1081 | 0.2449 | 0.2000 | 0.1034 | 0.2045 | 0.1471 | 0.1176 | 0.2963 | 0.0789 | 0.1628 | 54 |
Recall or Sensitivity | SMN | 0.3333 | 0.1250 | 0.7571 | 0.4865 | 0.6327 | 0.4000 | 0.3793 | 0.8295 | 0.4706 | 0.3529 | 0.7963 | 0.3947 | 0.4965 | 43 |
Brier score (BS) | RF | 0.3817 | 0.5425 | 0.3404 | 0.3997 | 0.3883 | 0.4163 | 0.3961 | 0.3725 | 0.3947 | 0.4257 | 0.3215 | 0.3810 | 0.3967 | 14 |
Brier score (BS) | RUS | 0.4461 | 0.3874 | 0.3104 | 0.3724 | 0.3793 | 0.4299 | 0.3204 | 0.3735 | 0.3829 | 0.4871 | 0.3892 | 0.3936 | 0.3894 | 13 |
Brier score (BS) | SMO | 0.4263 | 0.6739 | 0.3281 | 0.3379 | 0.4205 | 0.4067 | 0.4138 | 0.3881 | 0.3924 | 0.4146 | 0.3467 | 0.3814 | 0.4109 | 22 |
Brier score (BS) | SMN | 0.4303 | 0.4156 | 0.2583 | 0.3327 | 0.3134 | 0.3670 | 0.3503 | 0.2761 | 0.3431 | 0.3491 | 0.2371 | 0.3014 | 0.3312 | 18 |
Sensitivity–specificity gap (SSG)ᵇ | RF | 0.9167 | 0.9982 | 0.6464 | 0.8628 | 0.7987 | 0.8464 | 0.9601 | 0.7031 | 0.8511 | 0.8165 | 0.5217 | 0.9157 | 0.8198 | 17 |
Sensitivity–specificity gap (SSG) | RUS | 0.6857 | 0.7249 | 0.2028 | 0.1480 | 0.0851 | 0.2937 | 0.3874 | 0.1499 | 0.1027 | 0.0388 | 0.0750 | 0.2867 | 0.2651 | 87 |
Sensitivity–specificity gap (SSG) | SMO | 0.8297 | 0.9964 | 0.6893 | 0.8898 | 0.7463 | 0.7929 | 0.8930 | 0.7576 | 0.8494 | 0.8789 | 0.6783 | 0.9175 | 0.8266 | 12 |
Sensitivity–specificity gap (SSG) | SMN | 0.6221 | 0.8588 | 0.1314 | 0.4800 | 0.3189 | 0.5716 | 0.5920 | 0.0500 | 0.4625 | 0.6000 | 0.0978 | 0.5730 | 0.4465 | 55 |
Averageᶜ | RF | 0.2157 | −0.0215 | 0.3297 | 0.2048 | 0.1928 | 0.1638 | 0.0396 | 0.2344 | 0.2184 | 0.1535 | 0.3744 | 0.1187 | 0.1854 | 59 |
Average | RUS | 0.0927 | 0.1358 | 0.4171 | 0.2700 | 0.2714 | 0.2140 | 0.3329 | 0.3479 | 0.2827 | 0.2043 | 0.4728 | 0.3045 | 0.2788 | 39 |
Average | SMO | 0.1811 | −0.0385 | 0.2956 | 0.2072 | 0.2593 | 0.1953 | 0.1585 | 0.2084 | 0.2105 | 0.1447 | 0.2907 | 0.1557 | 0.1890 | 46 |
Average | SMN | 0.1329 | 0.0638 | 0.4748 | 0.3491 | 0.4424 | 0.2455 | 0.2689 | 0.5294 | 0.2671 | 0.1848 | 0.4840 | 0.2966 | 0.3116 | 47 |
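
The footnotes that define the CV, SSG, and Average entries are not reproduced in this excerpt. The sketch below shows one reading that is consistent with the tabulated numbers, checked only against the RF values for F1 score and for the NR-AR assay. It rests on these assumptions: CV is the sample standard deviation divided by the mean, SSG is the absolute difference between sensitivity and specificity, and Average is the mean of the nine metrics with the two lower-is-better metrics (BS and SSG) entered with a negative sign. The function names are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev

# F1 scores of the RF classifier over the 12 Tox21 assays (first data row).
f1_rf = [0.1538, 0.0000, 0.4340, 0.2326, 0.2727, 0.2400,
         0.0606, 0.3359, 0.2500, 0.2500, 0.5106, 0.1364]

# The nine metric values reported for RF on the NR-AR assay.
rf_nr_ar = {
    "F1": 0.1538, "MCC": 0.2859, "AUROC": 0.8232, "AUPRC": 0.3521,
    "BA": 0.5417, "Precision": 1.0000, "Recall": 0.0833,
    "BS": 0.3817, "SSG": 0.9167,
}


def row_mean_and_cv(values):
    """Return (mean, CV in percent). Assumption: CV = sample std / mean."""
    m = mean(values)
    return m, 100.0 * stdev(values) / m


def f1_from(precision, recall):
    """Standard F1: harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def specificity_from(ba, recall):
    """Invert BA = (sensitivity + specificity) / 2 to recover specificity."""
    return 2 * ba - recall


def composite_average(metrics, lower_is_better=("BS", "SSG")):
    """Assumption: lower-is-better metrics enter the mean with a negative sign."""
    signed = [-v if k in lower_is_better else v for k, v in metrics.items()]
    return sum(signed) / len(signed)


m, cv = row_mean_and_cv(f1_rf)
spec = specificity_from(rf_nr_ar["BA"], rf_nr_ar["Recall"])
ssg = abs(spec - rf_nr_ar["Recall"])

print(f"F1 (RF) mean = {m:.4f}, CV = {cv:.0f}%")
print(f"F1 from precision/recall = {f1_from(1.0000, 0.0833):.4f}")
print(f"SSG from BA and recall   = {ssg:.4f}")
print(f"Average (RF, NR-AR)      = {composite_average(rf_nr_ar):.4f}")
```

Because the inputs are rounded to four decimals, recomputed values can differ from the table in the last digit (the recomputed SSG, for instance), so comparisons should use a small tolerance rather than exact equality.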