Table 1 Class distribution and imbalance ratio (IR) of the preprocessed training and test chemical datasets from Tox21 Data Challenge

From: Structure–activity relationship-based chemical classification of highly imbalanced Tox21 datasets

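| In vitro qHTS assay ID | Total number of chemicals | Training set: Inactive | Training set: Active | Training set: IR | Test set: Inactive | Test set: Active | Test set: IR |
|---|---|---|---|---|---|---|---|
| NR-AR | 6436 | 5698 | 166 | 34.3 | 560 | 12 | 46.7 |
| NR-AR-LBD | 5931 | 5223 | 143 | 36.5 | 557 | 8 | **69.6** |
| NR-AhR | 5596 | 4445 | 561 | 7.9 | 520 | 70 | 7.4 |
| NR-Aromatase | 4901 | 4193 | 193 | 21.7 | 478 | 37 | 12.9 |
| NR-ER | 5171 | 4167 | 500 | 8.3 | 455 | 49 | 9.3 |
| NR-ER-LBD | 6043 | 5239 | 221 | 23.7 | 563 | 20 | 28.2 |
| NR-PPAR-γ | 5712 | 5005 | 120 | **41.7** | 558 | 29 | 19.2 |
| SR-ARE | 4808 | 3669 | 603 | 6.1 | 448 | 88 | **5.1** |
| SR-ATAD5 | 6320 | 5515 | 203 | 27.2 | 568 | 34 | 16.7 |
| SR-HSE | 5529 | 4733 | 206 | 23.0 | 573 | 17 | 33.7 |
| SR-MMP | 4955 | 3763 | 666 | **5.7** | 472 | 54 | 8.7 |
| SR-p53 | 6009 | 5110 | 303 | 16.9 | 558 | 38 | 14.7 |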
Note: The highest and lowest IRs for the training and test sets are in bold.
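
The IR values in Table 1 are consistent with the ratio of inactive to active chemicals, rounded to one decimal place. The table itself does not state this formula, so treat it as an assumption inferred from the counts. The following minimal sketch recomputes a few training-set IRs under that assumption; the function name `imbalance_ratio` and the dictionary layout are illustrative and do not come from the source.

```python
def imbalance_ratio(inactive: int, active: int) -> float:
    """Assumed definition: inactive count divided by active count, to one decimal."""
    if active <= 0:
        raise ValueError("active count must be positive")
    return round(inactive / active, 1)


# (inactive, active) training-set counts copied from Table 1
training_counts = {
    "NR-AR": (5698, 166),
    "NR-PPAR-γ": (5005, 120),
    "SR-MMP": (3763, 666),
}

# Recompute the IR for each selected assay
training_ir = {
    assay: imbalance_ratio(inactive, active)
    for assay, (inactive, active) in training_counts.items()
}

# Assays with the highest and lowest IR among the selected rows
most_imbalanced = max(training_ir, key=training_ir.get)
least_imbalanced = min(training_ir, key=training_ir.get)
print(training_ir, most_imbalanced, least_imbalanced)
```

Under this definition, a larger IR means fewer active chemicals relative to inactive ones, so NR-PPAR-γ is the most imbalanced training set and SR-MMP the least.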