Table 1 Class distribution and imbalance ratio (IR) of the preprocessed training and test chemical datasets from Tox21 Data Challenge

From: Structure–activity relationship-based chemical classification of highly imbalanced Tox21 datasets

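| In vitro qHTS assay ID | Total number of chemicals | Training set: Inactive | Training set: Active | Training set: IR | Test set: Inactive | Test set: Active | Test set: IR |
|---|---|---|---|---|---|---|---|
| NR-AR | 6436 | 5698 | 166 | 34.3 | 560 | 12 | 46.7 |
| NR-AR-LBD | 5931 | 5223 | 143 | 36.5 | 557 | 8 | **69.6** |
| NR-AhR | 5596 | 4445 | 561 | 7.9 | 520 | 70 | 7.4 |
| NR-Aromatase | 4901 | 4193 | 193 | 21.7 | 478 | 37 | 12.9 |
| NR-ER | 5171 | 4167 | 500 | 8.3 | 455 | 49 | 9.3 |
| NR-ER-LBD | 6043 | 5239 | 221 | 23.7 | 563 | 20 | 28.2 |
| NR-PPAR-γ | 5712 | 5005 | 120 | **41.7** | 558 | 29 | 19.2 |
| SR-ARE | 4808 | 3669 | 603 | 6.1 | 448 | 88 | **5.1** |
| SR-ATAD5 | 6320 | 5515 | 203 | 27.2 | 568 | 34 | 16.7 |
| SR-HSE | 5529 | 4733 | 206 | 23.0 | 573 | 17 | 33.7 |
| SR-MMP | 4955 | 3763 | 666 | **5.7** | 472 | 54 | 8.7 |
| SR-p53 | 6009 | 5110 | 303 | 16.9 | 558 | 38 | 14.7 |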

  1. The highest and lowest IRs for the training and test sets are in bold
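
The table does not spell out how IR is computed. A minimal sketch follows, assuming IR is the number of inactive chemicals divided by the number of active chemicals; this assumption is consistent with the reported values (for example, 5698 / 166 ≈ 34.3 for the NR-AR training set). The function name `imbalance_ratio` and the `rows` dictionary are illustrative and do not come from the source.

```python
def imbalance_ratio(inactive: int, active: int) -> float:
    """Return the imbalance ratio, assumed here to be inactive / active."""
    if active <= 0:
        raise ValueError("active count must be positive")
    return inactive / active


# (train inactive, train active, test inactive, test active), copied from Table 1
rows = {
    "NR-AR": (5698, 166, 560, 12),
    "NR-AR-LBD": (5223, 143, 557, 8),
    "SR-ARE": (3669, 603, 448, 88),
    "SR-MMP": (3763, 666, 472, 54),
}

for assay, (tr_in, tr_ac, te_in, te_ac) in rows.items():
    train_ir = imbalance_ratio(tr_in, tr_ac)
    test_ir = imbalance_ratio(te_in, te_ac)
    print(f"{assay}: training IR = {train_ir:.1f}, test IR = {test_ir:.1f}")
```

Under the same assumption, the total in the second column equals the sum of the four class counts for each assay (for NR-AR, 5698 + 166 + 560 + 12 = 6436).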