Table 1 Datasets used in this work, with the number of molecules, endpoint, endpoint units, range, and reference for each

From: The effect of noise on the predictive limit of QSAR models

| Dataset | Category | Entriesᵃ | Endpoint | Range | Refs. |
|---|---|---|---|---|---|
| G298_atom | Quantum mechanical | 131,082 | ΔG°_at (kcal mol⁻¹) | −2417 to −288 | [29, 30] |
| Alpha | Quantum mechanical | 131,082 | α (Bohr³) | 9.0 to 27.8 | [29, 30] |
| Lip | Physicochemical | 4200 | logD | −1.5 to 4.5 | [31] |
| Solv | Physicochemical | 642 | ΔG°_hyd (kcal mol⁻¹) | −25.5 to 3.4 | [32] |
| BACE | Biochemical | 1513 | pIC50 | 2.5 to 10.5 | [33] |
| Tox_102ᵇ | Toxicological in vitro | 971 | logAC50 | −2.1 to 4.7 | [28] |
| Tox_134ᶜ | Toxicological in vitro | 1347 | logAC50 | −4.0 to 2.8 | [28] |
| LD50 | Toxicological in vivo | 5003 | logLD50 (mg kg⁻¹) | −1.9 to 4.8 | [35] |

  1. ᵃ Original size of the dataset. Datasets with more than 1000 molecules were randomly sampled down to 1000 molecules before modeling
  2. ᵇ Includes data exclusively from the ATG-PPre-cis assay
  3. ᶜ Includes data exclusively from the ATG-PPARg-trans assay
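
The downsampling rule in footnote a can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `downsample`, the fixed seed, and the use of Python's standard `random` module are assumptions. The table states only that larger datasets were randomly sampled down to 1000 molecules.

```python
import random

MAX_SIZE = 1000  # cap stated in footnote a


def downsample(entries, max_size=MAX_SIZE, seed=0):
    """Return at most max_size entries, drawn uniformly without replacement.

    The seed is an assumption added for reproducibility; the source does
    not say how the random sample was drawn.
    """
    entries = list(entries)
    if len(entries) <= max_size:
        # Datasets at or below the cap are used in full.
        return entries
    rng = random.Random(seed)
    return rng.sample(entries, max_size)


# Original entry counts, copied from Table 1.
dataset_sizes = {
    "G298_atom": 131082,
    "Alpha": 131082,
    "Lip": 4200,
    "Solv": 642,
    "BACE": 1513,
    "Tox_102": 971,
    "Tox_134": 1347,
    "LD50": 5003,
}

# Stand-in molecule indices (0..n-1) replace the real records here.
modeled_sizes = {
    name: len(downsample(range(n))) for name, n in dataset_sizes.items()
}
```

Under this rule only Solv (642) and Tox_102 (971) keep their original size; the other six datasets are reduced to 1000 molecules each.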