Table 2 Summary of the three physicochemical datasets

From: Similarity-based pairing improves efficiency of siamese neural networks for regression tasks and uncertainty quantification

| Property | Lipophilicity | Freesolv | ESOL |
|---|---|---|---|
| Data set size | 4200 | 642 | 1128 |
| Mean property value | 2.19 | −3.80 | −3.05 |
| Standard deviation | 1.20 | 3.85 | 2.10 |
| Estimated experimental uncertainty (σ) | 0.2 | 0.3 | 0.3 |
| Double transformation cycles | 169 | 7389 | 8731 |
| Cycles with significant NA (> 2σ) | 40 (23.7%) | 2241 (30.3%) | 2660 (30.5%) |
| Compounds with significant NA (> 2σ) | 83 (2.0%) | 99 (15.4%) | 94 (8.4%) |
| Compounds with strong NA (> 4σ) | 26 (0.6%) | 29 (4.5%) | 13 (1.2%) |
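
The percentages and the 2σ and 4σ cut-offs can be recomputed from the counts in the table. The sketch below is illustrative only. It assumes that cycle percentages are taken relative to the number of double transformation cycles and that compound percentages are taken relative to the data set size. The dictionary keys and the helper `pct` are hypothetical names, not taken from the article. Under these assumptions the recomputed percentages agree with the tabulated ones to within 0.1 percentage point.

```python
# Counts and sigma values copied from Table 2.
# ASSUMPTION: cycle percentages use "Double transformation cycles" as the
# denominator; compound percentages use "Data set size" as the denominator.
datasets = {
    "Lipophilicity": {"size": 4200, "sigma": 0.2, "cycles": 169,
                      "cycles_2s": 40, "cpds_2s": 83, "cpds_4s": 26},
    "Freesolv": {"size": 642, "sigma": 0.3, "cycles": 7389,
                 "cycles_2s": 2241, "cpds_2s": 99, "cpds_4s": 29},
    "ESOL": {"size": 1128, "sigma": 0.3, "cycles": 8731,
             "cycles_2s": 2660, "cpds_2s": 94, "cpds_4s": 13},
}


def pct(count, total):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100.0 * count / total, 1)


def summarize(d):
    """Recompute thresholds and percentages for one data set."""
    return {
        "threshold_2s": round(2 * d["sigma"], 1),  # "significant" cut-off
        "threshold_4s": round(4 * d["sigma"], 1),  # "strong" cut-off
        "cycles_2s_pct": pct(d["cycles_2s"], d["cycles"]),
        "cpds_2s_pct": pct(d["cpds_2s"], d["size"]),
        "cpds_4s_pct": pct(d["cpds_4s"], d["size"]),
    }


summary = {name: summarize(d) for name, d in datasets.items()}

for name, s in summary.items():
    print(name, s)
```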