Table 2 Average percentage improvement between RF and PRF probabilities in relation to ideal y-label values across different emulated train-test standard deviations (SDs) when pChEMBL threshold equals 5

From: Probabilistic Random Forest improves bioactivity predictions close to the classification threshold by taking into account experimental uncertainty

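| Standard deviation in train and test set | y-ideal range (N) | Better-performing algorithm | % improvement |
| --- | --- | --- | --- |
| SD-train: 0.0–0.4 & SD-test: 0.0–0.4 | 0.0–0.2 (183,255) | PRF | 4.79 |
| SD-train: 0.0–0.4 & SD-test: 0.0–0.4 | 0.2–0.4 (79,890) | PRF | 3.83 |
| SD-train: 0.0–0.4 & SD-test: 0.0–0.4 | 0.4–0.6 (124,505) | PRF | 10.8 |
| SD-train: 0.0–0.4 & SD-test: 0.0–0.4 | 0.6–0.8 (166,210) | PRF | 5.76 |
| SD-train: 0.0–0.4 & SD-test: 0.0–0.4 | 0.8–1.0 (1,007,685) | RF | 6.57 |
| SD-train: 0.4–0.8 & SD-test: 0.4–0.8 | 0.0–0.2 (152,835) | PRF | 0.27 |
| SD-train: 0.4–0.8 & SD-test: 0.4–0.8 | 0.2–0.4 (194,300) | PRF | 9.27 |
| SD-train: 0.4–0.8 & SD-test: 0.4–0.8 | 0.4–0.6 (339,495) | PRF | 16.89 |
| SD-train: 0.4–0.8 & SD-test: 0.4–0.8 | 0.6–0.8 (592,575) | PRF | 11.04 |
| SD-train: 0.4–0.8 & SD-test: 0.4–0.8 | 0.8–1.0 (5,624,495) | RF | 9.59 |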