Table 2 Average percentage improvement between RF and PRF probabilities in relation to ideal y-label values across different emulated train-test standard deviations (SDs) when the pChEMBL threshold equals 5

From: Probabilistic Random Forest improves bioactivity predictions close to the classification threshold by taking into account experimental uncertainty

| Standard deviation in train and test set | y-ideal range (N) | Better-performing algorithm | % improvement |
| --- | --- | --- | --- |
| SD-train: 0.0–0.4 & SD-test: 0.0–0.4 | 0.0–0.2 (183,255) | PRF | 4.79 |
| | 0.2–0.4 (79,890) | PRF | 3.83 |
| | 0.4–0.6 (124,505) | PRF | 10.8 |
| | 0.6–0.8 (166,210) | PRF | 5.76 |
| | 0.8–1.0 (1,007,685) | RF | 6.57 |
| SD-train: 0.4–0.8 & SD-test: 0.4–0.8 | 0.0–0.2 (152,835) | PRF | 0.27 |
| | 0.2–0.4 (194,300) | PRF | 9.27 |
| | 0.4–0.6 (339,495) | PRF | 16.89 |
| | 0.6–0.8 (592,575) | PRF | 11.04 |
| | 0.8–1.0 (5,624,495) | RF | 9.59 |