Table 1 Data set description: data set sizes broken down by positives and negatives within the training/test split

From: A visual approach for analysis and inference of molecular activity spaces

| Target protein name | UniProt ID | Training set positives | Training set negatives | Test set positives | Test set negatives | Mean distance | Distance std. dev. |
|---|---|---|---|---|---|---|---|
| Sigma non-opioid intracellular receptor 1 (SIGMAR1) | Q99720 | 46 | 135 | 10 | 35 | 0.79 | 0.13 |
| Histamine H1 receptor (HRH1) | P35367 | 184 | 783 | 46 | 195 | 0.83 | 0.08 |
| Potassium voltage-gated channel subfamily H member 2 (HERG) | Q12809 | 39 | 1142 | 12 | 283 | 0.84 | 0.06 |
| D(1B) dopamine receptor (DRD5) | P21918 | 41 | 231 | 5 | 62 | 0.80 | 0.10 |

  1. Includes the average computational distance between compounds of each data set and its respective standard deviation
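
The two distance columns summarize how far apart the compounds of each data set are on average. The table does not state which distance measure or molecular representation was used, so the following is only a minimal sketch under stated assumptions: compounds are represented as sets of "on" fingerprint bits, the distance is the Jaccard (Tanimoto) distance, and the statistics are taken over all unordered compound pairs. The function names and the toy fingerprints are hypothetical and are not taken from the article.

```python
from itertools import combinations
from statistics import mean, pstdev


def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """Return 1 - |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise_distance_stats(fingerprints):
    """Mean and population std. dev. of distances over all unordered pairs."""
    distances = [jaccard_distance(a, b) for a, b in combinations(fingerprints, 2)]
    return mean(distances), pstdev(distances)


# Toy fingerprints (assumed data, for illustration only).
toy = [frozenset({1, 2, 3}), frozenset({2, 3, 4}), frozenset({5, 6})]
mean_dist, std_dist = pairwise_distance_stats(toy)
print(f"mean distance = {mean_dist:.2f}, std. dev. = {std_dist:.2f}")
```

Under these assumptions, a mean close to 1 with a small standard deviation, as in the rows above, would indicate that most compound pairs share few fingerprint bits.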