
Table 1 Overview of the proteochemometric datasets modeled in this work

From: Proteochemometric modeling in a Bayesian framework

 

| | Adenosine receptors | Dengue virus NS3 proteases | Aminergic GPCRs |
|---|---|---|---|
| Datapoints | 10,999 | 199 | 24,593 |
| Sequences | 8 | 4 | 91 |
| Ligands | 4,419 | 56 | 11,121 |
| Source organisms | H. sapiens and Rattus norvegicus | Dengue virus | H. sapiens, Rattus norvegicus, Mus musculus, Bos taurus, Sus scrofa, Canis familiaris, Cavia porcellus, Chlorocebus aethiops, and Mesocricetus auratus |
| Bioactivity | pKi | Kcat | pKi |
| Matrix completeness (%) | 31.11 | 88.84 | 2.43 |

  1. Whereas the compound-target interaction matrix of the dengue virus NS3 proteases dataset is almost complete (88.84%), the adenosine receptors and aminergic GPCRs datasets are more challenging to model given (i) their sparsity (matrix completeness of 31.11% and 2.43%, respectively) and (ii) the inclusion of information from orthologues of the human targets, which brings the number of distinct sequences to 8 and 91, respectively.
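
The completeness figures in the table can be checked with a short calculation. The sketch below assumes that matrix completeness is the number of datapoints divided by the number of possible sequence-ligand pairs, expressed as a percentage. The table does not state this definition, but it reproduces the reported values. The function name `matrix_completeness` and the dictionary layout are illustrative, not taken from the paper.

```python
def matrix_completeness(datapoints: int, sequences: int, ligands: int) -> float:
    """Percentage of the sequence x ligand matrix that has a measured value.

    Assumed definition: datapoints / (sequences * ligands) * 100.
    """
    return 100.0 * datapoints / (sequences * ligands)


# Counts copied from Table 1: (datapoints, sequences, ligands).
datasets = {
    "Adenosine receptors": (10_999, 8, 4_419),
    "Dengue virus NS3 proteases": (199, 4, 56),
    "Aminergic GPCRs": (24_593, 91, 11_121),
}

for name, (n_points, n_seqs, n_ligands) in datasets.items():
    pct = matrix_completeness(n_points, n_seqs, n_ligands)
    print(f"{name}: {pct:.2f}% complete")
```

Under this assumption, the aminergic GPCRs matrix has just over one million possible sequence-ligand pairs (91 × 11,121 = 1,012,011), of which 24,593 are measured.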