
Table 2 Compounds-per-protein and per-document.

From: Quantitative assessment of the expanding complementarity between public and commercial databases of bioactive compounds

| Database or subset | Document count | Protein ID type | Total proteins | Human proteins | Compounds per protein | Compounds per document |
|---|---|---|---|---|---|---|
| GVKBIO | 87747 | Entrez Gene | 3292 | 1468 | 604 | 22 |
| GVKBIO journals | 51810 | Entrez Gene | 2660 | 1146 | 239 | 12 |
| GVKBIO patents | 35937 | Entrez Gene | 1765 | 952 | 815 | 40 |
| GVKBIO DD | 26825 | Entrez Gene | 733 | 339 | 5 | 0.14 |
| GVKBIO CCD | 27286 | Entrez Gene | 1224 | 610 | 7 | 0.32 |
| WOMBAT | 10205 | Swiss-Prot | 1979 | 1095 | 91 | 18 |
| DrugBank | n/a | Swiss-Prot | 1625 | 1356 | 3 | n/a |
| PubChem actives | n/a | RefSeq | 72 | n/a | 104 | n/a |
| PubChem PDB | n/a | RefSeq | 818 | n/a | 14 | n/a |
| BindingDB | 1142 | Swiss-Prot | 297 | 97 | 112 | 19 |
| MDDR | 137754 | n/a | n/a | n/a | n/a | 1.4 |
| DNP | 7765 | n/a | n/a | n/a | n/a | 18 |
  1. Column three gives the type of protein identifier used to count proteins from all species (column four) and human proteins (column five). Columns six and seven are ratios: the filtered compound totals, taken from Additional file 1, divided by the total protein count and by the document count, respectively. Cells labelled n/a mark information that was either not applicable or not available. For reference, we have included a compounds-per-protein calculation for the PubChem actives subset, even though it has no document-protein links analogous to those of the other sources.
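
The ratio calculation described in the footnote can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the function name `per_unit` is hypothetical, and the compound total used in the example is invented, because the real filtered totals are in Additional file 1 and are not reproduced here. The sketch returns `None` where the table would show n/a.

```python
from typing import Optional


def per_unit(compounds: Optional[int], units: Optional[int]) -> Optional[float]:
    """Divide a compound total by a protein or document count.

    Returns None (shown as n/a in the table) when either value is
    missing or when the denominator is zero.
    """
    if compounds is None or not units:
        return None
    return compounds / units


# Hypothetical numbers for illustration only; they are NOT from the paper.
example_compounds = 1000
example_proteins = 250

ratio = per_unit(example_compounds, example_proteins)
missing = per_unit(example_compounds, None)  # no document count available

print(ratio)    # compounds per protein for the made-up source
print(missing)  # None, i.e. an n/a cell
```

Returning `None` rather than raising an error mirrors how the table mixes computed ratios with n/a cells for sources that lack protein or document counts.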