
Table 2 Compounds-per-protein and per-document.

From: Quantitative assessment of the expanding complementarity between public and commercial databases of bioactive compounds

| Database or subset | Document count | Protein ID type | Total proteins | Human proteins | Cpds-per-protein | Cpds-per-document |
|---|---|---|---|---|---|---|
| GVKBIO | 87747 | Entrez Gene | 3292 | 1468 | 604 | 22 |
| GVKBIO journals | 51810 | Entrez Gene | 2660 | 1146 | 239 | 12 |
| GVKBIO patents | 35937 | Entrez Gene | 1765 | 952 | 815 | 40 |
| GVKBIO DD | 26825 | Entrez Gene | 733 | 339 | 5 | 0.14 |
| GVKBIO CCD | 27286 | Entrez Gene | 1224 | 610 | 7 | 0.32 |
| WOMBAT | 10205 | Swiss-Prot | 1979 | 1095 | 91 | 18 |
| DrugBank | n/a | Swiss-Prot | 1625 | 1356 | 3 | n/a |
| PubChem actives | n/a | RefSeq | 72 | n/a | 104 | n/a |
| PubChem PDB | n/a | RefSeq | 818 | n/a | 14 | n/a |
| BindingDB | 1142 | Swiss-Prot | 297 | 97 | 112 | 19 |
| MDDR | 137754 | n/a | n/a | n/a | n/a | 1.4 |
| DNP | 7765 | n/a | n/a | n/a | n/a | 18 |

  1. Column three gives the type of protein identifier used to count proteins across all species (column four) and human proteins (column five). The compounds-per-protein (column six) and compounds-per-document (column seven) ratios are calculated from the filtered compound totals in Additional file 1, taken with respect to the total protein count and the document count, respectively. Cells labelled n/a indicate information that was either not applicable or not available. For reference, a compounds-per-protein value is included for the PubChem actives subset even though it has no document-protein links analogous to those of the other sources.
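
The ratio calculation described in the footnote can be sketched as below. This is a minimal illustration, not the authors' code: the function names `ratio` and `table_row` are hypothetical, the input numbers are invented for the example (the real filtered compound totals are in Additional file 1 and are not reproduced here), and it assumes an n/a cell should propagate to an n/a ratio.

```python
from typing import Dict, Optional


def ratio(compounds: Optional[int], denominator: Optional[int]) -> Optional[float]:
    """Return compounds / denominator, or None (n/a) if either value is missing or the denominator is zero."""
    if compounds is None or not denominator:
        return None
    return compounds / denominator


def table_row(
    compounds: Optional[int],
    total_proteins: Optional[int],
    documents: Optional[int],
) -> Dict[str, Optional[float]]:
    """Compute the two ratio columns for one database or subset."""
    return {
        "cpds_per_protein": ratio(compounds, total_proteins),
        "cpds_per_document": ratio(compounds, documents),
    }


# Hypothetical inputs for illustration only; not values from the paper.
print(table_row(compounds=1000, total_proteins=50, documents=200))
# A source without document counts yields n/a (None) for compounds-per-document.
print(table_row(compounds=1000, total_proteins=50, documents=None))
```

Representing n/a as `None` keeps a missing count from being mistaken for zero, which mirrors how the table leaves a ratio blank when its denominator is unavailable.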