
Table 3 Overlaps between compound sets

From: Profiling and analysis of chemical compounds using pointwise mutual information

 

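|          | DrugBank | ChEMBL    | PubChem    | ZINC        |
|----------|----------|-----------|------------|-------------|
| DrugBank | 6496     | 0.307%    | 0.008%     | 0.002%      |
| ChEMBL   | 4647     | 1,512,302 | 1.895%     | 0.279%      |
| PubChem  | 5854     | 1,313,209 | 69,081,967 | 6.280%      |
| ZINC     | 3421     | 443,794   | 13,412,856 | 157,914,301 |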

  1. The lower triangle shows the number of unique compounds shared by two compound sets, the diagonal shows the size of each compound set, and the upper triangle shows the overlap between two compound sets as the Jaccard index, expressed as a percentage. The Jaccard index J(A, B) between compound sets A and B is the size of the intersection of A and B divided by the size of their union: \(J(A,B)=\frac{\left|A \cap B\right|}{\left|A \cup B\right|}\)
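As an illustration of the footnote, the Jaccard index can be recomputed from the counts in the table, because the size of the union follows from the two set sizes and the overlap: |A ∪ B| = |A| + |B| − |A ∩ B|. The sketch below is not the authors' code. The function names `jaccard` and `jaccard_from_counts` are hypothetical, and returning 0.0 for two empty sets is an assumed convention.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index of two sets: |A ∩ B| / |A ∪ B|."""
    union = len(a | b)
    # Assumed convention: two empty sets give an index of 0.0.
    return len(a & b) / union if union else 0.0


def jaccard_from_counts(size_a: int, size_b: int, overlap: int) -> float:
    """Jaccard index from set sizes and the size of the intersection."""
    # Inclusion-exclusion: |A ∪ B| = |A| + |B| - |A ∩ B|
    union = size_a + size_b - overlap
    return overlap / union if union else 0.0


# DrugBank (6496 compounds) vs. ChEMBL (1,512,302 compounds), 4647 shared.
print(f"{jaccard_from_counts(6496, 1512302, 4647):.3%}")  # 0.307%
```

With the PubChem and ZINC counts (69,081,967 and 157,914,301 compounds, 13,412,856 shared) the same function gives about 6.280%, which matches the upper triangle.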