Table 3 Overlaps between compound sets

From: Profiling and analysis of chemical compounds using pointwise mutual information

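|          | DrugBank | ChEMBL    | PubChem    | ZINC        |
|----------|----------|-----------|------------|-------------|
| DrugBank | 6,496    | 0.307%    | 0.008%     | 0.002%      |
| ChEMBL   | 4,647    | 1,512,302 | 1.895%     | 0.279%      |
| PubChem  | 5,854    | 1,313,209 | 69,081,967 | 6.280%      |
| ZINC     | 3,421    | 443,794   | 13,412,856 | 157,914,301 |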
  1. Counts of unique overlapping compounds are shown in the lower triangle, compound set sizes on the diagonal, and the overlap between each pair of compound sets, given as the Jaccard index (expressed as a percentage), in the upper triangle. The Jaccard index J(A, B) between compound sets A and B is the size of the intersection of A and B divided by the size of their union: \(J(A,B)=\frac{\left|A \cap B\right|}{\left|A \cup B\right|}\)
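
As an illustration of the Jaccard index defined above, the following minimal Python sketch computes it in two ways: directly from two sets, and from the counts reported in the table, using the identity |A ∪ B| = |A| + |B| − |A ∩ B|. The function names `jaccard` and `jaccard_from_counts` are hypothetical and chosen only for this example; they are not taken from the article, and the sketch makes no claim about how the authors computed their values.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index of two sets: |A ∩ B| / |A ∪ B|."""
    union_size = len(a | b)
    if union_size == 0:
        # Convention assumed here: two empty sets have index 0.0.
        return 0.0
    return len(a & b) / union_size


def jaccard_from_counts(size_a: int, size_b: int, overlap: int) -> float:
    """Jaccard index from set sizes and intersection size.

    Uses |A ∪ B| = |A| + |B| - |A ∩ B|, so the sets themselves
    are not needed.
    """
    union_size = size_a + size_b - overlap
    if union_size == 0:
        return 0.0
    return overlap / union_size


if __name__ == "__main__":
    # Toy sets of made-up identifiers, for illustration only.
    print(jaccard({"c1", "c2", "c3"}, {"c2", "c3", "c4"}))  # 0.5

    # DrugBank vs. ChEMBL, using the counts from the table.
    j = jaccard_from_counts(6496, 1_512_302, 4647)
    print(f"{100 * j:.3f}%")  # 0.307%
```

Working from counts reproduces the upper-triangle entries from the diagonal and lower-triangle values alone; for example, PubChem and ZINC give 13,412,856 / (69,081,967 + 157,914,301 − 13,412,856) ≈ 6.280%.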