
Table 1 Counts of compounds (CIDs), assays (AIDs), proteins (GIs), and pathways (BSIDs) with PubChem structure–activity relationship (SAR) clustering results as a function of similarity type

From: PubChem structure–activity relationship (SAR) clusters

 

| | Initial | 3-D clusters: ST ST-opt | 3-D clusters: ComboT ST-opt | 3-D clusters: CT CT-opt | 3-D clusters: ComboT CT-opt | 2-D clusters | Any clusters |
|---|---|---|---|---|---|---|---|
| Number of CIDs | | | | | | | |
| Assay-centric clusters (from Set A) | 843,845 | 669,504 (79.3%) | 746,042 (88.4%) | 747,969 (88.6%) | 747,586 (88.6%) | 802,383 (95.1%) | 829,279 (98.3%) |
| Protein-centric clusters (from Set B) | 400,599 | 313,282 (78.2%) | 356,954 (89.1%) | 360,200 (89.9%) | 357,543 (89.3%) | 382,737 (95.5%) | 397,197 (99.2%) |
| Pathway-centric clusters (from Set C) | 265,470 | 213,738 (80.5%) | 243,006 (91.5%) | 245,215 (92.4%) | 243,378 (91.7%) | 257,170 (96.9%) | 264,338 (99.6%) |
| Number of UIDs | | | | | | | |
| Assay-centric clusters (from Set A) | 548,071 | 218,789 (39.9%) | 244,381 (44.6%) | 245,334 (44.8%) | 246,625 (45.0%) | 264,311 (48.2%) | 274,435 (50.1%) |
| Protein-centric clusters (from Set B) | 4,280 | 3,340 (78.0%) | 3,419 (79.9%) | 3,438 (80.3%) | 3,428 (80.1%) | 3,620 (84.6%) | 3,660 (85.5%) |
| Pathway-centric clusters (from Set C) | 4,540 | 3,973 (87.5%) | 4,073 (89.7%) | 4,097 (90.2%) | 4,089 (90.1%) | 4,149 (91.4%) | 4,168 (91.8%) |

Numbers in parentheses are percentages of the total count initially considered for each cluster set type (the Initial count). UID denotes the AID, GI, or BSID for assay-, protein-, and pathway-centric clusters, respectively.
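The percentage convention described in the note can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `coverage_percent` and the dictionary layout are assumptions made for this example, and rounding to one decimal place is inferred from how the values are printed in the table. The counts are the assay-centric CID values from Table 1.

```python
def coverage_percent(clustered: int, initial: int) -> float:
    """Share of the initial count that has clustering results, in percent.

    Hypothetical helper; rounds to one decimal place to match the table.
    """
    if initial <= 0:
        raise ValueError("initial must be positive")
    return round(100.0 * clustered / initial, 1)


# Assay-centric CID counts (Set A), copied from Table 1.
initial_cids = 843_845
clustered_cids = {
    "ST ST-opt": 669_504,
    "2-D clusters": 802_383,
    "Any clusters": 829_279,
}

# Recompute the parenthesized percentages for each similarity type.
percentages = {
    name: coverage_percent(count, initial_cids)
    for name, count in clustered_cids.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

The same helper applies to the UID rows by substituting the corresponding Initial count, for example 548,071 AIDs for the assay-centric clusters.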