
Table 1 Number of compounds (CIDs) in the data sets employed in the present study

From: Similar compounds versus similar conformers: complementarity between PubChem 2-D and 3-D neighboring sets

 

Data set    Associated filters (a)    Series A      Series B      Ratio (B/A) (%)
PubChem     all                       36,017,715    31,776,025    88.2
MeSH        pccompound_mesh           82,446        62,217        75.5
Protein3D   pccompound_structure      22,753        17,387        76.4
PharmAct    pccompound_mesh_pharm     11,415        6,977         61.1
Drug        pccompound_drugs          1,773         950           53.6
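The Ratio (B/A) column can be recomputed from the two count columns. The following is a minimal sketch that takes the counts from Table 1 and expresses each Series B count as a percentage of its Series A counterpart, rounded to one decimal place; the names `SERIES_COUNTS` and `ratio_percent` are illustrative and do not come from the original study.

```python
# Counts (Series A, Series B) copied from Table 1.
SERIES_COUNTS = {
    "PubChem": (36_017_715, 31_776_025),
    "MeSH": (82_446, 62_217),
    "Protein3D": (22_753, 17_387),
    "PharmAct": (11_415, 6_977),
    "Drug": (1_773, 950),
}


def ratio_percent(series_a: int, series_b: int) -> float:
    """Return Series B as a percentage of Series A, to one decimal place."""
    return round(100.0 * series_b / series_a, 1)


# Map each data set name to its recomputed B/A percentage.
RATIOS = {name: ratio_percent(a, b) for name, (a, b) in SERIES_COUNTS.items()}
```

Under these assumptions, `RATIOS` reproduces the percentages listed in the table.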

Note: The five data sets in Series A were generated using the associated Entrez filters, which restrict a search to a particular compound subset in PubChem. The five data sets in Series B were generated from their Series A counterparts by adding the parent compounds of the chemicals in the Series A data sets and then selecting those with a computed 3-D conformer description available.

(a) PubChem Compound Entrez filters allow users to retrieve CIDs that have a particular annotation type. For example, CIDs with “Drug” annotation can be retrieved via the URL: https://www.ncbi.nlm.nih.gov/pccompound/?term=pccompound_drugs[filter]
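The same filter query can also be issued programmatically. The sketch below assumes the NCBI E-utilities ESearch endpoint with `db=pccompound` and JSON output, which is not the web URL quoted in the footnote; the helper names `filter_query_url`, `count_from_response`, and `fetch_filter_count` are hypothetical and are not part of the original study.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: NCBI E-utilities ESearch endpoint (not the web URL in the footnote).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def filter_query_url(filter_name: str, retmax: int = 0) -> str:
    """Build an ESearch URL for a PubChem Compound Entrez filter."""
    params = {
        "db": "pccompound",
        "term": f"{filter_name}[filter]",  # e.g. pccompound_drugs[filter]
        "retmode": "json",
        "retmax": retmax,  # 0 asks for the hit count only, no CID list
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def count_from_response(payload: dict) -> int:
    """Extract the total number of matching CIDs from an ESearch JSON reply."""
    return int(payload["esearchresult"]["count"])


def fetch_filter_count(filter_name: str) -> int:
    """Query ESearch over the network and return the number of matching CIDs."""
    with urlopen(filter_query_url(filter_name)) as response:
        return count_from_response(json.load(response))
```

Calling `fetch_filter_count("pccompound_drugs")` would return the current number of CIDs carrying the “Drug” annotation. That number will likely differ from the Series A count in the table, since PubChem has been updated since the study's data sets were generated.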