
Table 2 Summary statistics of chemical structure descriptors

From: Effects of multiple conformers per compound upon 3-D similarity search and bioassay data analysis

 

| Descriptor | 10-K set | 156-K set | 734-K set | Entire PubChem3D Contents |
| --- | --- | --- | --- | --- |
| Heavy atom count | 24.5 ± 6.4 | 25.1 ± 6.4 | 24.6 ± 6.4 | 26.3 ± 7.0 |
| Rotatable bond count | 5.5 ± 2.7 | 5.5 ± 2.8 | 5.5 ± 2.7 | 6.8 ± 3.0 |
| Effective rotor count | 6.1 ± 2.8 | 6.1 ± 2.9 | 6.1 ± 2.8 | 7.4 ± 3.0 |
| RMSDthresh | 0.8 ± 0.2 | 0.8 ± 0.2 | 0.8 ± 0.2 | 0.9 ± 0.3 |
| Monopole volume (Å³) | 475.4 ± 124.7 | 487.0 ± 123.3 | 474.1 ± 124.0 | 509.0 ± 137.1 |
| Qx (Å⁵) | 13.8 ± 6.9 | 14.3 ± 7.2 | 12.6 ± 7.0 | 13.6 ± 7.8 |
| Qy (Å⁵) | 3.5 ± 1.6 | 3.6 ± 1.6 | 3.3 ± 1.6 | 3.6 ± 1.8 |
| Qz (Å⁵) | 1.4 ± 0.6 | 1.4 ± 0.6 | 1.3 ± 0.6 | 1.5 ± 0.6 |
| Total feature count | 8.1 ± 2.6 | 8.4 ± 2.7 | 8.1 ± 2.6 | 8.5 ± 2.7 |
| Hydrogen-bond acceptor count | 3.0 ± 1.6 | 2.9 ± 1.6 | 2.9 ± 1.6 | 3.0 ± 1.6 |
| Hydrogen-bond donor count | 1.1 ± 1.0 | 1.2 ± 1.0 | 1.1 ± 1.0 | 1.2 ± 1.0 |
| Anion count | 0.2 ± 0.4 | 0.2 ± 0.4 | 0.2 ± 0.4 | 0.2 ± 0.4 |
| Cation count | 0.6 ± 0.8 | 0.8 ± 0.9 | 0.6 ± 0.8 | 0.7 ± 0.9 |
| Hydrophobe count | 0.3 ± 0.6 | 0.3 ± 0.6 | 0.3 ± 0.6 | 0.5 ± 0.8 |
| Ring count | 3.0 ± 1.2 | 3.1 ± 1.2 | 3.0 ± 1.2 | 3.0 ± 1.3 |

Values are the average ± standard deviation of heavy atom count, rotatable bond count, effective rotor count, sampling RMSD (RMSDthresh), monopole volume, the three steric shape quadrupole components (Qx, Qy, and Qz), and feature counts (in total and for each of the six feature types) for 10,000 randomly selected biologically tested compounds (10-K set), 156,232 non-inactive compounds (156-K set), 734,486 biologically tested compounds (734-K set), and the entire PubChem3D contents (26,157,365 CIDs as of September 2010). The data for the 734-K set and the entire PubChem3D contents are from Ref. [10]. RMSDthresh and the effective rotor count were computed using Equations (1) and (3), respectively (see the Methods section).
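
As an illustration of how one "average ± standard deviation" entry of this kind can be produced, the following is a minimal sketch in Python. The helper name `summarize` and the input values are hypothetical and are not data from the paper. The sketch also assumes the population standard deviation; the table does not state whether the population or the sample form was used.

```python
from statistics import mean, pstdev


def summarize(values, digits=1):
    """Format a list of descriptor values as 'mean ± SD'.

    Assumption: population standard deviation (pstdev). Swap in
    statistics.stdev if the sample form is wanted instead.
    """
    return f"{mean(values):.{digits}f} ± {pstdev(values):.{digits}f}"


# Hypothetical heavy atom counts for five compounds (illustrative only,
# not taken from any of the compound sets in the table).
heavy_atom_counts = [18, 22, 25, 27, 33]
print(summarize(heavy_atom_counts))  # prints "25.0 ± 5.0"
```

For sets as large as those in the table, the population and sample forms give nearly identical values at one decimal place, so the choice mainly matters for small inputs such as this toy example.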