Efficient extraction of canonical spatial relationships using a recursive enumeration of k-subsets

Hinselmann, Georg; Fechner, Nikolas; Jahn, A; Zell, Andreas

doi:10.1186/1758-2946-2-S1-P36

Volume 2 Supplement 1

5th German Conference on Cheminformatics: 23. CIC-Workshop

Poster presentation
Open access
Published: 04 May 2010

Journal of Cheminformatics volume 2, Article number: P36 (2010)

The spatial arrangement of a chemical compound plays an important role in determining its properties and activities. A straightforward approach to encoding the geometry is to enumerate the pairwise spatial relationships between k substructures, such as functional groups or subgraphs. This leads to a combinatorial explosion in the number of features of interest and to redundant information. The goal of this work is to compute all possible k-subsets of spatial points and to extract a single canonical descriptor for each subset in sub-polynomial computation time. More precisely, the problem is to reduce the complexity of n_k = n·(n - 1)...(n - k) possible relationships (patterns or descriptors) for n features and k-point relationships.

We propose a two-step algorithm to solve this problem. A modified algorithm for the computation of the binomial coefficient computes the k-subsets [1] containing the possible combinations of the n relevant features. Whenever a k-subset is completed in the inner recursion, the algorithm computes a canonical representation for it. By defining a natural order by means of the geometrical center of gravity of the k points, we extract k patterns, each describing the distance to the center of gravity and the type of the spatial feature k ∈ F. Then, the algorithm returns a unique identifier for the lexicographically sorted array of patterns. If applicable (), an additional identifier is added which has the form , where d_ij denotes the geometrical distance between features i, j. Else (), this step is omitted. Therefore, this approach also considers stereochemistry. Finally, one feature is returned for each k-subset, resulting in a set of C(n, k) patterns describing the structure.
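The two steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (k_subsets, canonical_descriptor, describe), the representation of a feature as a (type, (x, y, z)) pair, and the rounding of distances to two decimals are all assumptions made for illustration. The sorted tuple of patterns stands in for the unique identifier, and the optional additional distance-based identifier is not covered.

```python
from math import dist


def k_subsets(items, k, chosen=()):
    """Recursively yield every k-subset of items as a tuple.

    Mirrors the recursion C(n, k) = C(n-1, k-1) + C(n-1, k).
    """
    if k == 0:
        yield chosen          # a k-subset is complete
        return
    if len(items) < k:
        return                # too few items left to finish a subset
    head, rest = items[0], items[1:]
    yield from k_subsets(rest, k - 1, chosen + (head,))  # subsets with head
    yield from k_subsets(rest, k, chosen)                # subsets without head


def canonical_descriptor(subset, decimals=2):
    """Return the sorted (feature type, distance to centroid) patterns.

    Rounding to `decimals` is an assumed binning choice, not from the source.
    """
    coords = [xyz for _, xyz in subset]
    centroid = tuple(sum(axis) / len(coords) for axis in zip(*coords))
    return tuple(sorted((ftype, round(dist(xyz, centroid), decimals))
                        for ftype, xyz in subset))


def describe(features, k):
    """Return one canonical descriptor per k-subset of features."""
    return [canonical_descriptor(s) for s in k_subsets(tuple(features), k)]
```

Because the patterns are sorted before they are returned, the descriptor does not depend on the order in which the k points of a subset are visited, which is what makes one descriptor per k-subset sufficient in this sketch.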

The main result is that the number of features is reduced from n_k to C(n, k), the binomial coefficient. This procedure is useful in combination with similarity approaches that use spatial relationships, such as pharmacophore searches, fingerprints, or graph kernels. We experimentally validated the algorithm on numerous QSAR benchmark sets in combination with the pharmacophore kernel [2].
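To give a feel for the size of this reduction, the following sketch compares the two counts for a small case. It assumes that n_k is read as the number of ordered selections of k distinct features, n!/(n - k)!; this reading, like the function name descriptor_counts, is an assumption for illustration and is not stated in this form in the abstract.

```python
from math import comb, perm


def descriptor_counts(n, k):
    """Return (ordered selections, k-subsets, reduction factor) for n features."""
    ordered = perm(n, k)      # n! / (n - k)!, assumed reading of n_k
    unordered = comb(n, k)    # C(n, k), one canonical descriptor per subset
    return ordered, unordered, ordered // unordered


# Example: 10 features, 3-point relationships
print(descriptor_counts(10, 3))  # (720, 120, 6)
```

Under this reading the reduction factor is always k!, since every k-subset corresponds to k! orderings of the same points.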

References

1. Rolfe T: SIGCSE Bull. 2001, 33 (3): 35-36. doi:10.1145/571922.571950.
2. Mahé P, Ralaivola L, Stoven V, Vert J-P: J Chem Inf Mod. 2006, 46 (5): 2003-2014. doi:10.1021/ci060138m.

Author information

Authors and Affiliations

University of Tübingen, Sand 1, 72076, Tübingen, Germany
Georg Hinselmann, Nikolas Fechner, A Jahn & Andreas Zell

Rights and permissions

Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License ( https://creativecommons.org/licenses/by-nc/2.0 ), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

About this article

Cite this article

Hinselmann, G., Fechner, N., Jahn, A. et al. Efficient extraction of canonical spatial relationships using a recursive enumeration of k-subsets. J Cheminform 2 (Suppl 1), P36 (2010). https://doi.org/10.1186/1758-2946-2-S1-P36
