Introducing fuzziness into maximum common substructures for meaningful cluster characterisation

Herhaus, Christian

doi:10.1186/1758-2946-6-S1-P17

Volume 6 Supplement 1

9th German Conference on Chemoinformatics

Poster presentation
Open access
Published: 11 March 2014

Journal of Cheminformatics volume 6, Article number: P17 (2014)


Arranging similar structures in clusters is a typical task in modern chemoinformatics, with high impact on HTS follow-up, the generation of structure-activity relationships (SAR) and the selection of starting points for compound optimisation. Methods for cluster generation are as diverse as the structures to which they are applied [1]; they may be, for example, similarity-based or substructure-based. Medicinal chemists typically orient themselves within structure subsets such as clusters with the help of substructures, so-called "scaffolds", which intuitively characterise the structural relationships between the molecules of the subset. For substructure-based clustering, well-established methods exist for generating Maximum Common Substructures (MCS) that are present in all members of the structure population or in a defined proportion thereof [2]. For similarity-based clusters, however, such an MCS may not exist for the required proportion of the dataset, or the common substructure may be so small that it is no longer representative and is therefore meaningless.

The approach presented here allows an MCS to be generated for similarity-based clusters as well, even when they have a given inherent structural diversity. In a first step, an MCS of reduced graphs is generated. The atom and bond indexes of this reduced MCS are then mapped onto the full structures, and the atom and bond information is aggregated for each indexed atom and bond. In a final step, query features of the MDL SDF format (atom lists, query bonds) are used to map the aggregated element and bond information onto the reduced MCS. As a result, "fuzziness" in atom and bond information is added to the MCS, which remains fully database-searchable but is more meaningful for characterising clusters because it can cover larger parts of the full structures than a conventional MCS. The approach was implemented in Pipeline Pilot™ as a proof of concept but is general enough to be transferred to other technical platforms.
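
The aggregation step described above can be illustrated with a minimal Python sketch. It is not the Pipeline Pilot implementation, which the abstract does not detail. It assumes that the reduced-graph MCS has already been computed and mapped onto each cluster member, so that every member is represented by a hypothetical dictionary from an MCS atom index (or a pair of atom indexes for a bond) to the element symbol or bond type found at that position. The function names are illustrative only. The bond codes follow the MDL CTfile convention (1 single, 2 double, 3 triple, 4 aromatic, 5 single or double, 6 single or aromatic, 7 double or aromatic, 8 any).

```python
from collections import defaultdict

# MDL CTfile bond-type codes; 5 to 8 are query bond types.
SINGLE, DOUBLE, TRIPLE, AROMATIC = 1, 2, 3, 4
QUERY_BOND_CODES = {
    frozenset({SINGLE, DOUBLE}): 5,    # single or double
    frozenset({SINGLE, AROMATIC}): 6,  # single or aromatic
    frozenset({DOUBLE, AROMATIC}): 7,  # double or aromatic
}
ANY_BOND = 8


def aggregate(per_member):
    """Collect, for each MCS index, the set of values seen across all members."""
    observed = defaultdict(set)
    for member in per_member:
        for key, value in member.items():
            observed[key].add(value)
    return dict(observed)


def fuzzy_atoms(atom_maps):
    """Return an atom list (sorted element symbols) per MCS atom index."""
    return {idx: sorted(elems) for idx, elems in aggregate(atom_maps).items()}


def fuzzy_bonds(bond_maps):
    """Return one bond code per MCS bond, using a query code when types differ."""
    result = {}
    for bond, types in aggregate(bond_maps).items():
        if len(types) == 1:
            result[bond] = next(iter(types))  # all members agree
        else:
            # Combinations without a dedicated query code fall back to "any".
            result[bond] = QUERY_BOND_CODES.get(frozenset(types), ANY_BOND)
    return result


# Hypothetical cluster of three members mapped onto a two-atom reduced MCS.
atom_maps = [{0: "C", 1: "N"}, {0: "C", 1: "O"}, {0: "N", 1: "N"}]
bond_maps = [{(0, 1): SINGLE}, {(0, 1): DOUBLE}, {(0, 1): SINGLE}]

print(fuzzy_atoms(atom_maps))
print(fuzzy_bonds(bond_maps))
```

In this sketch each MCS position receives an atom list rather than a single element, and each bond receives a query bond code whenever the members disagree, which corresponds to the "fuzziness" described above. Writing these values into an actual MDL query file is left out.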

References

1. Downs GM, Barnard JM: Clustering Methods and Their Uses in Computational Chemistry. Reviews in Computational Chemistry. 2003, Chichester: Wiley and Sons, 18: 1-40.
2. Ehrlich HC, Rarey M: Maximum common subgraph isomorphism algorithms and their applications in molecular science: a review. WIREs Comput Mol Sci. 2011, 1: 68-79. 10.1002/wcms.5.

Author information

Authors and Affiliations

Global Computational Chemistry, Merck Serono, Darmstadt, 64293, Germany
Christian Herhaus

Corresponding author

Correspondence to Christian Herhaus.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Herhaus, C. Introducing fuzziness into maximum common substructures for meaningful cluster characterisation. J Cheminform 6 (Suppl 1), P17 (2014). https://doi.org/10.1186/1758-2946-6-S1-P17
