Hit series selection in noisy HTS data: clustering techniques, statistical tests and data visualisations

Müller, Christoph; Ormsby, Daniel; Feierberg, Isabella; Engkvist, Ola; Tyrchan, Christian; Hartshorn, Michael J

doi:10.1186/1758-2946-6-S1-P27

Volume 6 Supplement 1

9th German Conference on Chemoinformatics

Poster presentation
Open access
Published: 11 March 2014

Journal of Cheminformatics volume 6, Article number: P27 (2014)

High throughput screening (HTS) is one of the most prominent techniques used in the early stages of a drug discovery programme to identify the few hit compounds that can serve as starting points for subsequent studies [1, 2]. However, selecting those hits from an HTS experiment often entails a data-intensive and challenging prioritization process. The workflow described in this study aims to ease this decision-making by combining the structural and biological information of the screened compounds. In particular, it combines various clustering and nearest-neighbour schemes with a non-parametric statistical test in order to prioritize those groupings of compounds that are likely to be relevant to the biological target of interest [3].
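The idea of scoring compound groupings with a non-parametric test can be illustrated with a minimal sketch. The abstract does not name the test or the clustering scheme, so everything here is an assumption made for illustration: cluster labels are taken as given, the test is a one-sided Mann-Whitney U test with a normal approximation (no tie or continuity correction, so p-values for small clusters are only rough), and the names `average_ranks`, `mann_whitney_greater`, `rank_clusters` and `min_size` are hypothetical.

```python
import math
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_greater(group, background):
    """Approximate one-sided p-value that `group` tends to exceed `background`.

    Uses the normal approximation to the U statistic (an assumption that is
    only reasonable for moderately sized samples).
    """
    n1, n2 = len(group), len(background)
    ranks = average_ranks(list(group) + list(background))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability


def rank_clusters(activity, cluster_of, min_size=3):
    """Rank clusters by how strongly their activities exceed the rest.

    activity:   dict mapping compound id -> measured activity
    cluster_of: dict mapping compound id -> cluster label
    Returns (cluster label, p-value) pairs, smallest p-value first.
    """
    members = defaultdict(list)
    for compound, label in cluster_of.items():
        members[label].append(compound)

    results = []
    for label, ids in members.items():
        if len(ids) < min_size:
            continue  # too few compounds for a meaningful comparison
        in_cluster = set(ids)
        inside = [activity[c] for c in ids]
        outside = [a for c, a in activity.items() if c not in in_cluster]
        if not outside:
            continue
        results.append((label, mann_whitney_greater(inside, outside)))
    return sorted(results, key=lambda r: r[1])
```

Calling `rank_clusters(activity, cluster_of)` returns the clusters ordered by p-value, so groupings whose members are consistently more active than the remaining compounds appear first. Because many clusters are tested at once, a multiple-testing correction would normally be applied before any threshold is chosen.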

The novel workflow was evaluated in a retrospective study using publicly available quantitative HTS (qHTS) datasets [4]. One of the main benchmarking criteria was the ability to correctly identify as many true active compounds as possible. To this end, different chemical descriptors and clustering schemes were tested in combination with the statistical test to measure their classification performance.
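A retrospective benchmark of this kind can be sketched as a comparison between the compounds a workflow flags and the known actives. The abstract does not state which performance measures were used; recall and precision are assumed here as common choices, and the function name `classification_metrics` and its arguments are hypothetical.

```python
def classification_metrics(predicted_active, true_active, all_compounds):
    """Compare flagged compounds against known actives.

    predicted_active: ids the workflow flags as active
    true_active:      ids confirmed active in the reference data
    all_compounds:    every id that was screened
    """
    universe = set(all_compounds)
    predicted = set(predicted_active) & universe
    truth = set(true_active) & universe

    tp = len(predicted & truth)   # flagged and truly active
    fp = len(predicted - truth)   # flagged but inactive
    fn = len(truth - predicted)   # active but missed
    tn = len(universe) - tp - fp - fn

    # Recall answers "how many of the true actives were found?"
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Precision answers "how many flagged compounds are truly active?"
    precision = tp / (tp + fp) if tp + fp else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "recall": recall, "precision": precision}
```

Running this once per combination of descriptor and clustering scheme would give a table of recall and precision values that can be compared across configurations.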

The workflow was integrated into Dotmatics’ Vortex, a platform for analysing chemical information using chemoinformatics methods and data visualisation tools [5]. This integration enables researchers to easily extend their current HTS workflow in order to discover new hit series and reveal hidden relationships between compounds, scaffolds and clusters.

References

1. Rocke D: Design and analysis of experiments with high throughput biological assay data. Cell and Developmental Biology. 2004, 15 (6): 703-713.
2. Keseru GM, Makara GM: Hit discovery and hit-to-lead approaches. Drug Discovery Today. 2006, 11 (15-16): 741-748. 10.1016/j.drudis.2006.06.016.
3. Varin T, Gubler H, Parker CN, Zhang JH, Raman P, Ertl P, Schuffenhauer A: Compound set enrichment: A novel approach to analysis of primary HTS data. J Chem Inf Model. 2010, 50 (12): 2067-2078. 10.1021/ci100203e.
4. [http://www.ncbi.nlm.nih.gov]
5. Dotmatics Ltd: [http://www.dotmatics.com]

Author information

Authors and Affiliations

Dotmatics Ltd, Windhill, Bishop’s Stortford, CM23 2ND, UK
Christoph Müller, Daniel Ormsby & Michael J Hartshorn
AstraZeneca AB, Peppardsleden 1, Mölndal, 43183, Sweden
Isabella Feierberg, Ola Engkvist & Christian Tyrchan

Corresponding author

Correspondence to Christoph Müller.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Müller, C., Ormsby, D., Feierberg, I. et al. Hit series selection in noisy HTS data: clustering techniques, statistical tests and data visualisations. J Cheminform 6 (Suppl 1), P27 (2014). https://doi.org/10.1186/1758-2946-6-S1-P27

Published: 11 March 2014
DOI: https://doi.org/10.1186/1758-2946-6-S1-P27
