Kernel-based estimation of the applicability domain of QSAR models

Fechner, Nikolas; Hinselmann, Georg; Jahn, A; Zell, A

doi:10.1186/1758-2946-2-S1-P38

Volume 2 Supplement 1

5th German Conference on Cheminformatics: 23. CIC-Workshop

Poster presentation
Open access
Published: 04 May 2010

Journal of Cheminformatics volume 2, Article number: P38 (2010)

Machine learning techniques have become a valuable tool for assessing molecular properties without the need for in vitro experiments. Most of these methods, however, give no indication of whether a molecule being predicted is sufficiently described by the knowledge contained in the model. Estimating the reliability of a model-based prediction is therefore an important question in machine-learning-based QSAR modeling.

One approach to this problem is to describe the portion of chemical space covered during the training phase of a model. Any molecule that falls within this subspace is then considered a structure for which the model is valid. Describing the subspace in which a model is regarded as reliable is known as estimating the applicability domain of the model [1].
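As a concrete illustration of such a training subspace, the following minimal sketch assumes that molecules are encoded as numerical descriptor vectors and uses the simplest possible definition, a per-descriptor range (bounding box). This is only one of several common choices and is not claimed to be the definition used in this work; the function names and the toy descriptor values are hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_descriptor_ranges(train: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Return the (min, max) range of each descriptor over the training set."""
    # zip(*train) iterates over descriptor columns instead of molecules.
    return [(min(column), max(column)) for column in zip(*train)]


def in_domain(molecule: Sequence[float], ranges: List[Tuple[float, float]]) -> bool:
    """True if every descriptor value lies inside the training range."""
    return all(low <= value <= high for value, (low, high) in zip(molecule, ranges))


# Toy training set: three molecules, two hypothetical descriptors each.
train = [[1.0, 0.2], [2.5, 0.8], [1.8, 0.5]]
ranges = fit_descriptor_ranges(train)

print(in_domain([2.0, 0.4], ranges))  # True: both values are within range
print(in_domain([3.0, 0.4], ranges))  # False: first descriptor exceeds 2.5
```

A range check of this kind needs explicit descriptor axes, which is the requirement that an implicit kernel feature space does not meet.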

Most machine learning approaches for QSAR rely on a vectorial representation of the molecules. The applicability domain is then expressed as a subspace of the vector space that has one dimension for each descriptor used. This concept cannot be applied directly to kernel-based techniques such as support vector machines. These methods rely on an implicit feature space of unknown dimension that is defined only by the applied kernel similarity. The applicability domain of a kernel-based model therefore has to be defined by means of the kernel. In turn, this allows the use of structured similarity measures, such as the Optimal Assignment Kernel [2] and its extension [3], instead of a numerical encoding. It thus becomes possible to describe the complex chemical structure of many drugs better than descriptors would.
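Why the kernel alone is sufficient can be made explicit with a standard identity. Assuming a positive semidefinite kernel $k$ with implicit feature map $\phi$ (notation introduced here for illustration, not taken from the abstract), the squared distance between two molecules $x$ and $y$ in feature space follows from kernel evaluations only:

```latex
\lVert \phi(x) - \phi(y) \rVert^{2}
  = \langle \phi(x), \phi(x) \rangle
    - 2\,\langle \phi(x), \phi(y) \rangle
    + \langle \phi(y), \phi(y) \rangle
  = k(x, x) - 2\,k(x, y) + k(y, y)
```

No coordinates of $\phi(x)$ are needed, so proximity to the training data can be assessed even when the feature space is never constructed. For similarity measures that are not positive semidefinite, the right-hand side is only a heuristic dissimilarity and may not correspond to a true distance.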

In this work, several approaches to defining the applicability domain of a QSAR model by means of a kernel are presented and compared with each other. The common idea is to extend the concept of kernel density estimation to incorporate further information contained in a trained model. This can be achieved by using a weighted average kernel similarity between the molecule being predicted and the training data set. The weights can be obtained either by exploiting the knowledge contained in the learned model or by approaches that describe the feature space structure using the kernel.
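The weighted average kernel similarity can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the score is taken to be sum_i w_i k(x, x_i) / sum_i w_i, a Gaussian kernel on descriptor vectors stands in for a structured molecular kernel, and the example weights are hypothetical values of the kind that might be derived from a trained model (for instance, magnitudes of support vector coefficients). With uniform weights the score reduces to a plain average kernel similarity, which is proportional to a kernel density estimate.

```python
import math
from typing import Callable, Optional, Sequence

Vector = Sequence[float]


def rbf_kernel(a: Vector, b: Vector, gamma: float = 0.5) -> float:
    """Gaussian kernel; a stand-in for a structured molecular kernel."""
    squared_distance = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-gamma * squared_distance)


def domain_score(
    query: Vector,
    train: Sequence[Vector],
    kernel: Callable[[Vector, Vector], float] = rbf_kernel,
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted average kernel similarity of `query` to the training set."""
    if weights is None:
        # Uniform weights: proportional to a kernel density estimate.
        weights = [1.0] * len(train)
    if len(weights) != len(train):
        raise ValueError("need exactly one weight per training molecule")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    weighted_sum = sum(w * kernel(query, x) for w, x in zip(weights, train))
    return weighted_sum / total


# Toy training set and hypothetical model-derived weights.
train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
model_weights = [0.0, 1.5, 0.5]

near_score = domain_score([0.2, 0.2], train, weights=model_weights)
far_score = domain_score([5.0, 5.0], train, weights=model_weights)
print(near_score > far_score)  # the nearby query scores higher
```

A molecule would then be treated as inside the applicability domain when its score exceeds a threshold, which has to be chosen separately, for example from the distribution of scores on the training set.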

References

1. Jaworska J, Nikolova-Jeliazkova N, Aldenberg T: Altern Lab Anim. 2005, 33: 445-459.
2. Fröhlich H, Wegner J, Sieker F, Zell A: QSAR & Comb Sci. 2006, 25: 317-326. 10.1002/qsar.200510135.
3. Fechner N, Jahn A, Hinselmann G, Zell A: J Chem Inf Mod. 2009, 49: 549-560. 10.1021/ci800329r.

Author information

Authors and Affiliations

University of Tübingen, Sand 1, 72076, Tübingen, Germany
Nikolas Fechner, Georg Hinselmann, A Jahn & A Zell

Rights and permissions

Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License ( https://creativecommons.org/licenses/by-nc/2.0 ), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

About this article

Cite this article

Fechner, N., Hinselmann, G., Jahn, A. et al. Kernel-based estimation of the applicability domain of QSAR models. J Cheminform 2 (Suppl 1), P38 (2010). https://doi.org/10.1186/1758-2946-2-S1-P38

Published: 04 May 2010
DOI: https://doi.org/10.1186/1758-2946-2-S1-P38
