Combining SFCscore with Random Forests leads to improved affinity prediction for protein-ligand complexes

Zilian, D; Sotriffer, CA

doi:10.1186/1758-2946-5-S1-P27

Volume 5 Supplement 1

8th German Conference on Chemoinformatics: 26 CIC-Workshop

Poster presentation
Open access
Published: 22 March 2013

Journal of Cheminformatics volume 5, Article number: P27 (2013)

SFCscore is a collection of empirical scoring functions derived from a set of over 60 descriptors for protein-ligand complexes of known structure [1]. At the time of their derivation, the SFCscore functions were the best-performing scoring functions tested on large heterogeneous data sets, but their overall correlation with experimental affinities still fell short of the desired range. Similarly, despite the ever-increasing amount of structure and affinity data, progress in the development of empirical scoring functions has been rather moderate in recent years. More recently, however, Ballester and Mitchell [2] published a function that outperformed the then state-of-the-art scoring functions when tested against the PDBbind benchmark set [3]. This function uses relatively simple atom contact counts as descriptors and is derived with the Random Forest algorithm. Here, we present a study in which we used Random Forests to derive a new function ("SFCscoreRF") with the SFCscore descriptors as input data. Although this approach is not fully non-parametric, the descriptors are expected to capture the physically relevant interactions more accurately. We tested the new function against the PDBbind benchmark set and the CSAR-NRC HiQ 2010 set [4] and, in addition, performed the Leave-Cluster-Out validation proposed by Kramer and Gedeck for the PDBbind set [5]. The results suggest that the new function significantly improves the predictive power of SFCscore: the correlation between predicted and experimentally determined affinities increases from r² = 0.41 (best previous SFCscore function) to r² = 0.61 (SFCscoreRF) for the PDBbind benchmark set, and from r² = 0.38 to r² = 0.53 for the CSAR data set.
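The two evaluation ideas mentioned in the abstract, the squared correlation r² between predicted and measured affinities and a leave-cluster-out split, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `pearson_r2` and `leave_cluster_out_splits`, the toy affinity values, and the cluster labels are all hypothetical, and r² is assumed here to mean the squared Pearson correlation coefficient. Training the Random Forest itself is left out.

```python
from collections import defaultdict


def pearson_r2(predicted, measured):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov * cov / (var_p * var_m)


def leave_cluster_out_splits(cluster_labels):
    """Yield (held_out_cluster, train_indices, test_indices), one per cluster.

    Each cluster (for example a protein family) is held out in turn, so a
    model is always evaluated on complexes from a cluster it never saw.
    """
    members = defaultdict(list)
    for index, label in enumerate(cluster_labels):
        members[label].append(index)
    for label in sorted(members):
        test = members[label]
        train = [i for i, other in enumerate(cluster_labels) if other != label]
        yield label, train, test


# Hypothetical toy data, NOT taken from the poster.
measured = [4.0, 5.0, 6.0, 7.0, 8.0]      # experimental affinities
predicted = [4.5, 4.8, 6.4, 6.9, 7.6]     # scores from some model
clusters = ["kinase", "protease", "kinase", "protease", "other"]

r2 = pearson_r2(predicted, measured)
splits = list(leave_cluster_out_splits(clusters))
for held_out, train, test in splits:
    print(f"hold out {held_out}: train on {train}, test on {test}")
print(f"r^2 on toy data: {r2:.2f}")
```

In a real workflow, a regressor would be fitted on the `train` indices of each split and scored on the `test` indices, which keeps closely related complexes from appearing on both sides of the split.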

References

1. Sotriffer CA, Sanschagrin P, Matter H, Klebe G: SFCscore: Scoring Functions for Affinity Prediction of Protein-ligand Complexes. Proteins: Struct Funct Bioinf. 2008, 73: 395-419. 10.1002/prot.22058.
2. Ballester PJ, Mitchell JBO: A Machine Learning Approach to Predicting Protein-ligand Binding Affinity with Applications to Molecular Docking. Bioinformatics. 2010, 26: 1169-1175. 10.1093/bioinformatics/btq112.
3. Cheng T, Li X, Li Y, Liu Z, Wang R: Comparative Assessment of Scoring Functions on a Diverse Test Set. J Chem Inf Model. 2009, 49: 1079-1093. 10.1021/ci9000053.
4. Dunbar JB, Smith RD, Yang C-Y, Man-Un Ung P, Lexa KW, Khazanov N, et al: CSAR Benchmark Exercise of 2010: Selection of the Protein-ligand Complexes. J Chem Inf Model. 2011, 51: 2036-2046. 10.1021/ci200082t.
5. Kramer C, Gedeck P: Leave-cluster-out Cross-validation Is Appropriate for Scoring Functions Derived from Diverse Protein Data Sets. J Chem Inf Model. 2010, 50: 1961-1969. 10.1021/ci100264e.

Author information

Authors and Affiliations

Institute of Pharmacy and Food Chemistry, University of Wuerzburg, Am Hubland, 97074, Wuerzburg, Germany
D Zilian & CA Sotriffer

Corresponding author

Correspondence to CA Sotriffer.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Zilian, D., Sotriffer, C. Combining SFCscore with Random Forests leads to improved affinity prediction for protein-ligand complexes. J Cheminform 5 (Suppl 1), P27 (2013). https://doi.org/10.1186/1758-2946-5-S1-P27

Published: 22 March 2013
DOI: https://doi.org/10.1186/1758-2946-5-S1-P27
