Text-based similarity searching for hit- and lead-candidate identification
Journal of Cheminformatics volume 4, Article number: O12 (2012)
The Pharmacophore Alignment Search Tool (PhAST) is a string-based approach to virtual screening. Molecules are represented by linear sequences that describe their respective patterns of potential interactions. The problem of molecule linearization is tackled by applying Minimum Volume Embedding in combination with a Diffusion Kernel to the molecular graph [1, 2]. Linear representations are compared using global pairwise sequence alignment [3]. PhAST exhibited enrichment capabilities comparable or superior to those of most common virtual screening approaches. Compound rankings were shown to be dissimilar to those of other virtual screening techniques. It was also shown that emphasizing key interactions by applying position-specific weights in the alignment process significantly increases enrichment.
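The comparison step can be illustrated with a minimal sketch of global (Needleman-Wunsch) alignment scoring. It is not the PhAST implementation: the symbol alphabet, the match, mismatch and gap values, the linear (rather than affine) gap penalty, and the rule that a position-specific weight only scales the substitution score at a query position are all assumptions made for illustration.

```python
def global_alignment_score(query, candidate, match=1.0, mismatch=-1.0,
                           gap=-1.0, weights=None):
    """Score of the best global alignment of two symbol sequences.

    weights[i] (hypothetical) scales the substitution score at position i
    of the query, so that key interaction sites count more.
    """
    if weights is None:
        weights = [1.0] * len(query)
    if len(weights) != len(query):
        raise ValueError("need exactly one weight per query position")

    n, m = len(query), len(candidate)
    # dp[i][j] holds the best score for aligning query[:i] with candidate[:j].
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + gap  # query prefix aligned to gaps only
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap  # candidate prefix aligned to gaps only

    for i in range(1, n + 1):
        w = weights[i - 1]
        for j in range(1, m + 1):
            s = match if query[i - 1] == candidate[j - 1] else mismatch
            dp[i][j] = max(
                dp[i - 1][j - 1] + w * s,  # align the two symbols
                dp[i - 1][j] + gap,        # gap in the candidate
                dp[i][j - 1] + gap,        # gap in the query
            )
    return dp[n][m]
```

Calling `global_alignment_score("ADH", "ADH", weights=[1.0, 1.0, 3.0])` would weight the third query position three times as heavily as the others; the letters here are placeholder symbols, not the actual PhAST alphabet.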
The significance of chemical similarity was determined in the form of p-values of global alignment scores. These were calculated with an approach based on Markov chain Monte Carlo simulation, adapted from its original application to local alignments of protein sequences [4]. Bonferroni correction was used to correct p-values with respect to the size of the screening library [5].
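The correction step can be sketched as follows, assuming the standard Bonferroni form in which each raw p-value is multiplied by the number of comparisons (here taken to be the number of library compounds) and capped at 1. The function name and arguments are hypothetical, and the Markov chain Monte Carlo estimation of the raw p-value is not shown.

```python
def bonferroni_corrected_p(p_value, library_size):
    """Bonferroni-correct one p-value for `library_size` comparisons."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if library_size < 1:
        raise ValueError("library_size must be at least 1")
    # Multiply by the number of tests and cap at 1, since the result
    # is still interpreted as a probability.
    return min(1.0, p_value * library_size)
```

Under this assumption, a raw p-value of 1e-6 in a library of 10,000 compounds corresponds to a corrected value of about 0.01.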
PhAST was employed in two prospective applications. A screening for non-nucleoside analogue inhibitors of bacterial thymidine kinase yielded a hit with a distinct structural framework but only weak activity. A screening for gamma-secretase modulators that are not members of the NSAID (non-steroidal anti-inflammatory drug) class resulted in a potent modulator that is structurally clearly distinct from the reference compound.
1. Shaw R, Jebara T: Minimum Volume Embedding. Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics: 21-24 March 2007; San Juan (Puerto Rico). Edited by: Meila M, Shen X. 2007, Omnipress, 460-467.
2. Smola AJ, Kondor RI: Kernels and Regularization on Graphs. Proceedings of the 16th Annual Conference on Computational Learning Theory and 7th Kernel Workshop: 24-27 August 2003; Washington DC. Edited by: Schölkopf B, Warmuth M. 2003, Springer, 144-158.
3. Durbin R, Eddy S, Krogh A, Mitchison G: Alignment with affine gap scores. Biological Sequence Analysis. 1998, Cambridge University Press, 29-31.
4. Hartmann AK: Sampling rare events: statistics of local sequence alignments. Phys Rev E Stat Nonlin Soft Matter Phys. 2002, 65 (5 Pt 2): 056102.
5. Bonferroni CE: Teoria statistica delle classi e calcolo delle probabilità. Pubblicazioni del R Istituto Superiore di Scienze Economiche e Commerciali di Firenze. 1936, 8: 3-62.
Hähnke, V. Text-based similarity searching for hit- and lead-candidate identification. J Cheminform 4 (Suppl 1), O12 (2012). https://doi.org/10.1186/1758-2946-4-S1-O12