  • Poster presentation
  • Open access

Extended graph-based models for enhanced similarity retrieval in Cavbase

The problem of estimating the similarity between molecular structures is often tackled by means of graph-based approaches, using graphs for structure representation and measures based on the maximum common subgraph as similarity metrics. In the case of protein binding sites as molecular structures, however, where the graphs can be very large, the computation of these measures may easily become infeasible or at least unacceptably slow.

To address this problem, Cavbase [1, 2] was developed, a database for the automatic detection and storage of putative binding sites on protein surfaces. Cavbase assigns so-called pseudocenters to the cavity-flanking amino acids; these pseudocenters characterize the physicochemical properties of the amino acids with respect to molecular recognition. On the one hand, this leads to a smaller and more generic representation of a binding site. On the other hand, it comes with a loss of information, which is usually compensated for by further calculations based on additional data. These steps, however, are usually computationally demanding, which again makes the whole approach very slow.

The main drawback of a graph-based model based solely on pseudocenters is the loss of information about the shape of the protein surface. In this study, we propose an extended modeling formalism that leads to graphs of the same size but containing considerably more information. More specifically, additional descriptors of the surface characteristics are extracted from the surface points stored in Cavbase. These descriptors are included as attributes of the graph nodes, which adds information and allows for more accurate comparisons between different structures.
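The idea of enriching nodes without enlarging the graph can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Node` class, the `surface` descriptor tuple, the tolerance `tol`, and the use of the Euclidean distance as an edge label are all assumptions made for illustration, since the abstract does not specify which surface descriptors are used or how they are compared.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Node:
    """One pseudocenter, used as a graph node (hypothetical layout)."""
    kind: str            # physicochemical type, e.g. "donor" (illustrative label)
    position: tuple      # 3D coordinates of the pseudocenter
    surface: tuple = ()  # assumed surface descriptors added by the extended model


def nodes_match(a, b, tol=None):
    """Decide whether two nodes may be mapped onto each other.

    With tol=None only the pseudocenter type is compared (basic model).
    Otherwise the surface descriptors must also agree within tol
    (extended model, under the assumptions stated above).
    """
    if a.kind != b.kind:
        return False
    if tol is None:
        return True
    return dist(a.surface, b.surface) <= tol


def edge_length(a, b):
    """Assumed edge label: Euclidean distance between two pseudocenters."""
    return dist(a.position, b.position)


# Two donors that look identical in the basic model but differ in their
# (made-up) surface descriptors.
a = Node("donor", (0.0, 0.0, 0.0), (0.9, 0.1))
b = Node("donor", (1.0, 0.0, 0.0), (0.2, 0.8))
basic = nodes_match(a, b)
extended = nodes_match(a, b, tol=0.5)
```

In this sketch the number of nodes is unchanged; only the matching criterion becomes stricter, so a comparison procedure that relies on `nodes_match` can reject pairings that agree in type but differ in local surface shape.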


  1. Schmitt S, Kuhn D, Klebe G: A New Method to Detect Related Function Among Proteins Independent of Sequence and Fold Homology. J Mol Biol. 2002, 323: 387-406. 10.1016/S0022-2836(02)00811-2.

  2. Hendlich M, Rippmann F, Barnickel G: LIGSITE: Automatic and efficient detection of potential small molecule-binding sites in proteins. J Mol Graph Model. 1997, 15: 359-363. 10.1016/S1093-3263(98)00002-3.


Author information


Corresponding author

Correspondence to Timo Krotzky.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Krotzky, T., Fober, T., Mernberger, M. et al. Extended graph-based models for enhanced similarity retrieval in Cavbase. J Cheminform 5 (Suppl 1), P29 (2013).
