
Searching substructures in fragment spaces


Fragment spaces (FSs) are an elegant way to model a large or even infinite number of chemical compounds together with their synthetic accessibility. An FS consists of molecular fragments and a set of rules defining how fragments can be combined into products. In virtual screening experiments, FSs might include products with undesired functional groups or inadequate central building blocks. Recognizing such products, especially when the undesired substructure spans multiple fragments, would require constructing them explicitly from the FS. Because the number of possible products in an FS is generally huge, complete enumeration is undesirable or even impossible. Therefore, algorithms that perform substructure search in FSs must be able to process fragments and joining rules rather than complete molecules. Even though some algorithms that work in FSs exist [1, 2], a method that excludes undesired products from an FS via a substructure definition is still missing.


We present and compare two algorithms that modify an FS such that no possible product can include a given functional group or substructure. The methods use a search procedure based on either the Ullmann algorithm [3] or the VF2 algorithm [4] for subgraph isomorphism. With this procedure, we find substructures that are present inside fragments or that would be formed by joining two fragments. Once such fragments are identified, they are either removed from the FS or their joining rules are altered so that formation of the substructure becomes impossible.
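The two-stage idea described above (search inside fragments, then search across a single join) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not taken from the abstract: the toy fragment encoding, the link types `L1` and `L2`, the helper names `has_substructure`, `join`, and `filter_space`, and the choice to drop a joining rule for all fragments once any pair forms the pattern. The matcher is a plain backtracking search standing in for Ullmann or VF2, and it compares element labels only, not SMARTS.

```python
from itertools import combinations_with_replacement

def has_substructure(pattern, target):
    """Naive backtracking subgraph search (a stand-in, not Ullmann or VF2)."""
    p_atoms, p_bonds = pattern
    t_atoms, t_bonds = target
    p_adj = {frozenset(b) for b in p_bonds}
    t_adj = {frozenset(b) for b in t_bonds}

    def extend(mapping):
        k = len(mapping)                      # next pattern atom to place
        if k == len(p_atoms):
            return True
        for cand in range(len(t_atoms)):
            if cand in mapping or t_atoms[cand] != p_atoms[k]:
                continue
            # Every pattern bond to an already placed atom must exist in the target.
            ok = all(frozenset((mapping[i], cand)) in t_adj
                     for i in range(k) if frozenset((i, k)) in p_adj)
            if ok and extend(mapping + [cand]):
                return True
        return False

    return extend([])

def join(frag_a, atom_a, frag_b, atom_b):
    """Build the product graph formed by bonding atom_a of frag_a to atom_b of frag_b."""
    offset = len(frag_a["atoms"])
    atoms = frag_a["atoms"] + frag_b["atoms"]
    bonds = list(frag_a["bonds"])
    bonds += [(i + offset, j + offset) for i, j in frag_b["bonds"]]
    bonds.append((atom_a, atom_b + offset))
    return atoms, bonds

def filter_space(fragments, rules, pattern):
    # Stage 1: remove fragments that already contain the pattern.
    kept = {name: f for name, f in fragments.items()
            if not has_substructure(pattern, (f["atoms"], f["bonds"]))}
    # Stage 2: drop joining rules that would create the pattern across one join.
    # Simplification (assumption): the rule is removed for every fragment.
    rules = set(rules)
    for (_, fa), (_, fb) in combinations_with_replacement(list(kept.items()), 2):
        for ia, ta in fa["links"].items():
            for ib, tb in fb["links"].items():
                rule = frozenset((ta, tb))
                if rule in rules and has_substructure(pattern, join(fa, ia, fb, ib)):
                    rules.discard(rule)
    return kept, rules

# Hypothetical toy space: "links" maps an atom index to its link type.
fragments = {
    "methoxy":  {"atoms": ["C", "O"], "bonds": [(0, 1)], "links": {1: "L1"}},
    "methyl":   {"atoms": ["C"], "bonds": [], "links": {0: "L2"}},
    "peroxide": {"atoms": ["C", "O", "O"], "bonds": [(0, 1), (1, 2)], "links": {2: "L2"}},
}
rules = {frozenset(("L1", "L1")), frozenset(("L1", "L2"))}
peroxide_pattern = (["O", "O"], [(0, 1)])     # undesired O-O bond

kept, new_rules = filter_space(fragments, rules, peroxide_pattern)
```

In this toy space, the `peroxide` fragment is removed because it contains the O-O pattern itself, and the `L1`-`L1` rule is dropped because joining two `methoxy` fragments would create an O-O bond. A less coarse variant could forbid only the specific fragment pairs instead of the whole rule.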


The algorithms are tested on the BRICS fragment space [1]. We exclude substructures described by SMARTS patterns that were collected from the literature [5]. The experiments show that the VF2-based approach has the shorter running time.


  1. Degen J, Wegscheid-Gerlach C, Zaliani A, Rarey M: On the art of compiling and using 'drug-like' chemical fragment spaces. ChemMedChem. 2008, 3 (10): 1503-1507. 10.1002/cmdc.200800178.


  2. Rarey M, Stahl M: Similarity searching in large combinatorial chemistry spaces. J Comput Aided Mol Des. 2001, 15 (6): 497-520. 10.1023/A:1011144622059.


  3. Ullmann JR: An algorithm for subgraph isomorphism. J Assoc Comput Mach. 1976, 23: 31-42.


  4. Cordella LP, Foggia P, Sansone C, Vento M: A (sub)graph isomorphism algorithm for matching large graphs. IEEE T-PAMI. 2004, 26 (10): 1367-1372.


  5. Schomburg K, Ehrlich H-C, Stierand K, Rarey M: From structure diagrams to visual chemical patterns. J Chem Inf Model. 2010,



Corresponding author

Correspondence to H-C Ehrlich.

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Ehrlich, HC., Rarey, M. Searching substructures in fragment spaces. J Cheminform 3 (Suppl 1), P11 (2011).
