Limits to molecular matched-pair analysis: the experimental uncertainty case

Matched molecular pair (MMP) analysis has recently emerged as a data analysis technique in medicinal chemistry. It quickly gained scientific momentum because it tackles key questions in lead optimization. In contrast to classical global QSAR models, which attempt to predict the absolute values of ADME (absorption, distribution, metabolism, excretion) and toxicological properties, MMP analyses predict the difference in (bio)chemical properties that can be expected from small chemical modifications to lead structures, with a much smaller and better-controlled error than global QSAR models.

The power of MMP analysis depends on the number of previously documented similar molecular transformations, and the definition of chemical similarity plays a key role here. The more generous the definition of similarity of the anchoring region, the more examples are available. The stricter the definition of similarity, the lower the variability and thus the clearer the effect on ADME-Tox parameters, but also the fewer data pairs will be available [1].

The (bio)chemical effect and the significance of the results depend on the experimental uncertainty (noise) in the data. There is a clear mathematical association between the noise level and the minimum activity difference necessary for statistical significance. Here we demonstrate how experimental uncertainty and variability [2, 3] affect matched molecular pair analysis. For small sample sizes (context-specific MMPs), activity differences have to be very large to be statistically significant. We develop a full equation for estimating the minimum significant activity difference as a function of the number of samples, the standard deviation of the measurements, and the true variance of the biochemical effect. The influence of the consistency of assay setups can be quantified directly via the variability, and practical consequences for MMP analysis will be presented.
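The full equation is not reproduced in this abstract. As a rough illustration of the kind of relationship described, the following minimal sketch assumes that the two measurements in each pair carry independent errors with the same standard deviation, that the true effect of the transformation varies between contexts with its own standard deviation, and that a two-sided normal approximation is acceptable. The function name, its parameters, and the numeric inputs are hypothetical and are not taken from the authors' work.

```python
from math import sqrt
from statistics import NormalDist


def min_significant_difference(n_pairs, sd_measurement, sd_effect=0.0, alpha=0.05):
    """Smallest mean activity difference distinguishable from zero.

    Assumption (not from the abstract): two-sided test, normal approximation.
    """
    if n_pairs < 1:
        raise ValueError("n_pairs must be at least 1")
    # A pair difference subtracts two independently measured values, so the
    # measurement noise contributes 2 * sd_measurement**2 to its variance.
    # sd_effect is the assumed context-to-context spread of the true effect.
    var_pair = 2.0 * sd_measurement ** 2 + sd_effect ** 2
    # Standard error of the mean difference over n_pairs matched pairs.
    std_error = sqrt(var_pair / n_pairs)
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * std_error


# Illustrative only: 0.5 log units is an assumed measurement noise level.
for n in (1, 3, 10, 30, 100):
    print(n, round(min_significant_difference(n, sd_measurement=0.5), 2))
```

Under these assumptions the threshold shrinks only with the square root of the number of pairs. Replacing the normal quantile with a Student's t quantile, which is more appropriate for few pairs, would raise the threshold further for small sample sizes.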


  1. Papadatos G, Alkarouri M, Gillet VJ, Willett P, Kadirkamanathan V, Luscombe CN, Bravi G, Richmond NJ, Pickett SD, Hussain J, Pritchard JM, Cooper AWJ, Macdonald SJF: Lead Optimization Using Matched Molecular Pairs: Inclusion of Contextual Information for Enhanced Prediction of hERG Inhibition, Solubility, and Lipophilicity. J Chem Inf Model. 2010, 50: 1872-1886. 10.1021/ci100258p.

  2. Kramer C, Kalliokoski T, Gedeck P, Vulpetti A: The Experimental Uncertainty of Heterogeneous Public Ki Data. J Med Chem. 2012, 55: 5165-5173. 10.1021/jm300131x.

  3. Kalliokoski T, Kramer C, Vulpetti A, Gedeck P: Comparability of Mixed IC50 Data – A Statistical Analysis. PLoS ONE. 2013, 8 (4): e61007. 10.1371/journal.pone.0061007.


Author information



Corresponding author

Correspondence to Christian Kramer.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article

Cite this article

Kramer, C., Liedl, K. Limits to molecular matched-pair analysis: the experimental uncertainty case. J Cheminform 6, O6 (2014).



  • Experimental Uncertainty
  • Activity Difference
  • True Variance
  • Data Analysis Technique
  • Lead Optimization