- Open Access
Synergy Maps: exploring compound combinations using network-based visualization
Journal of Cheminformatics, volume 7, Article number: 36 (2015)
The phenomenon of super-additivity of biological response to compounds applied jointly, termed synergy, has the potential to provide many therapeutic benefits. Therefore, high throughput screening of compound combinations has recently received a great deal of attention. Large compound libraries and the feasibility of all-pairs screening can easily generate large, information-rich datasets. Previously, these datasets have been visualized using either a heat-map or a network approach—however these visualizations only partially represent the information encoded in the dataset.
A new visualization technique for pairwise combination screening data, termed “Synergy Maps”, is presented. In a Synergy Map, information about the synergistic interactions of compounds is integrated with information about their properties (chemical structure, physicochemical properties, bioactivity profiles) to produce a single visualization. As a result the relationships between compound and combination properties may be investigated simultaneously, and thus may afford insight into the synergy observed in the screen. An interactive web app implementation, available at http://richlewis42.github.io/synergy-maps, has been developed for public use, which may find use in navigating and filtering larger scale combination datasets. This tool is applied to a recent all-pairs dataset of anti-malarials, tested against Plasmodium falciparum, and a preliminary analysis is given as an example, illustrating the disproportionate synergism of histone deacetylase inhibitors previously described in literature, as well as suggesting new hypotheses for future investigation.
Synergy Maps improve the state of the art in compound combination visualization, by simultaneously representing individual compound properties and their interactions. The web-based tool allows straightforward exploration of combination data, and easier identification of correlations between compound properties and interactions.
Compound combinations have recently received much interest, as they afford a number of advantages as therapeutics compared to single agent treatments across a wide range of disease areas [1–4]. The phenomenon of super-additivity of the therapeutic effect of a combination, known as synergy, has the potential for improved pharmaceutical treatment options in terms of increased efficacy and therapeutically relevant selectivity, whilst reducing the risk of toxicity and side-effects. Two recent reviews are available on the topic [9, 10]. However, how to determine which compound combinations exhibit a desired form of synergy in a particular case is by no means clear, and the effect of multiple bioactive compounds in parallel is overall rather poorly understood.
Synergy in a combination arises from interactions between the biological functions of the component compounds that are not purely additive. Progress has been made in attempts to model synergy, usually by attempting to discover these interactions. For example, models incorporating flux balance analysis (FBA) have been used to correctly predict synergistic interactions in Saccharomyces cerevisiae. Enrichment analysis of molecular and pharmacological properties predicted several combinations to be synergistic, 69% of which were subsequently verified in the literature. Clinical side effect annotations have been used to predict effective combinations, and information from multiple domains has been integrated into a Probability Ensemble Approach to predict both efficacy and adverse effects of combinations with high predictive power. Various network approaches (such as the Stochastic Block Model and the Prism algorithm [15, 16]) have been used to infer novel interactions from large incomplete drug interaction databases such as DrugBank [17, 18]. Biological network topologies of drug targets that lead to synergy have been identified through network modelling, and mechanisms of action of many known non-additive drug combinations have been deduced. However, these models usually require heavily annotated data (such as ATC codes, protein targets or side effect data). A complete understanding of the origins and repercussions of synergy has not yet been achieved in general, and thus significant further work is needed, both experimental and in silico.
To this end, one experimental strategy for measuring synergy has been to assay all pairwise combinations of a relatively small compound library. A recently published example of this type of dataset is the DREAM Drug Sensitivity Challenge (subchallenge 2), in which all combinations of 14 compounds were tested on the LY3 lymphoma cell line. The degree of synergy for each combination was indicated by the difference between the growth inhibition observed experimentally and that predicted under the Bliss Independence model. Other all-pairs combinatorial datasets include a 90 compound set (consisting of drugs and probes) assayed against the HCT116 colon cancer cell line, a set of 11 anticancer drugs also tested against HCT116, a set of 31 antifungal compounds assayed against S. cerevisiae [24, 25], and an assay of 22 antibiotics against Escherichia coli. Each of these datasets measures dose response surfaces, and derives synergy metrics from those surfaces (see original papers for examples). Whilst this is currently a reasonable selection in terms of dataset size, compound variety and assay type, there is potential for many more experiments. An exciting prospect is an upcoming National Cancer Institute Combination Screen of approximately 100 anticancer drugs tested pairwise against the 59 NCI-60 cell lines.
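The Bliss Independence comparison mentioned above can be illustrated with a minimal sketch. It assumes growth inhibition is expressed as a fraction between 0 and 1; the function names and the example values are hypothetical and are not taken from any of the cited datasets.

```python
def bliss_expected(inhibition_a: float, inhibition_b: float) -> float:
    """Expected fractional inhibition if the two compounds act independently."""
    return inhibition_a + inhibition_b - inhibition_a * inhibition_b


def bliss_excess(observed: float, inhibition_a: float, inhibition_b: float) -> float:
    """Observed minus expected inhibition; positive values suggest synergy."""
    return observed - bliss_expected(inhibition_a, inhibition_b)


# Hypothetical single-agent inhibitions of 30% and 40%, with 70% observed jointly.
excess = bliss_excess(0.70, 0.30, 0.40)
print(round(excess, 2))
```

A positive excess indicates more inhibition than independence would predict, a negative excess indicates antagonism, and a value near zero indicates an approximately additive combination.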
Visualizing large numbers of combinations
The influx of this kind of combination data provides a new opportunity to analysts. Conventionally, a first step in a data focused study is an exploratory data analysis, principally focusing on informative visualization of any data collected with the goal of identifying major trends. This can be challenging, due to the structure of combination data, and the geometric scaling of possible combinations with respect to compound library size. Two major approaches have been utilized to visualize combination data in the literature: heatmaps and networks. Heatmaps (see Fig. 1) are featured extensively in the literature [11, 15, 16, 21, 23, 25, 29]. Compounds of the dataset are represented as rows and columns, with their corresponding combinations positioned at the intersecting elements. A color map [11, 15, 16, 21] or gradient may be used to indicate direction and/or degree of non-additivity for each combination. The compounds may be ordered according to a particular physicochemical property, grouped by targeted protein or pathway, hierarchically clustered according to synergy profile, or just alphabetically.
Heatmaps are useful as an uncluttered static presentation of data. It is possible to identify disproportionately synergistic compounds, and also compounds that behave similarly if clustering such as in Cokol et al. and Fig. 1 is applied. Additionally, relevant dose–response matrices may be superimposed [11, 23, 25] to reveal different shapes of response surfaces, which may encode information about the underlying biological network topology [11, 30, 31]. A drawback is that little information about the actual compounds is encoded: they may be ordered according to a physicochemical property, but this limits further possible insight into the dataset. Furthermore, for a large dataset (for example over a hundred compounds), such as those produced using high-throughput techniques, the heatmap quickly becomes cluttered and individual compounds become difficult to identify.
Network representation (see Fig. 2) of all-pairs combination data is also popular [3, 15, 16, 24, 25, 32]: nodes correspond to compounds, and edges to combinations, connecting their components. Edges may be coloured according to sign, and weighted according to degree of synergy. A graph layout algorithm, such as circular or force-directed, is usually employed to position nodes. This type of representation has a tendency to become overcrowded, and threshold values may be required to limit the number of edges. Despite this, networks have the potential to scale better with dataset size than heatmaps, as compounds are positioned in two dimensions rather than along a single one. A notable shortcoming (shared with heatmaps) is that the nature of the compounds in the dataset is not simultaneously well represented: it is only possible to show a few properties, through node color, size or superimposed numbers. An example of this may be found in a recent publication where the cLogP of each compound was superimposed over the relevant node, and the nodes were ordered in a circle, to illustrate the increased potential of lipophilic compounds to participate in synergy. Whilst this may offer insight for the specific publication, it seems unlikely that a single property will satisfactorily explain synergistic behavior for all datasets.
Hence, an improvement in chemical property representation for the visualization of compound combination screens is still very much desirable, which is the objective of the current work.
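The network representation described above, with signed, weighted edges and a threshold to limit overcrowding, can be sketched as follows. The data structure, the colour choices and the example scores are illustrative assumptions rather than the conventions of any cited publication.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    source: str
    target: str
    synergy: float  # signed: positive = synergy, negative = antagonism


def build_edges(scores: dict[tuple[str, str], float], threshold: float) -> list[Edge]:
    """Keep only combinations whose absolute synergy score reaches the threshold."""
    return [
        Edge(a, b, s)
        for (a, b), s in scores.items()
        if abs(s) >= threshold
    ]


def edge_style(edge: Edge) -> dict[str, object]:
    """Map the sign of the score to a colour and its magnitude to a line width."""
    return {
        "color": "red" if edge.synergy > 0 else "blue",
        "width": abs(edge.synergy),
    }


# Hypothetical scores for three compounds A, B and C.
scores = {("A", "B"): 0.8, ("A", "C"): -0.1, ("B", "C"): -0.6}
edges = build_edges(scores, threshold=0.5)
styles = [edge_style(e) for e in edges]
```

Raising the threshold removes weak interactions first, which is the usual way to keep a dense all-pairs network readable.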
Chemical property visualization
Compounds have traditionally been visualized as a scatter plot, obtained by applying a dimensionality reduction algorithm to a descriptor space; a common example is Principal Component Analysis (PCA) applied to physicochemical descriptors. A state-of-the-art equivalent might be the use of Student’s t-distributed Stochastic Neighbour Embedding (t-SNE) on proprietary descriptors. In this way, compounds may be easily compared according to their properties or features; adjacent compounds tend to share properties and behaviour in the descriptor space in question.
In this communication, we introduce a novel type of visualization for combination datasets, named “Synergy Maps”. Synergy Maps combine network and descriptor space representation to yield an information dense presentation of a combination dataset. Specifically, the approach positions the nodes of a drug–drug interaction graph in two-dimensional space using the techniques referred to in the previous section; in this way, synergistic interactions can be straightforwardly related to trends in compound properties, and thus hypotheses for the origins of the synergy might be more quickly proposed. We also introduce an interactive implementation, which enables the generation of synergy maps for novel combination datasets, and allows for exploration of synergy under different spaces, metrics and datasets. Source code is provided as a GitHub repository.
As an example, we produce synergy maps for a combination dataset of 56 antimalarials tested against P. falciparum, and detail a quick analysis of the resultant maps.
An input dataset should consist of compound data in the form of a Structure-Data File (SDF), and data associated with their combinations (including calculated synergy metrics) in the form of a comma separated values (CSV) file (examples provided with the repository). A script is then written (or a default one used), specifying the descriptors, dimensionality reduction techniques and synergy metrics to employ in generating the processed file (example scripts provided with the repository).
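As a rough illustration of the combination half of the input, the sketch below parses a small CSV with Python's standard library. The column names and values are hypothetical; the actual layout is defined by the example files in the repository and may differ. Reading the SDF file requires a cheminformatics toolkit and is omitted here.

```python
import csv
import io

# Hypothetical file contents; real column names may differ from these.
COMBINATIONS_CSV = """compound_a,compound_b,gamma,qc_score
cmpd_1,cmpd_2,0.82,1
cmpd_1,cmpd_3,1.10,5
"""


def read_combinations(handle) -> list[dict]:
    """Parse one row per combination, converting numeric fields."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "pair": (row["compound_a"], row["compound_b"]),
            "gamma": float(row["gamma"]),
            "qc_score": int(row["qc_score"]),
        })
    return rows


combinations = read_combinations(io.StringIO(COMBINATIONS_CSV))
```

In practice the handle would come from `open(path, newline="")` rather than an in-memory string.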
A previously collected all-pairs combination dataset of 56 compounds tested against P. falciparum was selected as an example dataset to concretely illustrate the technique. Each combination was tested in a 6 × 6 dose–response matrix, varying the concentration of each compound on each axis. The change in growth inhibition was measured at each dose combination, yielding a response surface. From this, 9 different synergy metrics were evaluated for all 1,540 combinations. These were then preprocessed into the appropriate input format.
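The figure of 1,540 combinations follows from counting unordered pairs of 56 compounds, as the short check below shows. The total number of dose points is derived here only for illustration, on the assumption that every pair has exactly one complete 6 × 6 matrix.

```python
import math

n_compounds = 56
n_pairs = math.comb(n_compounds, 2)   # unordered pairs of distinct compounds
doses_per_pair = 6 * 6                # one 6 x 6 dose-response matrix per pair
n_dose_points = n_pairs * doses_per_pair

print(n_pairs, n_dose_points)
```

Because the number of pairs is n(n − 1)/2, doubling the library size roughly quadruples the number of combinations to screen and to visualize.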
Compounds were initially standardized using ChemAxon Standardizer, to ensure a consistent representation of compounds. Descriptors for each compound were calculated for physicochemical, structural and biological spaces, each of which may be of relevance to synergy (Table 1). Firstly, all available physicochemical descriptors were calculated using PaDEL. Secondly, Morgan fingerprints of radius 2, folded to 2,048 bits, were generated as structural descriptors using RDKit. Finally, 1,080 Naive Bayes binary models, trained using ChEMBL bioactivities, were used to predict likely (human) protein targets for each structure (notably, the organism of interest in the example is not human, but these descriptors act as reasonable generic biological descriptors).
The dimensionality of each space was then reduced to two dimensions using three different, yet complementary, techniques (Table 2). Principal Component Analysis (PCA) and Multidimensional Scaling (MDS) were run using default parameters in scikit-learn, and Student’s t-distributed Stochastic Neighbor Embedding (t-SNE) was employed using a perplexity of 40. This yielded nine sets of coordinates per compound (three descriptor spaces, each reduced with three techniques).
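To make the reduction step concrete, the sketch below projects a descriptor matrix onto two dimensions with PCA. It is written directly in NumPy as a stand-in for the scikit-learn implementation used in the study, so that the centring and projection steps are visible; the random matrix is a placeholder for real descriptors, and MDS and t-SNE are not shown.

```python
import numpy as np


def pca_2d(descriptors: np.ndarray) -> np.ndarray:
    """Project rows (compounds) onto their first two principal components."""
    centred = descriptors - descriptors.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:2].T


rng = np.random.default_rng(0)
fake_descriptors = rng.normal(size=(56, 20))   # 56 compounds, 20 made-up descriptors
coords = pca_2d(fake_descriptors)              # one (x, y) position per compound
```

Each row of `coords` would then serve as the node position of one compound in the map.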
Due to the relatively small chemical space spanned by the 56 compounds, an additional 175 diverse compounds from MIPE were temporarily added to the dataset, to diversify the space covered and so allow for a better and more consistent dimensionality reduction step. This may not be necessary for a larger and more diverse compound set, but in practice it made the resultant plots more reproducible and transferable. This was especially the case for t-SNE, which has a non-convex objective function and thus can converge to different solutions each time it is run. The additional compounds also allowed a higher perplexity (roughly the expected density of neighbors) to be set, which prevents artificially large gaps opening in the dataset.
The combinations were filtered for quality: firstly through the Quality Control score of the data producer (removing those with a score above 4), then by removing extreme values (top and bottom 2.5% of values sorted by Gamma) on a case by case basis, by checking whether their surfaces appeared unlikely to be genuine (for an example, see Fig. 4). The synergy metrics provided were then standardized, such that an increase in synergy was represented by an increase in magnitude, and a negative sign was used for antagonism for those metrics for which it was defined (Table 3). The processed data were then output as a JSON formatted file.
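The filtering and standardization steps above might look roughly like the following sketch. The field names, the use of a base-10 logarithm for pGamma and the `needs_review` flag are assumptions made for illustration; in the study, extreme values were inspected by eye rather than removed automatically, which the flag is meant to reflect.

```python
import json
import math


def process(combinations: list[dict], qc_max: int = 4, tail: float = 0.025) -> list[dict]:
    """Drop poor-quality rows, flag extreme Gamma values, and add pGamma."""
    kept = [c for c in combinations if c["qc_score"] <= qc_max]
    ranked = sorted(kept, key=lambda c: c["gamma"])
    n_tail = round(len(ranked) * tail)
    # Lowest and highest fraction of Gamma values are candidates for manual review.
    extremes = ranked[:n_tail] + ranked[len(ranked) - n_tail:]
    flagged = {id(c) for c in extremes}
    return [
        {
            **c,
            "p_gamma": -math.log10(c["gamma"]),   # log base is an assumption
            "needs_review": id(c) in flagged,
        }
        for c in kept
    ]


# Hypothetical input: 40 acceptable rows and one that fails quality control.
rows = [
    {"pair": (f"c{i}", f"c{i + 1}"), "gamma": 0.5 + 0.025 * i, "qc_score": 1}
    for i in range(40)
]
rows.append({"pair": ("c0", "c9"), "gamma": 0.9, "qc_score": 5})

processed = process(rows)
payload = json.dumps(processed)   # JSON text ready to be written to a file
```

With this sign convention, Gamma values below 1 give a positive pGamma and values above 1 give a negative pGamma, so larger values consistently mean more synergy.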
The resultant networks generated for the example are shown in Fig. 5, and an annotated version of t-SNE applied to the Bayes Affinity fingerprints with pGamma (negative log of the Gamma metric from Cokol et al.) synergy values is shown in Fig. 6. This representation may allow for the most interesting observations to be made: compounds that are predicted to modulate similar protein targets, and thus potentially share similar modes of action, are clustered together; if similar interactions are observed consistently between clusters, the underlying modes of action of each cluster might be hypothesized to interact as the cause of the synergy.
Results and discussion
Whilst the purpose of this paper is simply to introduce a novel visualization technique rather than analyze the resulting networks, it is possible to illustrate a few observations that may be made; these could be investigated further in subsequent assays. Firstly, we can see that compounds annotated as histone deacetylase (HDAC) inhibitors, which are clustered in the north-east of Fig. 6, appear to be the most likely compounds in the dataset to be synergistic, specifically with the compounds in the center (these are annotated with diverse modes of action, but were often kinase or phosphatase inhibitors). This property has been reported in the literature, where the HDAC inhibitor trichostatin A was found to interact synergistically with geldanamycin, an Hsp90 inhibitor, in P. falciparum. Interestingly, NVP-AUY922, an Hsp90 inhibitor included in the dataset, clustered to the centre; this is likely where geldanamycin would also be placed due to their similar annotated modes of action. This result would be in agreement with the observed trend and suggests that the method might yield some predictive power for unknown combinations. In contrast to this, PI3K inhibitors are shown to exhibit in general disproportionately more antagonism with the other compounds in the dataset. Whilst these observations are by no means reliable by themselves, they may form a basis for further study, and provide an example of how this type of visualization may prove a useful first step in the analysis of pairwise combination data.
In the authors’ opinion, the observations described above are much less clear in the heatmap or network visualization of the data, illustrating the strength of synergy maps. However, there are some problems that arise, principally in ‘over fitting’ an interpretation: trends may appear at random, and as such ‘control’ visualizations should be consulted, to provide a reality check. These can be generated by scrambling compound or combination data, or by using random feature representations to generate compound coordinates, as shown in Fig. 8. Observed trends should certainly be treated with healthy skepticism, although it is likely that with the growth of high quality datasets, these chance correlations will lessen and more may be gained from the approach.
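One way to build such a control is to reassign synergy values at random between combinations while leaving the compound pairs untouched, as in the minimal sketch below. The edge format and function name are assumptions for illustration and do not describe how Fig. 8 was produced.

```python
import random


def scramble_synergy(
    edges: list[tuple[str, str, float]], seed: int = 0
) -> list[tuple[str, str, float]]:
    """Return edges with synergy values randomly reassigned between combinations."""
    rng = random.Random(seed)
    values = [s for _, _, s in edges]
    rng.shuffle(values)
    return [(a, b, s) for (a, b, _), s in zip(edges, values)]


# Hypothetical edges between four compounds.
edges = [("A", "B", 0.8), ("A", "C", -0.1), ("B", "C", -0.6), ("C", "D", 0.3)]
control = scramble_synergy(edges, seed=42)
```

If an apparent trend, such as one cluster being unusually synergistic, appears just as often in scrambled maps as in the real one, it is probably a chance pattern.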
Synergy Maps, a novel method for visualization of a combination dataset, was presented, integrating combination-based information in a network with compound-based information in a dimensionality-reduced scatter plot. An accompanying interactive visualization tool was also introduced, which enables fast and simple exploration and presentation of combination data. An all-pairs combination dataset assayed against P. falciparum was analyzed as an example, identifying several properties already reported in the literature.
Availability and requirements
Project name: Synergy Maps.
Project home page: https://www.github.com/richlewis42/synergy-maps.
Operating system(s): Platform independent/Google Chrome.
Abbreviations
FBA: flux balance analysis
PCA: Principal Components Analysis
MDS: Multi Dimensional Scaling
t-SNE: t-distributed Stochastic Neighbour Embedding
CSV: comma separated values
Yuan S, Wang F, Wang J, Huang P, Chen G, Zhang H et al (2012) Effective elimination of cancer stem cells by a novel drug combination strategy. Stem Cells 31:23–34
Hill JA, Nislow C, Ammar R, Torti D, Cowen LE (2013) Genetic and genomic architecture of the evolution of resistance to antifungal drug combinations. PLoS Genetics 9:e1003390
Tan X, Hu L, Luquette LJ, Gao G, Liu Y, Qu H et al (2012) Systematic identification of synergistic drug pairs targeting HIV. Nat Biotechnol 30:1125–1130
Katouli AA, Komarova NL (2010) Optimizing combination therapies with existing and future CML drugs. PLoS One 5:e12300
Berenbaum MC (1989) What is synergy? Pharmacol Rev 41:93–141
Krueger AS, Avery W, Heilbut AM, Johansen LM, Price ER, Rickles RJ et al (2009) Synergistic drug combinations tend to improve therapeutically relevant selectivity. Nat Biotechnol 27:659–666
Greco WR, Bravo G, Parsons JC (1995) The search for synergy: a critical review from a response surface perspective. Pharmacol Rev 47:331–385
Wang Y-Y, Xu K-J, Song J, Zhao X-M (2012) Exploring drug combinations in genetic interaction network. BMC Bioinform 13(Suppl 7):S7
Ryall RA, Tan AC (2015) Systems biology approaches for advancing the discovery of effective drug combinations. J Cheminform 7:7
Bulusu KC, Guha R, Mason DJ, Lewis RPI, Muratov EN, Motamedi YK et al (2015) Modelling of compound combination effects and applications to efficacy and toxicity: state-of-the-art, challenges and perspectives. Drug Discov Today (in press)
Lehár J, Zimmermann GR, Krueger AS, Molnar RA, Ledell JT, Heilbut AM et al (2007) Chemical combination effects predict connectivity in biological systems. Mol Syst Biol 3:80
Zhao X-M, Iskar M, Zeller G, Kuhn M, Van Noort V, Bork P (2011) Prediction of drug combinations by integrating molecular and pharmacological data. PLoS Comput Biol 7:e1002323
Huang H, Zhang P, Qu XA, Sanseau P, Yang L (2014) Systematic prediction of drug combinations based on clinical side-effects. Sci Rep 4:7160
Li P, Huang C, Fu Y, Wang J, Wu Z, Ru J et al (2015) Large-scale exploration and analysis of drug combinations. Bioinformatics 31(12):2007–2016
Guimerà R, Sales-Pardo M (2013) A network inference method for large-scale unsupervised identification of novel drug-drug interactions. PLoS Comput Biol 9:e1003374
Yeh P, Tschumi AI, Kishony R (2006) Functional classification of drugs by properties of their pairwise interactions. Nat Genet 38:489–494
Guo AC, Knox C, Wishart DS, Pon A, Law V, Banco K (2010) DrugBank 3.0: a comprehensive resource for ‘Omics’ research on drugs. Nucleic Acids Res 39(Database):D1035–D1041
Wishart DS, Knox C, Guo AC, Cheng D, Shrivastava S, Tzur D et al (2007) A knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res 36(Database):D901–D906
Yin N, Ma W, Pei J, Ouyang Q, Tang C, Lai L (2014) Synergistic and antagonistic drug combinations depend on network topology. PLoS One 9:e3960
Jia J, Zhu F, Ma X, Cao Z, Cao ZW, Li Y et al (2009) Mechanisms of drug combinations: interaction and network perspectives. Nat Rev Drug Discov 8:111–128
Bansal M, Yang J, Karan C, Menden MP, Costello JC, Tang H et al (2014) A community computational challenge to predict the activity of pairs of compounds. Nat Biotechnol 32(12):1213–1222
Bliss CI (1939) The toxicity of poisons applied jointly. Ann Appl Biol 26:585–615
Severyn B, Liehr RA, Wolicki A, Nguyen KH, Hudak EM, Ferrer M et al (2011) Parsimonious discovery of synergistic drug combinations. ACS Chem Biol 6:1391–1398
Yilancioglu K, Weinstein ZB, Meydan C, Akhmetov A, Toprak I, Durmaz A et al (2014) Target-independent prediction of drug synergies using only drug lipophilicity. J Chem Inform Model 54(8):2286–2293
Cokol M, Chua HN, Tasan M, Mutlu B, Weinstein ZB, Suzuki Y et al (2011) Systematic exploration of synergistic drug pairs. Mol Syst Biol 7:544
Holbeck S, Collins JM, Doroshow JH (2012) 27 NCI-60 combination screening matrix of approved anticancer drugs. Eur J Cancer 48(Suppl 6):11
Tukey JW (1977) Exploratory data analysis. Addison-Wesley
Tornero-Velez R, Egeghy PP, Cohen Hubal EA (2011) Biogeographical analysis of chemical co-occurrence data to identify priorities for mixtures research. Risk Anal 32:224–236
Zimmermann GR, Keith CT, Lehár J (2007) Multi-target therapeutics: when the whole is greater than the sum of the parts. Drug Discov Today 12:34–42
Yeh P, Kishony R (2007) Networks from drug-drug surfaces. Mol Syst Biol 3:85
Stockwell BR, Giaever G, Nislow C, Lehár J (2008) Combination chemical genetics. Nat Chem Biol 4:674–681
Yeh PJ, Hegreness MJ, Aiden AP, Kishony R (2009) Drug interactions and the evolution of antibiotic resistance. Nat Rev Microbiol 7:460–466
Baur M, Brandes U (2005) Crossing Reduction in Circular Layouts. In: Graph-theoretic concepts in computer science. Springer Berlin Heidelberg, Berlin, Heidelberg, pp 332–343
Fruchterman TMJ, Reingold EM (1991) Graph drawing by force-directed placement. Softw Pract Exp 21:1129–1164
Pearson K (1901) On lines and planes of closest fit to systems of points in space. Philos Mag Ser 6(2):559–572
van der Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9:2579–2605
(2012) Visualization Prospect—Merck Molecular Activity Challenge. https://www.kaggle.com/c/MerckActivity/details/visualization-prospect. Accessed 29 July 2015
Guha R, Mott BT, Eastman RT, Sherlach KS, Siriwardana A, Shinn P (2015) High-throughput matrix screening identifies antimalarial drug combinations. Sci Rep (in review)
Griner LAM, Guha R, Shinn P, Young RM, Keller JM, Liu D et al (2014) High-throughput combinatorial screening identifies drugs that cooperate with ibrutinib to kill activated B-cell-like diffuse large B-cell lymphoma cells. Proc Natl Acad Sci 111(6):2349–2354
ChemAxon: Standardizer. https://www.chemaxon.com/products/standardizer/. Accessed 29 July 2015
Yap CW (2011) PaDEL-descriptor: An open source software to calculate molecular descriptors and fingerprints. J Comput Chem 32:1466–1474
RDKit: Open-source cheminformatics. http://www.rdkit.org. Accessed 29 July 2015
Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A et al (2011) ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res 40:D1100–D1107
Bender A, Jenkins JL, Glick M, Zhan D, Nettles JH, Davies JW (2006) “Bayes affinity fingerprints” Improve retrieval rates in virtual screening and define orthogonal bioactivity space: when are multitarget drugs a feasible concept? J Chem Inf Model 46:2445–2456
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O et al (2011) Scikit-learn: machine learning in python. J Mach Learn Res 12:2825–2830
Pallavi R, Roy N, Nageshan RK, Talukdar P, Pavithra SR, Reddy R et al (2010) Heat shock protein 90 as a drug target against protozoan infections: biochemical characterization of HSP90 from Plasmodium falciparum and Trypanosoma evansi and evaluation of its inhibitor as a candidate drug. J Biol Chem 285:37964–37975
Hunter JD (2007) Matplotlib: a 2D graphics environment. Comput Sci Eng 9:99–104
Christmas R, Avila-Campillo I, Bolouri H, Schwikowski B, Anderson M, Kelley R et al (2005) Cytoscape: a software environment for integrated models of biomolecular interaction networks. AACR Educ Book 2005:12
Morgan HL (1965) The generation of a unique machine description for chemical structures—a technique developed at chemical abstracts service. J Chem Doc 5:107–113
RG produced the data, and carried out the preliminary analysis. RL conceived of and designed the software, carried out the data analysis and drafted the manuscript. AB, TK and RG helped draft the paper. All authors read and approved the final manuscript.
Azedine Zoufir, Yasaman Kalandar Motamedi, Dan Mason and Krishna Bulusu are thanked for their advice and work in the area, and the rest of the Bender Group for helpful feedback on the layout and design of the software. RL thanks EPSRC for funding. TK is supported by a fellowship in computational biology at The Genome Analysis Centre, in partnership with the Institute of Food Research, and strategically supported by BBSRC. AB thanks the European Research Commission for funding (ERC Starting Grant 2013 MIXTURE).
Compliance with ethical guidelines
Competing interests The authors declare that they have no competing interests.