Discriminating agonist and antagonist ligands of the nuclear receptors using 3D-pharmacophores

Lagarde, Nathalie; Delahaye, Solenne; Zagury, Jean-François; Montes, Matthieu

doi:10.1186/s13321-016-0154-2

Research article
Open access
Published: 06 September 2016

Nathalie Lagarde¹,
Solenne Delahaye¹,
Jean-François Zagury¹ &
…
Matthieu Montes ORCID: orcid.org/0000-0001-5921-460X¹

Journal of Cheminformatics volume 8, Article number: 43 (2016) Cite this article

Abstract

Nuclear receptors (NRs) constitute an important class of therapeutic targets. We evaluated the performance of 3D structure-based and ligand-based pharmacophore models in predicting the pharmacological profile of NR ligands using the NRLiSt BDB database. We were able to generate selective pharmacophores for agonist and antagonist ligands, and we found that the best performance was obtained by combining the structure-based and the ligand-based approaches. The combination of generated pharmacophores covered most of the chemical space of the NRLiSt BDB datasets. By screening the whole NRLiSt BDB against our 3D pharmacophores, we demonstrated their selectivity towards their dedicated NR ligands. The 3D pharmacophores presented herein can thus be used as predictors of the pharmacological activity of NR ligands.

Background

Nuclear receptors (NRs) are involved in a wide range of key physiological functions. They are potential targets for numerous diseases and constitute an important class of therapeutic targets [1, 2]. NRs are transcription factors naturally switched on and off by small-molecule hormones, and artificially by synthetic ligands. Taking advantage of the biological potency of the NRs, a large number of compounds have been proposed to modulate their activity and some of them are still marketed [3, 4]. NR ligands can be classified according to their pharmacological profiles, the two main classes being agonist and antagonist ligands. These two classes of compounds act by binding to an NR and activating (agonist ligands) or inhibiting (antagonist ligands) its activity. The drug discovery process is thus not limited to the search for the best ligand of a given target, but consists in the search for a ligand with a pharmacological profile that is compatible with the required activity. In this context, the ability to predict the agonist or antagonist behaviour of an NR ligand is of major importance. In recent years, virtual screening methods have proven their ability to predict the activity of small compounds [5–7] and can be used to predict the pharmacological profile of NR ligands. Numerous ligand-based (LB) and structure-based (SB) virtual screening studies dedicated to NRs have been conducted, but only a few focused on the agonism/antagonism issue [8–17]. Despite these prediction attempts and the elucidation of the molecular bases of agonism and antagonism [18–21], discriminating agonist from antagonist ligands based solely on their structure remains a challenge. In this study, we describe a 3D pharmacophore modeling study performed on 27 NRs, with the aim of providing separate and selective agonist and antagonist pharmacophores for each NR. 
To our knowledge, this is the first large-scale study conducted to predict the agonist and antagonist behaviour of NR ligands using a 3D pharmacophore modeling method. 3D pharmacophores are nowadays widely used as filters in virtual screening protocols and several studies successfully identified new NR ligands using pharmacophore models [22–30]. Pharmacophore models display two main advantages: reduced computational times associated with the simplified pharmacophoric representations, and a large diversity of potential hits with scaffolds and functional groups distinct from those of the original ligands [31, 32]. To design our study, we used the 27 NRLiSt BDB datasets [33]. For each dataset, we created both SB and LB 3D pharmacophores and compared the ability of these two approaches to generate agonist selective pharmacophores and antagonist selective pharmacophores covering the whole chemical space of the NRLiSt BDB ligands. We also studied the performance obtained using a combination of SB and LB pharmacophores and analyzed the composition of these combinations and their selectivity against all NR datasets. In the present study, we describe our attempt to develop selective pharmacophores for agonist or antagonist ligands that could be used to predict the pharmacological activity of NR ligands.

Methods

Nuclear receptors ligands and structures benchmarking DataBase (NRLiSt BDB)

The NRLiSt BDB [33] is a freely available benchmarking database dedicated to the NRs for the evaluation of both SB and LB methods. The NRLiSt BDB presents separate agonist and antagonist datasets for the 27 targets (out of the 48 known NRs) for which more than one agonist ligand, one antagonist ligand, and at least one experimental structure was available. For each target, all of the ligands reported as agonists or antagonists in the scientific literature are provided in two separate datasets, together with all of the available human holo PDB structures (except for RXR_gamma, for which only one apo structure was available). A total of 7853 actives, 458,981 decoys, and 339 structures are divided into 54 datasets. The NRLiSt BDB was downloaded from the Web site http://nrlist.drugdesign.fr.

LigandScout

3D pharmacophores were generated using the software LigandScout [34] (version 4.0) in SB and LB approaches.

Structure-based approach

3D SB pharmacophores were automatically generated using the PDB structures included in the NRLiSt BDB. This approach is only possible with holo structures, thus no RXR_gamma 3D SB pharmacophore could be computed. In this approach, the LigandScout algorithm tags the key features of the ligand that interact with the residues of the receptor: aromatic ring, hydrophobic area, hydrogen bond donor or acceptor, negative or positive ionisable atom and metal binding location. To complete the pharmacophore, an ensemble of exclusion volume spheres is generated to represent the shape of the active site.

Ligand-based approach

All ligands of each dataset were clustered with LigandScout using default settings, except for the cluster distance, which was adjusted for each NR to obtain balanced clusters. For each cluster, a 3D LB pharmacophore was generated using the “merged feature pharmacophore approach” with the number of omitted features for a given merged pharmacophore set to 4 and optional partially matching features with a threshold set to 10 %. In this approach, all the features observed in each ligand of the training datasets are identified, scored and removed according to the threshold number of omitted features. We chose to enable the creation of exclusion volume spheres around the alignment of ligands. In some cases, we manually added exclusion volume spheres to remove decoy compounds, since inactive compounds can map all the required pharmacophore features, their inactivity being explained by steric clashes with the binding site [35, 36]. For each pharmacophore, the ligands of the cluster used to generate the pharmacophore constituted the training set and the test set was formed by all agonist ligands and all antagonist ligands of the corresponding NR. During the pharmacophore generation, the ligands of the training set were automatically aligned with the LigandScout pharmacophore-based alignment algorithm [37].

Model optimization protocol

The generated 3D pharmacophores were used to screen the NR datasets. All of the ligands provided in SMILES format in the NRLiSt BDB were converted to .ldb format using the idbgen tool provided with LigandScout with the omega-fast option. Two databases were used for each screening: a screening database of active compounds and a screening database of decoys. Agonist ligands were used as decoys for antagonist pharmacophores and, reciprocally, antagonist ligands were used as decoys for agonist pharmacophores. We developed an original model optimization protocol for this study (Fig. 1) to sequentially refine the pharmacophore models according to several literature recommendations [32, 38, 39]. For each pharmacophore, a first screening was performed with the LigandScout default settings, in particular with the Max. number of omitted features parameter set to 0. If the hits retrieved with this first screening contained both agonist and antagonist ligands, the pharmacophore was not validated and was not retained. If only agonist or only antagonist ligands were retrieved in this first screening, the pharmacophore was validated and a second screening was performed with this pharmacophore, but with the Max. number of omitted features parameter set to 1. This second screening was carried out to identify possible non-essential pharmacophore features, i.e. features that can be disabled to obtain less stringent pharmacophores able to retrieve more active ligands (agonist ligands when using agonist datasets or antagonist ligands when using antagonist datasets), but no decoys (antagonists when using agonist datasets and agonists when using antagonist datasets). When a non-essential pharmacophore feature was identified, a third screening was performed with the non-essential feature marked as disabled and the Max. number of omitted features parameter set to 0. 
If the hits retrieved with this third screening contained both agonist and antagonist ligands, this second pharmacophore was not validated and another round of identification of non-essential features was performed. If only active ligands were retrieved, the pharmacophore was validated and other non-essential features were studied. This protocol was applied to each pharmacophore until 3 pharmacophore features were retained or until no non-essential feature could be identified.
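The refinement loop described above can be illustrated with a minimal Python sketch. It is not the LigandScout workflow: the function names (`screen`, `refine`), the representation of ligands and pharmacophores as sets of feature labels, and the rule of disabling the feature that gains the most actives are all assumptions made for illustration. A real screening matches features in 3D and accounts for exclusion volumes.

```python
from typing import FrozenSet, List, Optional, Set

Ligand = FrozenSet[str]  # toy ligand: the set of feature labels it can match


def screen(features: Set[str], ligands: List[Ligand], max_omitted: int = 0) -> Set[int]:
    """Toy stand-in for a 3D screening (assumption): a ligand is a hit
    when it misses at most `max_omitted` of the pharmacophore features."""
    return {i for i, lig in enumerate(ligands) if len(features - lig) <= max_omitted}


def refine(features: Set[str], actives: List[Ligand], decoys: List[Ligand],
           min_features: int = 3) -> Optional[Set[str]]:
    """Return a refined selective feature set, or None if the model is rejected."""
    current = set(features)
    # First screening (0 omitted features): any decoy hit rejects the model.
    if screen(current, decoys):
        return None
    while len(current) > min_features:
        strict_hits = screen(current, actives)
        # Second screening (1 omitted feature): can any extra active be gained?
        if screen(current, actives, max_omitted=1) == strict_hits:
            break
        best = None
        for feat in sorted(current):
            trial = current - {feat}
            # Third screening: the feature is disabled, 0 omitted features.
            if screen(trial, decoys):
                continue  # decoys retrieved: this variant is not validated
            gain = len(screen(trial, actives)) - len(strict_hits)
            if gain > 0 and (best is None or gain > best[0]):
                best = (gain, feat)  # hypothetical tie-break: largest gain wins
        if best is None:
            break  # no non-essential feature could be identified
        current.remove(best[1])
    return current


# Invented toy data: three agonists (actives) and two antagonists (decoys).
actives = [frozenset({"H1", "H2", "HBA", "HBD", "AR"}),
           frozenset({"H1", "H2", "HBA", "AR"}),
           frozenset({"H1", "H2", "HBA", "HBD"})]
decoys = [frozenset({"H1", "H2", "HBD", "AR"}),
          frozenset({"H1", "H2", "HBA"})]
refined = refine({"H1", "H2", "HBA", "HBD", "AR"}, actives, decoys)
```

In this toy case the aromatic feature is disabled because doing so retrieves one more active without retrieving any decoy, and the loop then stops because every further removal either gains nothing or lets a decoy through.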

Combination of SB pharmacophores, combination of LB pharmacophores and combination of SBLB pharmacophores

Using the SB approach, for each NR, all the selective pharmacophores generated, i.e. all pharmacophores that retrieved only agonist ligands or antagonist ligands, were gathered into two groups: “SB agonist selective pharmacophores” and “SB antagonist selective pharmacophores”; redundant pharmacophores were removed. Similarly, all the selective pharmacophores obtained with the LB approach for each NR were gathered into two groups: “LB agonist selective pharmacophores” and “LB antagonist selective pharmacophores”; redundant pharmacophores were removed. Finally, the SB and LB selective pharmacophores previously generated were gathered in two pharmacophore ensembles: “SBLB agonist selective pharmacophores” and “SBLB antagonist selective pharmacophores”; redundant pharmacophores were removed.

Redundant pharmacophores are pharmacophores that can be removed without decreasing the recall of the set of combined pharmacophores, i.e. pharmacophores that only retrieved ligands that were also retrieved by other pharmacophores of the set. To remove these redundant pharmacophores, all generated pharmacophores were ranked according to the number of hits they retrieved. Then, each pharmacophore was removed sequentially, starting from the pharmacophore associated with the smallest number of hits. For each removal, the impact on the recall was evaluated. If the recall was not affected, the pharmacophore was dismissed; conversely, if the recall decreased, the pharmacophore was conserved.
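The redundancy filter described above amounts to a greedy pass over the pharmacophores, from the fewest hits to the most. A minimal Python sketch follows; the function name `remove_redundant`, the dictionary layout and the toy hit lists are hypothetical. Because selective pharmacophores retrieve only active ligands, the recall is unchanged exactly when the union of the remaining hit sets is unchanged.

```python
from typing import Dict, Set


def remove_redundant(hits_by_model: Dict[str, Set[int]]) -> Dict[str, Set[int]]:
    """Drop pharmacophores whose removal leaves the set of retrieved ligands
    (and therefore the recall) unchanged, starting from the fewest hits."""
    kept = dict(hits_by_model)
    full_coverage = set().union(*kept.values())
    for name in sorted(hits_by_model, key=lambda n: len(hits_by_model[n])):
        # Ligands still retrieved if this pharmacophore were removed.
        others = set().union(*(h for n, h in kept.items() if n != name))
        if others == full_coverage:
            del kept[name]  # recall not affected: dismissed
        # otherwise the recall would decrease: conserved
    return kept


# Invented example: ligand identifiers retrieved by four pharmacophores.
models = {"P1": {1, 2, 3, 4}, "P2": {1, 2}, "P3": {4, 5}, "P4": {5}}
kept = remove_redundant(models)
```

Here P4 and P2 are dismissed because their hits are already covered, while P3 is conserved because it is then the only model retrieving ligand 5.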

Performance metrics

All the graphs were produced with the statistical and graphical tool R (http://www.r-project.org/). The ggplot2 package was used to produce the bar plots of Figs. 2, 3, 4, 5 and 6. The corrplot and RColorBrewer packages were used to produce the graph of pharmacophore selectivity using the recall (R) value (Fig. 9). For each dataset, the recall (R), the specificity (Sp) and the Matthews correlation coefficient (MCC) were computed as follows:

$$R = \frac{TP}{TP + FN};\quad Sp = \frac{TN}{TN + FP};\quad MCC = \frac{TP \times TN - FP \times FN}{{\sqrt {\left( {TP + FN} \right)\left( {TN + FP} \right)\left( {TP + FP} \right)\left( {TN + FN} \right)} }}$$

with TP the number of true positives (number of active compounds of the dataset retrieved as screening hits), FN the number of false negatives (number of active compounds of the dataset not retrieved as screening hits), TN the number of true negatives (number of inactive compounds of the dataset not retrieved as screening hits), and FP the number of false positives (number of inactive compounds of the dataset retrieved as screening hits). As we chose to generate only selective agonist or antagonist pharmacophores, the number of FP was always equal to 0 and thus the Sp value was always equal to 1. In the SB approach, when the number of TP was also equal to 0, the MCC value could not be computed (the factor TP + FP, and hence the denominator, is equal to 0) and was reported as not determined (ND).
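As an illustration of these definitions, the following Python sketch computes R, Sp and MCC from the four counts and returns `None` where the text reports ND. The function name `screening_metrics` and the example counts are invented for illustration and are not values from Table 1.

```python
import math
from typing import Optional, Tuple


def screening_metrics(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float, Optional[float]]:
    """Return (recall, specificity, MCC); MCC is None when its denominator is 0 (ND)."""
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    denominator = math.sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator > 0 else None
    return recall, specificity, mcc


# Invented counts: 100 actives (80 retrieved), 50 inactives, none retrieved.
r, sp, mcc = screening_metrics(tp=80, fn=20, tn=50, fp=0)

# A selective model that retrieves no active at all: MCC is not determined.
r0, sp0, mcc0 = screening_metrics(tp=0, fn=10, tn=5, fp=0)
```

With FP = 0 the MCC formula reduces to sqrt(R × TN / (TN + FN)), so for a selective pharmacophore the MCC depends only on the recall and on the ratio of inactive compounds to missed actives.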

Results

Structure-based pharmacophore modeling

A total of 338 3D SB pharmacophores were generated from the 339 PDB structures included in the NRLiSt BDB. The protein structures included in the NRLiSt BDB are classified according to the pharmacological profile of the ligand bound in the active site: 266 agonist-bound structures, 17 antagonist-bound structures, and 55 other-bound structures (partial agonists, modulators, inverse agonists, etc.). Since only 1 apo structure was available for RXR_gamma, no 3D SB pharmacophore could be generated and this NR was excluded from this part of the study. At least one agonist-bound structure was available for 25 of the 26 remaining NRs, and at least one antagonist-bound structure for 10 of them. Using the screening protocol described in the “Methods” section, we succeeded in generating at least one pharmacophore that was selective for agonist ligands for 25 NRs out of the 26 used, and at least one pharmacophore selective for antagonist ligands for 9 NRs out of 26. As presented in the “Methods” section, all these pharmacophores were gathered into two groups, “SB agonist selective pharmacophores” and “SB antagonist selective pharmacophores”, and redundant pharmacophores were removed. The average recall for the “SB agonist selective pharmacophores” was 55 %, ranging from 0 % for SF1 to 98 % for PPAR_gamma, whereas the average recall for the “SB antagonist selective pharmacophores” was 8 %, ranging from 0 % for AR, CAR, ERR_alpha, FXR_alpha, LXR_alpha, LXR_beta, PPAR_alpha, PPAR_beta, PPAR_gamma, PXR, RAR_beta, RAR_gamma, RXR_beta, SF1, TR_alpha, TR_beta and VDR to 61 % for RAR_alpha (Fig. 2; Table 1). 
The average MCC value for the “SB agonist selective pharmacophores” was 0.484, ranging from 0.088 for CAR (the SF1 MCC value was ND) to 0.881 for RXR_alpha, whereas the average MCC value for the “SB antagonist selective pharmacophores” was 0.326, ranging from 0.097 for RXR_alpha (the MCC value was ND for AR, CAR, ERR_alpha, FXR_alpha, LXR_alpha, LXR_beta, PPAR_alpha, PPAR_beta, PPAR_gamma, PXR, RAR_beta, RAR_gamma, RXR_beta, SF1, TR_alpha, TR_beta and VDR) to 0.712 for RAR_alpha (Table 1).

Table 1 Recalls (R), specificity (Sp) and MCC values obtained using the SB approach, the LB approach and the combination of SB and LB approaches (SBLB) for each NRLiSt BDB dataset

Ligand-based pharmacophore modeling

Ligands clustering

To perform the LB pharmacophore modeling approach, the ligands of each NRLiSt BDB dataset were clustered using the Pharmacophore RDF-Code similarity. The cluster distance was set to 0.4 for the majority of the datasets but was lowered to 0.3 for 15 datasets (AR_agonist, ERR_alpha_agonist, GR_agonist, LXR_alpha_agonist, LXR_beta_agonist, PR_agonist, RXR_alpha_agonist, RXR_beta_agonist, RXR_beta_antagonist, RXR_gamma_agonist, TR_alpha_agonist, TR_alpha_antagonist, TR_beta_agonist, TR_beta_antagonist, VDR_antagonist) and to 0.2 for 1 dataset (RXR_alpha_antago). The number of clusters generated per dataset ranged from 1 (for the ERR_alpha_agonist, ROR_gamma_antagonist, and RXR_gamma_antagonist datasets) to 65 (for the ER_alpha_agonist dataset), with an average of 18 clusters per dataset and a mean value of 7.8 ligands per cluster.

3D ligand-based pharmacophores

Using the screening protocol described in the “Methods” section, we succeeded in generating pharmacophores that were selective for agonist ligands and pharmacophores selective for antagonist ligands for each of the 27 NRs of the NRLiSt BDB. All these pharmacophores were gathered into two groups according to their selectivity for agonist or antagonist ligands, “LB agonist selective pharmacophores” and “LB antagonist selective pharmacophores”. Redundant pharmacophores were eliminated. The “LB agonist selective pharmacophores” were associated with an average recall of 97 % and a mean MCC value of 0.918. The lowest recall and MCC values were 88 % for TR_alpha and 0.253 for PPAR_gamma, respectively; the highest recall and MCC values reached 100 % and 1, respectively, for CAR, ERR_alpha, PPAR_beta, RAR_beta, ROR_alpha, ROR_gamma, RXR_gamma and SF1. The “LB antagonist selective pharmacophores” presented an average recall of 99 % and a mean MCC value of 0.99, and the individual recall and MCC values were equal to 100 % and 1 for all antagonist datasets but 5 (ER_alpha, ER_beta, GR, PR, RXR_alpha) (Fig. 3; Table 1).

Combination of structure-based and ligand-based pharmacophores

3D SBLB pharmacophores performance

The “SB agonist selective pharmacophores” and “LB agonist selective pharmacophores” on the one hand and the “SB antagonist selective pharmacophores” and “LB antagonist selective pharmacophores” on the other hand were respectively concatenated into two groups: “SBLB agonist selective pharmacophores” and “SBLB antagonist selective pharmacophores”. Using these combinations of SB and LB pharmacophores, average recalls of 99.7 and 99.9 % and mean MCC values of 0.993 and 0.999 were obtained for agonist and antagonist datasets respectively. The “SBLB agonist selective pharmacophores” were able to retrieve all agonist ligands and no antagonist ligands (i.e. recall of 100 % and MCC values of 1) for all NRs but 5 (LXR_alpha, LXR_beta, PR, RAR_alpha and RAR_gamma). Similarly, the “SBLB antagonist selective pharmacophores” were able to retrieve all antagonist ligands and no agonist ligands (i.e. recall of 100 % and MCC values of 1) for all NRs but 3 (ER_alpha, GR, PR) (Fig. 4; Table 1).

Pharmacophores composition

The “SBLB agonist selective pharmacophores” group contained 413 pharmacophores (from 1 pharmacophore for ROR_gamma to 52 pharmacophores for PR) whereas the “SBLB antagonist selective pharmacophores” group contained 305 pharmacophores (from 1 pharmacophore for CAR, ERR_alpha, ROR_gamma and RXR_gamma to 64 pharmacophores for PR) (Fig. 5). The number of pharmacophores that were necessary to cover a given dataset is significantly correlated with the number of ligands in the dataset (Kendall’s tau coefficient, p-value = 9.55e−15, Additional file 1: Figure S1). These pharmacophores were composed of 3–16 features, with a median value of 5 features per pharmacophore (Fig. 6; Additional file 1: Figure S2A−N, Additional file 1: Tables S1−S54). Pharmacophore features were mainly hydrophobic groups and hydrogen bond acceptors (39.3 and 32.5 % of all the pharmacophore features of the 718 SBLB pharmacophores, respectively), but aromatic rings and hydrogen bond donors also represented an important part of the pharmacophore features (14.0 and 9.2 %, respectively), far ahead of negative and positive ionisable areas (Fig. 7). These proportions were similar when agonist and antagonist datasets were considered separately (Fig. 7). However, when comparing the SBLB agonist and antagonist pharmacophores for each NR (Fig. 8), some significant differences (p-value < 0.05) appeared in the distribution of the pharmacophore features (Additional file 1: Figure S3A−C). Thus, for 9, 5, 4, 2 and 1 NRs, respectively, the SBLB agonist selective pharmacophores included significantly fewer hydrogen bond acceptor (HBA), hydrophobic, aromatic ring (AR), positive ionisable (PI) and negative ionisable (NI) features than the corresponding SBLB antagonist selective pharmacophores. Similarly, for 3 NRs, the SBLB antagonist selective pharmacophores included significantly fewer hydrogen bond donor (HBD) features than the SBLB agonist selective pharmacophores. 
Each pharmacophore retrieved from 1 to 1299 ligands, with an average value of 32 ligands retrieved per pharmacophore (Additional file 1: Figure S2A−N, Additional file 1: Tables S1−S54).

Pharmacophores selectivity

To evaluate the selectivity of the pharmacophores for their dedicated NR ligands, each “SBLB agonist selective pharmacophores” and “SBLB antagonist selective pharmacophores” combination was screened against all the other NRLiSt BDB ligand datasets. The corresponding recalls are displayed in Fig. 9. The average recall of this large-scale cross-screening was 19.8 %. The “SBLB agonist selective pharmacophores” were associated with higher recalls, with an average value of 28.8 % versus 10.8 % for the “SBLB antagonist selective pharmacophores”. The most selective combination of pharmacophores was the PPAR_beta “SBLB antagonist selective pharmacophores” with an average recall of 0.001 %, and the least selective combination was the PPAR_gamma “SBLB agonist selective pharmacophores” with an average recall of 76 %. For 29 combinations of pharmacophores, the average recall was below 10 %. For only 8 combinations of pharmacophores, the average recall was above 50 %. This selectivity was significantly correlated with the number of ligands in the dataset that was used to generate the pharmacophores (Kendall’s tau coefficient, p-value = 3.476e−8, Additional file 1: Figure S4) and with the number of pharmacophores included in the combination for the considered dataset (Kendall’s tau coefficient, p-value = 5.915e−5, Additional file 1: Figure S5). The selectivity could also be correlated with the active ligands over decoys ratio (Kendall’s tau coefficient, p-value = 4.461e−11, Additional file 1: Figure S6).
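The cross-screening summary can be sketched in Python as follows. This assumes that the average recall of a combination is the unweighted mean of its recalls over all the other datasets, which the text does not state explicitly. The function names and the counts are invented for illustration.

```python
from typing import Dict


def cross_screening_recalls(hits: Dict[str, int], dataset_sizes: Dict[str, int],
                            own_dataset: str) -> Dict[str, float]:
    """Recall of one pharmacophore combination on every dataset except its own."""
    return {name: hits.get(name, 0) / size
            for name, size in dataset_sizes.items() if name != own_dataset}


def average_recall(recalls: Dict[str, float]) -> float:
    """Unweighted mean over datasets (assumed averaging scheme)."""
    return sum(recalls.values()) / len(recalls)


# Invented counts: ligands per dataset, and hits of a hypothetical AR_agonist combination.
dataset_sizes = {"AR_agonist": 100, "ER_alpha_agonist": 200, "PR_antagonist": 50}
hits = {"ER_alpha_agonist": 20, "PR_antagonist": 5}
recalls = cross_screening_recalls(hits, dataset_sizes, own_dataset="AR_agonist")
mean_recall = average_recall(recalls)
```

A low average recall on foreign datasets indicates a selective combination. A pooled recall (total hits over total ligands) would weight large datasets more heavily and could give a different ranking.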