ECConf: An ultrafast diffusion model for molecular conformation generation with equivariant consistency
Journal of Cheminformatics volume 16, Article number: 107 (2024)
Abstract
Despite recent advances in 3D molecular conformation generation driven by diffusion models, the high computational cost of the iterative diffusion/denoising process limits their application. Here, an equivariant consistency model (ECConf) is proposed as a fast diffusion method for low-energy conformation generation. In ECConf, a modified SE(3)-equivariant transformer model is used to directly encode the Cartesian molecular conformations, and a highly efficient consistency diffusion process is carried out to generate molecular conformations. We demonstrate that, with only one sampling step, ECConf already achieves quality comparable to other diffusion-based models running thousands of denoising steps, and its performance can be further improved with a few more sampling iterations. The performance of ECConf is evaluated on both the GEOM-QM9 and GEOM-Drugs sets. Our results demonstrate that the efficiency of ECConf in learning the distribution of low-energy molecular conformations is at least two orders of magnitude higher than that of current SOTA diffusion models, and that ECConf could become a useful tool for conformation generation and sampling.
Scientific Contributions
In this work, we proposed an equivariant consistency model that significantly improves the efficiency of conformation generation in diffusion-based models while maintaining high structural quality. This method serves as a general framework and can be extended to more complex structure generation and prediction tasks, including those involving proteins, in future work.
Key points

A novel ultrafast equivariant diffusion model, ECConf, was proposed for low-energy conformation generation by constructing a consistency process.

Compared with other SOTA diffusion models running thousands of denoising steps, ECConf achieves comparable quality with only one sampling step and keeps improving with a few more sampling iterations.

The efficiency of ECConf is at least two orders of magnitude higher than that of current SOTA diffusion models.

ECConf is general and can be easily extended to various conformation generation tasks such as protein–ligand docking pose prediction.
Introduction
Three-dimensional conformations of a molecule can largely influence its biological and physical properties, and biologically active conformations are usually low-energy conformers [1]. Thus, many drug design strategies, including structure-based [2] or ligand-based virtual screening [3,4,5], three-dimensional quantitative structure–activity relationships (QSARs) [6], and pharmacophore modeling [7], require a fast 3D conformation generation and elaboration protocol for sampling biologically relevant conformations. For the generation task, programs such as CONCORD [8], CORINA [9], and OMEGA [10] are the most popular applications. These approaches make use of known optimal geometries of molecular fragments, which are used as templates for constructing reasonable, low-energy 3D models of small molecules. These rule-based methods usually employ fallback strategies for structure generation when novel structures appear. However, because producing a single 3D structure of a flexible molecule is almost invariably followed by conformational elaboration, it is the latter process that is both the time and quality bottleneck.
With the advancement of deep learning technologies, deep learning methods have been used to learn the distribution of 3D bioactive conformations and generate 3D conformers directly. For a given molecule, one of the most valuable applications of a conformation generative model is to generate conformation ensembles satisfying the Boltzmann distribution, which is critical for fast estimation of free energies. The biggest difference between the above-mentioned traditional 3D conformation generators and deep learning-based conformation generative models is that generative models do not rely on explicit rules to construct 3D conformations but implicitly learn the distribution of conformation data. Mansimov and coworkers reported an early attempt, CVGAE, to generate 3D conformations in Cartesian coordinates in one shot using the variational autoencoder (VAE) architecture [11]. However, its performance is not comparable with traditional rule-based methods. Simm et al. proposed a conformation generative model based on distance geometry instead of directly modeling the distribution over Cartesian coordinates, named GraphDG [12], while ConfVAE takes the distribution of distances as intermediate variables to generate conformations [13]. Ganea et al. proposed GeoMol [14], which first constructs the local structure (LS) by predicting the coordinates of non-terminal atoms, and then refines the LS with the predicted distances and dihedral angles. The quality of conformations generated by these one-shot methods has reached the level of traditional methods on fragments, while there is still room for improvement on drug-like molecules.
Another line of work constructs conformations via a set of sequential samplings instead of one-shot generation. Xu et al. proposed the CGCF method, combining the advantages of normalizing flows and energy-based approaches to improve the prediction of multimodal conformation distributions [15]. Xu et al. then proposed another score-based method, ConfGF, which learns the pseudo-force on each atom via force matching and obtains new conformations via Langevin Markov chain Monte Carlo (MCMC) sampling on the distance geometry [16]. Its performance on the GEOM-Drugs dataset is comparable to that of a rule-based method called experimental-torsion-knowledge distance geometry (ETKDG), a conformation generation method implemented in RDKit [17, 18]. Xia and Liu et al. proposed DMCG, trained with a dedicated loss function [19]. GeoDiff [20] and SDEGen [21] employed diffusion models for conformation generation and can directly predict the coordinates without using intermediate distances. Similarly, models based on torsional diffusion were also proposed to generate conformations in torsion space instead of Cartesian coordinates [22]. Although these diffusion methods based on DDPM [23], SGMs [24, 25], or stochastic differential equations (Score SDEs) [26] improve the quality of generated conformations of drug-like molecules, the slow sampling speed, due to the large number of diffusion/denoising iterations, still limits their application.
More recently, Song et al. proposed the consistency model, a new class of diffusion model that can generate high-quality samples by directly mapping noise to data instead of solving the reverse-time SDE [27]. Their results show that consistency models only require a few refinement steps (2–5) for high-quality image generation, which inspired us to incorporate the consistency diffusion process into molecular conformation generation. Here, we propose an equivariant consistency model (ECConf) for fast diffusion of molecular conformations, largely achieving a balance between conformation quality and computational efficiency. ECConf is inspired by the consistency model based on the probability flow ordinary differential equation (ODE) and can smoothly transform the conformation distribution into a noise distribution along a tractable trajectory satisfying SE(3) equivariance. Instead of solving the reverse-time stochastic differential equation (SDE) as in diffusion models, ECConf can directly map either noise vectors from the prior Gaussian distribution or solutions on the same diffusion trajectory to low-energy conformations in Cartesian coordinates. These characteristics are critical for decreasing the number of iterations while keeping a high capacity to approximate the complex distribution of conformations. The performance of ECConf is evaluated on both the GEOM-QM9 and GEOM-Drugs datasets. Our model outperforms non-diffusion models and is comparable with SOTA diffusion models, but with at least two orders of magnitude higher denoising efficiency.
Theory and methods
Problem definition
Molecular conformer generation is defined as a conditional generative problem of generating low-energy conformations \(C\) for a given 2D molecular graph \(G\). Given multiple molecular graphs \(\mathcal{G}\), we aim to learn a transferable generative model \({p}_{\theta }\left(C\mid G\right)\) mapping the Boltzmann distribution of \(C\) conditioned on \(G\in \mathcal{G}\) to a prior Gaussian distribution.
Diffusion process for conformation generation
As mentioned above, high-quality conformation generation is still challenging for drug-like molecules and large biomolecules. Among commonly used generative models, GANs suffer from unstable training and limited diversity in generation due to their adversarial training nature [28]. VAEs rely on a surrogate loss, which limits their performance in high-quality generation [29]. Similarly, flow-based models have to employ specialized architectures to construct reversible transforms [30]. Compared with GAN, VAE, and flow-based models, diffusion-based methods are generally more stable and robust in training, can capture the underlying data distribution more comprehensively, and produce high-quality, high-fidelity conformations, but they suffer from low sampling efficiency in conformation generation [31].
In a DDPM-based diffusion process such as GeoDiff, noise from fixed posterior distributions \(q({C}^{t}\mid {C}^{t-1})\) is gradually added over \(T\) steps until the ground truth conformation \({C}^{0}\) is completely destroyed. During the generation process, an initial state \({C}^{T}\) is sampled from a standard Gaussian distribution, and the conformation is progressively refined via the learned Markov kernels \({p}_{\theta }({C}^{t-1}\mid G,{C}^{t})\) for a given molecular graph \(G\). Song et al. have demonstrated that this diffusion process can be described as a discretization over time and noise in the form of the SDE defined in Eq. (2) [32].
This process can be reversed by solving the following reverse-time SDE:
where \(f(C,t)\) and \(g\left(t\right)\) are the drift and diffusion coefficients of the SDE, \(w\) and \(\overline{w}\) are the standard Brownian motions when time flows forward and backward, respectively, and \({\nabla }_{C}log{q}_{t}\left(C\right)\) is the score, i.e., the gradient of the log probability density. The rough propagation path between the target distribution and the prior Gaussian distribution largely limits the sampling efficiency.
The denoising process can also be described in the form of an ordinary differential equation (ODE) whose probability flow shares the same marginals as the reverse-time SDE, i.e., the PF ODE method:
This allows a smoother propagation between distributions, accelerating the sampling phase. However, the ODE trajectory must still be traversed step by step, which limits the efficiency.
Karras et al. further simplified Eq. (3) by setting \(f\left(C,t\right)=0\) and \(g\left(t\right)=\sqrt{2t}\). In this case, \({p}_{t}\left(C\right)={p}_{data}\left(C\right)\otimes \mathcal{N}\left(0,{t}^{2}I\right)\), and an empirical PF ODE is obtained as shown in Eq. (4) [33]:
which allows us to sample \({\widehat{C}}_{T}\) from \(\pi =\mathcal{N}\left(0,{T}^{2}I\right)\) to initialize the empirical PF ODE and solve it backwards in time with any numerical ODE solver, such as the Euler or Heun solver. This yields a solution trajectory \({\left\{{C}_{t}\right\}}_{t\in [0,T]}\), and \({\widehat{C}}_{0}\) is an approximate sample from the conformation distribution.
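The backward solve of the empirical PF ODE can be illustrated with a small self-contained sketch (NumPy). The Gaussian toy score, the Karras-style time discretization, the step count, and the \(\epsilon\) value here are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def karras_times(T=80.0, eps=0.002, n=100, rho=7.0):
    """Karras-style time discretization: dense near eps, sparse near T."""
    i = np.arange(n + 1) / n
    return (T**(1 / rho) + i * (eps**(1 / rho) - T**(1 / rho)))**rho

def pf_ode_heun(C_T, score_fn, ts):
    """Solve the empirical PF ODE dC/dt = -t * score(C, t) backwards in time
    with Heun's second-order (predictor-corrector) method."""
    C = np.asarray(C_T, dtype=float).copy()
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        h = t_next - t_cur
        d_cur = -t_cur * score_fn(C, t_cur)
        C_euler = C + h * d_cur                        # Euler predictor
        d_next = -t_next * score_fn(C_euler, t_next)   # corrector slope
        C = C + 0.5 * h * (d_cur + d_next)
    return C

# Toy case with a closed-form score: if p_data = N(0, s^2 I) then
# p_t = p_data convolved with N(0, t^2 I) = N(0, (s^2 + t^2) I).
s, T, eps = 0.5, 80.0, 0.002
score = lambda C, t: -C / (s**2 + t**2)
rng = np.random.default_rng(0)
C_T = rng.normal(0.0, T, size=(5, 3))                  # prior sample ~ N(0, T^2 I)
C_0 = pf_ode_heun(C_T, score, karras_times(T, eps))
# Exact endpoint of this Gaussian trajectory, for comparison:
analytic = C_T * np.sqrt((s**2 + eps**2) / (s**2 + T**2))
```

For this analytically solvable case, the Heun solution tracks the exact trajectory endpoint closely, which is the property the consistency model later exploits: every point on one trajectory maps to the same endpoint.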
Based on the ODE trajectory \({\left\{{C}_{t}\right\}}_{t\in [0,T]}\), Song et al. proposed the consistency model, which directly maps points on the same trajectory to the same initial point of the diffusion, greatly improving sampling efficiency [27].
Equivariant consistency models for conformation generation
Inspired by the consistency model, we propose a novel equivariant diffusion model, named ECConf, for ultrafast diffusion of molecular conformations in the Cartesian coordinate system, largely achieving a balance between conformation quality and computational efficiency.
Basic Concepts. Given a solution trajectory \({\left\{{C}_{t}\right\}}_{t\in [\epsilon ,T]}\) of the ODE in Eq. (4), where \(\epsilon \to 0\), we seek an equivariant consistency function \(f:\left({C}_{t},t\mid G\right)\to {C}_{\epsilon }\) which satisfies both the boundary condition and self-consistency, as shown in Eqs. (5) and (6), respectively.
As proved by Song et al., all the solutions in \({\left\{{C}_{t}\right\}}_{t\in [\epsilon ,T]}\) on the ODE trajectory can be directly mapped to the original ground truth \({C}_{0}\) if both of these conditions are satisfied. This allows a smoother transformation from \({p}_{data}(C\mid G)\) to \(\pi \sim \mathcal{N}(0,{T}^{2}I)\), accelerating the diffusion process compared with DDPM and Score-SDE models.
Additionally, the consistency function should be SE(3)-equivariant: molecular coordinates \(C\) may be translated and rotated in 3D space, while scalar properties remain invariant. Formally, a function \(f:\mathcal{X}\to \mathcal{Y}\) is equivariant to a group of transformations \({\mathbb{G}}\) if:
where \({D}_{\mathcal{X}}\left(g\right)\) and \({D}_{\mathcal{Y}}\left(g\right)\) are transformation matrices parameterized by \(g\) in the coordinate systems \(\mathcal{X}\) and \(\mathcal{Y}\), respectively.
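The SE(3) equivariance condition can be checked numerically. The sketch below (NumPy) uses a hypothetical distance-weighted displacement layer, not the actual Equiformer, to illustrate why it holds: translations cancel in the relative vectors \(C_i - C_j\), and rotations commute with their linear combination:

```python
import numpy as np

def equivariant_update(C, phi=lambda d: np.exp(-d)):
    """Toy SE(3)-equivariant layer: each atom moves along distance-weighted
    displacement vectors to the other atoms. Distances are invariant and
    relative vectors rotate with the frame, so the update is equivariant."""
    diff = C[:, None, :] - C[None, :, :]        # (N, N, 3) pairwise vectors
    dist = np.linalg.norm(diff, axis=-1)        # (N, N) pairwise distances
    w = phi(dist)
    np.fill_diagonal(w, 0.0)                    # no self-interaction
    return C + (w[..., None] * diff).sum(axis=1)

def random_rotation(rng):
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:                    # enforce a proper rotation
        Q[:, 0] *= -1
    return Q

rng = np.random.default_rng(1)
C = rng.normal(size=(6, 3))
R, t = random_rotation(rng), rng.normal(size=3)
lhs = equivariant_update(C @ R.T + t)           # transform, then update
rhs = equivariant_update(C) @ R.T + t           # update, then transform
```

If the layer is equivariant, `lhs` and `rhs` agree to machine precision; the same commutation test applies to any candidate \(F_{\theta }\).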
These characteristics not only allow the ECConf model to perform one-step generation from the prior distribution but also improve the quality of generation by chaining the outputs of the consistency model at multiple time steps, as shown in Fig. 1b.
Here, we parameterize the equivariant consistency model with a learnable function \({f}_{\theta }\) using skip connections as shown in Eqs. (8–10):
where \({F}_{\theta }\) is a \((G,t)\)-conditioned noise model satisfying SE(3) equivariance, which ensures the SE(3) equivariance of \({f}_{\theta }\). As \(\epsilon \to 0\), \({c}_{skip}\left(\epsilon \right)\to 1\) and \({c}_{out}\left(\epsilon \right)\to 0\), so that \({f}_{\theta }(C,\epsilon \mid G)={C}_{\epsilon }\) at \(t=\epsilon\), satisfying the boundary condition. Since \({c}_{skip}\), \({c}_{out}\), and \({F}_{\theta }\) are all differentiable, we can train \({f}_{\theta }\) by minimizing the prediction difference between time steps \(t\) and \({t}^{\prime}\) to satisfy self-consistency, as illustrated in Eq. (11).
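A minimal sketch of this skip-connection parameterization (NumPy). The \(c_{skip}\)/\(c_{out}\) forms below are the standard boundary functions from Song et al.'s consistency model and may differ from the paper's exact Eqs. (8–10); `EPS` is an illustrative value (the paper uses \(\epsilon = 10^{-8}\)), and `F` is an arbitrary stand-in for the Equiformer:

```python
import numpy as np

SIGMA_DATA, EPS = 0.5, 0.002   # sigma_data = 0.5 as in the paper; EPS illustrative

def c_skip(t):
    return SIGMA_DATA**2 / ((t - EPS)**2 + SIGMA_DATA**2)

def c_out(t):
    return SIGMA_DATA * (t - EPS) / np.sqrt(SIGMA_DATA**2 + t**2)

def f_theta(C, t, F_theta):
    """Consistency parameterisation: the skip connection enforces the
    boundary condition f_theta(C, eps) = C exactly, for any network F_theta."""
    return c_skip(t) * C + c_out(t) * F_theta(C, t)

# Boundary check with an arbitrary stand-in network:
F = lambda C, t: np.tanh(C) + t            # placeholder, not the Equiformer
C = np.array([[1.0, -2.0, 0.3]])
out_at_eps = f_theta(C, EPS, F)            # c_skip(EPS)=1, c_out(EPS)=0 -> C
```

Because the boundary condition is built into the functional form, training only has to enforce self-consistency between adjacent time steps.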
Model Architecture In principle, ECConf could utilize any type of graph-conditioned equivariant neural network for modelling \({F}_{\theta }\). Here, we employed a customized Equiformer [34] neural network, a transformer model built on irreducible representations and a depth-wise tensor product (DTP) based equivariant attention mechanism. This model incorporates an embedding of the time step factor \({t}_{n}\). The overall architecture of ECConf is shown in Fig. 2a. The time factor \({t}_{n}\) and the atom type \({Z}_{i}\) are each embedded with a linear layer and added together as the input of Equiformer. By repeatedly stacking equivariant graph attention (EGA) modules and feed-forward modules, Equiformer integrates content and geometric information to predict the coordinate changes \(\delta {C}_{{t}_{n}}\) at time step \({t}_{n}\). The coordinates \({C}_{{t}_{n+1}}\) are then computed by combining \({C}_{{t}_{n}}\) and \(\delta {C}_{{t}_{n}}\) using skip connections.
The architecture of the EGA module is illustrated in Fig. 2b. For a pair of neighboring nodes \(\{i,j\}\) in graph \(G\), their embeddings \({x}_{i}\), \({x}_{j}\) are added after a linear transformation and updated through an equivariant DTP, incorporating their distance \({r}_{ij}\). The updated embeddings are reshaped into scalar features \({f}_{ij}^{0}\) and irreps features \({f}_{ij}^{L}\), which are sent to scalar and irreps feature blocks, respectively. Finally, the updated scalar features \({a}_{ij}\) and irreps features \({v}_{ij}\) are multiplied to update \({x}_{i}\). By summing over all neighbors of node \(i\), as in a message passing neural network, both content and geometric information are incorporated into its embedding \({x}_{i}\). A more detailed description of EGA can be found in reference [34].
As the time factor \({t}_{n}\) and the atom type are independent of the coordinates, this modification does not affect the SE(3) equivariance of Equiformer.
Model Training and Conformer Generation For a conformation \(C\) sampled from the dataset, we use \(C+{t}_{n+1} \cdot z\) and \(C+{t}_{n}\cdot z\) to replace \({C}_{{t}_{n+1}}\) and \({C}_{{t}_{n}}\) on the ODE trajectory \({\left\{{C}_{t}\right\}}_{t\in [\epsilon ,T]}\), so \({f}_{\theta }\) can be trained with the following loss function:
To make the training process more stable and improve the final performance of \({f}_{\theta }\), the exponential moving average (EMA) technique is adopted, as in Ref. [27]. Here, we create another function \({f}_{{\theta }^{-}}\) whose parameters \({\theta }^{-}\) are the EMA of the original parameters \(\theta\) during training, and minimize the difference between \({f}_{\theta }\left(C+{t}_{n+1}\cdot z,{t}_{n+1}\mid G\right)\) and \({f}_{{\theta }^{-}}\left(C+{t}_{n}\cdot z,{t}_{n}\mid G\right)\), following Song et al. [27]. For clarity, we refer to \({f}_{\theta }\) as the “online network” and \({f}_{{\theta }^{-}}\) as the “target network”. The loss function is reformulated as follows:
The parameters \(\theta\) are updated with stochastic gradient descent, while \({\theta }^{-}\) is updated with the exponential moving average shown in Eq. (14), where \(\mu\) is the decay rate predefined by the EMA schedule.
In this way, we can perform equivariant consistency training to obtain an approximation \({f}_{\theta }\) of \(f\), as shown in Fig. 3a and Algorithm S1.
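The training step of Algorithm S1 can be sketched with a toy linear stand-in for the Equiformer (NumPy). The linear "network", the \(c_{skip}\)/\(c_{out}\) forms, and the learning/decay rates are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def c_skip(t, s=0.5, eps=0.002):
    return s**2 / ((t - eps)**2 + s**2)

def c_out(t, s=0.5, eps=0.002):
    return s * (t - eps) / np.sqrt(s**2 + t**2)

def consistency_train_step(W, W_ema, C, z, t_n, t_np1, lr=5e-3, mu=0.99):
    """One consistency-training step for a toy linear 'network' F(C, t) = C @ W.
    The online network sees the noisier point at t_{n+1}; the EMA target
    network sees t_n; both should map to the same clean conformation."""
    x1, x0 = C + t_np1 * z, C + t_n * z            # adjacent points, shared noise z
    f_online = c_skip(t_np1) * x1 + c_out(t_np1) * (x1 @ W)
    f_target = c_skip(t_n) * x0 + c_out(t_n) * (x0 @ W_ema)  # treated as constant
    resid = f_online - f_target
    grad_W = 2.0 * c_out(t_np1) * x1.T @ resid / resid.size  # analytic dL/dW
    W = W - lr * grad_W                            # SGD update of theta
    W_ema = mu * W_ema + (1.0 - mu) * W            # EMA update of theta^- (Eq. 14)
    return W, W_ema, float((resid**2).mean())
```

Note that with \(t_n = t_{n+1}\) and \(\theta = \theta^{-}\) the two branches coincide and the loss is exactly zero, mirroring the self-consistency condition the training enforces.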
In the conformer generation phase, samples are drawn from the initial distribution \({\widehat{C}}_{T}\sim \mathcal{N}\left(0,{T}^{2}I\right)\) and the consistency model is used to generate conformers: \({\widehat{C}}_{\epsilon }={f}_{\theta }({\widehat{C}}_{T},T)\). The conformers are then refined with a greedy algorithm that alternates denoising and noise injection steps over a set of time points \(\{{\tau }_{1},{\tau }_{2},\cdots ,{\tau }_{N-1}\}\), as shown in Fig. 3b and Algorithm S2.
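This alternation of noise injection and consistency denoising can be sketched as follows (NumPy). The exact Gaussian consistency function, the \(\epsilon\) value, and the time points are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def multistep_sample(f, taus, shape, T=80.0, eps=0.002, rng=None):
    """Greedy multistep consistency sampling (Algorithm S2 style):
    one-shot generation from the prior, then alternate noise injection
    at decreasing time points tau with consistency denoising."""
    rng = rng or np.random.default_rng()
    x = f(rng.normal(0.0, T, size=shape), T)        # one-shot generation
    for tau in taus:                                # optional refinement steps
        x_hat = x + np.sqrt(tau**2 - eps**2) * rng.normal(size=shape)
        x = f(x_hat, tau)
    return x

# Toy check: for Gaussian 'data' N(0, s^2) the exact consistency function
# is a pure rescaling along the PF ODE trajectory.
s, eps = 0.5, 0.002
f_exact = lambda x, t: x * np.sqrt((s**2 + eps**2) / (s**2 + t**2))
x = multistep_sample(f_exact, taus=[10.0, 2.0, 0.5], shape=(20000,),
                     rng=np.random.default_rng(2))
```

In this toy case each refinement re-noises to variance \(s^2 + \tau^2\) and maps back to variance \(\approx s^2\), so the sample standard deviation stays near the data standard deviation \(s\) regardless of the number of iterations.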
Computational details
Following previous work, we used the GEOM-QM9 [35] and GEOM-Drugs [36] datasets for evaluation. For a fair benchmark, we used the same training, validation, and test sets produced by Shi et al. for both datasets [13]. Specifically, GEOM-QM9 is split into training, validation, and test sets of 39,860, 4979, and 200 unique molecules, corresponding to 199,300, 24,895, and 797 conformations; GEOM-Drugs contains 39,852, 4983, and 200 molecules in the training, validation, and test sets, corresponding to 199,260, 24,915, and 15,864 conformations, respectively. We examined model performance on the same test set used by the GeoDiff model [20]. In the current study, the hyperparameters \({\sigma }_{data}=0.5\), \(\epsilon ={10}^{-8}\), and \(T=80\) were used. For a given molecule with K ground truth conformations in the test set, 2K conformations are sampled for evaluation.
In our benchmark, 8 recent or established SOTA models were compared, including GraphDG [12], ConfVAE [13], GeoMol [14], CGCF [15], ConfGF [16], GeoDiff [20], SDEGen [21], and ETKDG [18] in RDKit. The results of GraphDG, CGCF, ConfVAE, and ConfGF were taken from a previous study [20], while the performance of ETKDG, GeoMol, SDEGen, and GeoDiff was evaluated by ourselves on the same test set using the provided models with various settings.
Model performance was evaluated by measuring the quality and diversity of conformations generated by different models. Here, we mainly compared four metrics built on the root mean square deviation (RMSD) proposed by Ganea et al., defined as the normalized Frobenius norm of two atomic coordinate matrices after Kabsch alignment [14]. Formally, let \({S}_{g}\) and \({S}_{r}\) denote the set of generated conformations and the set of reference conformations, respectively. Then, the coverage and matching measures following the traditional Recall measure can be defined as follows:
where δ is a predefined threshold. The other two precision-based metrics, COV-P and MAT-P, can be defined similarly, but with the generated and reference sets exchanged, as shown in Eqs. (17, 18).
In practice, \({S}_{g}\) for each molecule is set to twice the size of \({S}_{r}\). Intuitively, the COV score measures the percentage of structures in one set covered by the other set, where covering means that the RMSD between two conformations is within the threshold δ. In contrast, the MAT score measures the average RMSD between each conformation in one set and its closest neighbor in the other set. In general, a higher COV score or a lower MAT score indicates more realistic generated conformations. Moreover, the precision metrics measure the proportion of true low-energy conformations among all conformations generated by the model; high precision means that most generated conformations come from the target low-energy conformation distribution within the error threshold \(\delta\). The recall metrics measure the proportion of the true low-energy conformations in the dataset that are recovered by the model; high recall means the model can generate a wide variety of low-energy conformations covering the entire target distribution. Briefly, precision reflects quality, while recall reflects diversity. Following previous works, \(\delta\) is set to 0.5 Å and 1.25 Å for the GEOM-QM9 and GEOM-Drugs datasets, respectively.
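Both the Kabsch-aligned RMSD and the four COV/MAT scores can be sketched in a few lines (NumPy). This is a generic implementation with an invented example RMSD matrix, not the authors' exact code or normalization:

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD after optimal superposition: centre both coordinate sets,
    find the optimal rotation via SVD (Kabsch), then compute the RMSD."""
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    U, _, Vt = np.linalg.svd(P.T @ Q)
    d = np.sign(np.linalg.det(U @ Vt))        # guard against reflections
    R = U @ np.diag([1.0, 1.0, d]) @ Vt
    return float(np.sqrt(((P @ R - Q) ** 2).sum() / len(P)))

def cov_mat(rmsd, delta):
    """Coverage/matching scores from a pairwise RMSD matrix with shape
    (n_generated, n_reference); '-R' entries are recall, '-P' precision."""
    best_gen_per_ref = rmsd.min(axis=0)       # closest generated conformer
    best_ref_per_gen = rmsd.min(axis=1)       # closest reference conformer
    return {
        "COV-R": float((best_gen_per_ref <= delta).mean()),
        "MAT-R": float(best_gen_per_ref.mean()),
        "COV-P": float((best_ref_per_gen <= delta).mean()),
        "MAT-P": float(best_ref_per_gen.mean()),
    }

# A rigid-body copy aligns to RMSD ~ 0:
rng = np.random.default_rng(4)
P = rng.normal(size=(12, 3))
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(R) < 0:
    R[:, 0] *= -1
Q = P @ R.T + np.array([1.0, -2.0, 3.0])      # rotated + translated copy of P
aligned = kabsch_rmsd(P, Q)

# Invented 4x2 RMSD matrix (4 generated, 2 reference conformers):
scores = cov_mat(np.array([[0.2, 1.4],
                           [0.9, 0.6],
                           [2.0, 1.1],
                           [0.4, 1.8]]), delta=1.25)
```

In the toy matrix, the best generated match per reference is (0.2, 0.6) Å, giving MAT-R = 0.4 Å with full coverage at δ = 1.25 Å.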
Results and discussions
The performance of ECConf with different diffusion steps
The first question is how the number of diffusion steps influences model performance. Models were trained with different diffusion time steps and evaluated on a random test set. In ECConf, the user can set the maximal number of diffusion steps for both the forward diffusion process and the reverse iteration process, corresponding to the training and generation phases. We first evaluated the performance of ECConf trained with maximal time steps of 2, 5, 10, 15, 20, 25, 50, and 150 in the forward ODE process, with the number of iteration steps during generation the same as in the training phase; the results are shown in Table 1. Both the COV-R and MAT-R metrics reached their best level when the number of diffusion steps was set to 25. The performance on COV-R and MAT-R improved as the number of diffusion steps increased from 2 to 25, but worsened at 50 and 150. There are two possible reasons for the decrease in performance with more diffusion steps. First, ECConf is trained by minimizing the error between adjacent steps during the forward diffusion process; during sampling, the error between adjacent iteration steps accumulates as the number of sampling iterations increases, causing the final results to deviate from the ground truth structure. Second, ECConf learns the evolution relationship between diffusion time and structure, instead of only learning the relation between a structure and its noise as in DDPM; thus, an excessive number of diffusion steps increases the difficulty of learning, decreasing the overall effectiveness.
For COV-P and MAT-P, the performance of ECConf with 5 diffusion steps already reached 0.857 and 0.356 Å, indicating that ECConf has learned the main features of the dominant low-energy conformations, although its distribution learning ability still needs improvement. COV-P and MAT-P peaked when the number of time steps was set to 15, and worsened for numbers of iterations larger than 15. COV-R increased by 6% as the number of diffusion steps increased from 15 to 25, while COV-P only decreased by 3%. Taking both recall and precision into consideration, the model with 25 diffusion steps in the forward ODE trajectory gave the best overall results, and this value was set as the optimal number of diffusion steps in training for the following experiments.
Performance metrics of ECConf under various sampling iterations
Once ECConf is trained, it allows either one-step generation from the prior Gaussian distribution or iterative refinement by chaining the outputs of multiple time steps, as shown in Algorithm S2. Here, we evaluate the performance of ECConf under various numbers of sampling iterations. The rule-based ETKDG method and 7 ML-based baselines were also compared, including the one-shot methods GraphDG, ConfVAE, and GeoMol, and the iterative refinement methods CGCF, ConfGF, SDEGen, and GeoDiff. Additionally, the RDKit structure fast correction (FC) option can be used to correct abnormal bond lengths and angles in ML-based methods using the bond lengths and angles of the MMFF force field [37, 38]. Here, we focus on the performance of three corrected diffusion models, i.e., SDEGen-FC, GeoDiff-FC, and ECConf-FC.
For the QM9 dataset, the recall measures COV-R and MAT-R, representing the diversity of generated conformations, are shown in Table 2 and Fig. 4a, b. Clearly, the diffusion-based models outperformed most one-shot models except GeoMol, suggesting a superior capability in reproducing ground truth conformations. Interestingly, one-shot generation with ECConf is already comparable with the one-shot models, and the result with two iterations is almost the same as that of the SDEGen model running 1500 iterations, i.e., the sampling efficiency is improved almost 750-fold. The diversity performance of ECConf gradually converged beyond 5 iterations. Although its recall performance is somewhat lower than that of the GeoDiff model, the roughly 1000-fold improvement in sampling efficiency suggests that ECConf may be a better choice when dealing with large numbers of molecules. The precision-based metrics COV-P and MAT-P are also shown in Table 2, where ECConf outperformed all the ML baselines under all numbers of sampling iterations, indicating superior quality of conformation generation. The optimal performance of ECConf was obtained at around 25 iteration steps on the GEOM-QM9 dataset.
We also evaluated ECConf on drug-like molecules with a maximum of 50 heavy atoms in the GEOM-Drugs set, which is more challenging for one-shot baselines. The recall-based metrics COV-R and MAT-R are shown in Table 3 and Fig. 4c, d. Both GeoDiff and ECConf outperform the best one-shot generative model, GeoMol. Again, even one-shot generation with ECConf greatly outperformed the one-shot baselines and some of the iterative methods such as CGCF, ConfGF, and SDEGen. ECConf with 5 iterations already performed better than GeoDiff with 1000 iterations, representing a 200-fold efficiency gain. In terms of the precision-based metrics on the GEOM-Drugs set, although GeoMol generally outperformed the rule-based RDKit ETKDG method, the performance of ECConf is quite close. Among all deep learning-based models, ECConf with a single iteration outperformed the other models and achieved its best results with 15 sampling iterations. To intuitively demonstrate the conformation generation efficiency of ECConf, the conformation evolution in the sampling phase is shown in Fig. 5. For ECConf, a single iteration already roughly shapes the structure, and the quality of the structure continuously improves over the next 25 steps. In contrast, for SDEGen and GeoDiff, the molecular structure only starts to take shape after 3500 iterations. These results further demonstrate the efficiency of ECConf in conformation generation.
Because the FC option slightly adjusts the ML-generated structures, it decreases model performance on both the GEOM-QM9 and GEOM-Drugs test sets, but it can improve the force-field-calculated molecular energies, as discussed later. Among the three corrected diffusion models, ECConf-FC generally outperforms the other methods, with only a slightly worse median COV-R, as shown in Table S3.
Some sample conformations generated by selected models are shown in Fig. 6 to provide a qualitative comparison, where ECConf is shown to nicely capture both local and global structures in 3D space. Additionally, SDEGen fails to generate reasonable structures for some molecules.
The quality of ECConf-generated structures on GEOM-Drugs
As discussed above, machine learning methods exhibit advantages in generating diverse conformations and in the precision of reproducing ground truth conformations. Here, we evaluated the quality of ECConf-generated conformations by examining their deviation from the MMFF94 force field optimized conformations and from the ground truth structures with the lowest MMFF94 energy, considering both structural deviation and conformational energy. The generated structures were compared with those of other methods, including ETKDG, SDEGen with 6000 iterations, and GeoDiff with 5000 denoising iterations. Additionally, the deviations of the FC-corrected conformations from the corresponding optimized structures and ground truths were also compared.
The average and minimum root mean square deviations (RMSD) of the generated structures from the optimized structures on the GEOM-Drugs dataset are depicted in Fig. 7a, b, respectively. ETKDG structures have the smallest average deviation from their optimized structures (with a median value of 0.905 Å), and the order of increasing average deviation is: ETKDG < ECConf < GeoDiff < ECConf-FC < GeoDiff-FC < SDEGen-FC < SDEGen. The structural deviation from the ground truth structures follows the same trend, as shown in Fig. 7c, d.
Conformational energy can also serve as an indicator of structure quality, as structures with abnormal bond lengths or ring configurations can significantly increase the internal energy. For methods not using the FC option, GeoDiff has the smallest energy difference between generated structures and their optimized counterparts (with a median value of 0.68 kcal/mol/atom), and the order is: GeoDiff < ETKDG < ECConf < SDEGen, as depicted in Fig. 4e. We note that SDEGen generates many unreasonable structures that are difficult to optimize, so the energies of its optimized structures are even higher than those of the FC-corrected structures; thus, the analysis of the energy difference between SDEGen-FC and its optimized structures is not meaningful. The energy difference from the ground truth follows the same trend. These findings highlight that the GeoDiff model generates the most energetically favorable conformations. However, when the FC option was used, the energy differences between the generated and optimized conformations for the ECConf-FC and GeoDiff-FC models decreased dramatically, to 0.25 and 0.29 kcal/mol/atom, respectively. This suggests that users can combine ECConf with the structure correction functionality in RDKit to directly generate reasonable conformations without further optimization, saving considerable processing time given the low number of diffusion steps in ECConf. The results for the GEOM-QM9 set are similar to those for GEOM-Drugs, as shown in Figure S1. All these results demonstrate that our ECConf model achieves a good balance between conformation quality and sampling efficiency.
Conclusions
Here, an equivariant consistency generative model (ECConf) was proposed as an ultrafast diffusion method requiring only a few iterations for low-energy conformation generation. A time factor-controlled SE(3)-equivariant transformer was used to encode the Cartesian molecular conformations, and a highly efficient consistency diffusion process was carried out to generate molecular conformations, largely achieving a balance between conformation quality and computational efficiency. Our results demonstrate that ECConf can learn the distribution of low-energy molecular conformations with at least two orders of magnitude higher efficiency than conventional diffusion models and could become a useful tool for conformation generation and sampling.
Data availability
The source code is available from https://github.com/DeepLearningPS/EcConf.
References
Perola E, Charifson PS (2004) Conformational analysis of drug-like molecules bound to proteins: an extensive study of ligand reorganization upon binding. J Med Chem 47(10):2499–2510
Lyne PD (2002) Structure-based virtual screening: an overview. Drug Discov Today 7(20):1047–1055
Jain AN (2004) Ligand-based structural hypotheses for virtual screening. J Med Chem 47(4):947–961
Bhunia SS, Saxena M, Saxena AK. Ligand- and structure-based virtual screening in drug discovery. In: Biophysical and Computational Tools in Drug Discovery. Springer; 2021. p. 281–339.
Broccatelli F, Brown N (2014) Best of both worlds: on the complementarity of ligand-based and structure-based virtual screening. J Chem Inf Model 54(6):1634–1641
Cruciani G, Carosati E, Clementi S. Three-dimensional quantitative structure–property relationships; 2003.
Schwab CH (2010) Conformations and 3D pharmacophore searching. Drug Discov Today Technol 7(4):e245–e253
Hendrickson MA, Nicklaus MC, Milne GW, Zaharevitz D (1993) CONCORD and CAMBRIDGE: comparison of computer generated chemical structures with X-ray crystallographic data. J Chem Inf Comput Sci 33(1):155–163
Gasteiger J, Rudolph C, Sadowski J (1990) Automatic generation of 3D atomic coordinates for organic molecules. Tetrahedron Comput Methodol 3(6):537–547
Hawkins PC, Skillman AG, Warren GL, Ellingson BA, Stahl MT (2010) Conformer generation with OMEGA: algorithm and validation using high quality structures from the Protein Databank and Cambridge Structural Database. J Chem Inf Model 50(4):572–584
Mansimov E, Mahmood O, Kang S, Cho K (2019) Molecular geometry prediction using a deep generative graph neural network. Sci Rep 9(1):20381
Simm GN, Hernández-Lobato JM. A generative model for molecular distance geometry. In: ICML 2020. PMLR; 2020.
Xu M, Wang W, Luo S, Shi C, Bengio Y, Gomez-Bombarelli R, Tang J. An end-to-end framework for molecular conformation generation via bilevel programming. In: ICML 2021. PMLR; 2021. p. 11537–11547.
Ganea O, Pattanaik L, Coley C, Barzilay R, Jensen K, Green W, Jaakkola T. Geomol: Torsional geometric generation of molecular 3d conformer ensembles. In: NIPS 2021. 2021: 13757–13769.
Xu M, Luo S, Bengio Y, Peng J, Tang J. Learning neural generative dynamics for molecular conformation generation. In: ICLR 2021. 2021.
Shi C, Luo S, Xu M, Tang J. Learning gradient fields for molecular conformation generation. In: ICML 2021. PMLR; 2021. p. 9558–9568.
Schärfer C, Schulz-Gasch T, Ehrlich HC, Guba W, Rarey M, Stahl M (2013) Torsion angle preferences in drug-like chemical space: a comprehensive guide. J Med Chem 56(5):2016–2028
Wang S, Witek J, Landrum GA, Riniker S (2020) Improving conformer generation for small rings and macrocycles based on distance geometry and experimental torsional-angle preferences. J Chem Inf Model 60(4):2044–2058
Zhu J, Xia Y, Liu C, Wu L, Xie S, Wang Y, Wang T, Qin T, Zhou W, Li H: Direct molecular conformation generation. In: Transactions on Machine Learning Research. 2022.
Xu M, Yu L, Song Y, Shi C, Ermon S, Tang J. Geodiff: A geometric diffusion model for molecular conformation generation. In: ICLR 2022. 2022.
Zhang H, Li S, Zhang J, Wang Z, Wang J, Jiang D, Bian Z, Zhang Y, Deng Y, Song J et al (2023) SDEGen: learning to evolve molecular conformations from thermodynamic noise for conformation generation. Chem Sci 14(6):1557–1568
Jing B, Corso G, Chang J, Barzilay R, Jaakkola T. Torsional diffusion for molecular conformer generation. In: NIPS 2022. 2022: 24240–24253.
Ho J, Jain A, Abbeel P (2020) Denoising diffusion probabilistic models. NIPS 2020:6840–6851
Song Y, Ermon S. Generative modeling by estimating gradients of the data distribution. In: NIPS 2019. 2019.
Song Y, Ermon S. Improved techniques for training score-based generative models. In: NIPS 2020. 2020: 12438–12448.
Song Y, Sohl-Dickstein J, Kingma DP, Kumar A, Ermon S, Poole B. Score-based generative modeling through stochastic differential equations. In: ICLR 2021. 2021.
Song Y, Dhariwal P, Chen M, Sutskever I. Consistency models. In: ICML 2023. 2023.
Creswell A, White T, Dumoulin V, Arulkumaran K, Sengupta B, Bharath AA (2018) Generative adversarial networks: An overview. IEEE Signal Process Mag 35(1):53–65
Kingma DP, Welling M. Auto-Encoding Variational Bayes. 2013: arXiv:1312.6114.
Jimenez Rezende D, Mohamed S. Variational Inference with Normalizing Flows. In: ICML 2015: May 01, 2015. 2015: arXiv:1505.05770.
Yang L, Zhang Z, Song Y, Hong S, Xu R, Zhao Y, Shao Y, Zhang W, Cui B, Yang MH. Diffusion models: A comprehensive survey of methods and applications. arXiv preprint arXiv:2209.00796; 2022.
Karras T, Aittala M, Aila T, Laine S. Elucidating the design space of diffusion-based generative models. In: NIPS 2022. 2022: 26565–26577.
Liao YL, Smidt T. Equiformer: Equivariant graph attention transformer for 3D atomistic graphs. In: ICLR 2023. 2023.
Ramakrishnan R, Dral PO, Rupp M, Von Lilienfeld OA (2014) Quantum chemistry structures and properties of 134 kilo molecules. Sci Data 1(1):1–7
Axelrod S, Gomez-Bombarelli R (2022) GEOM, energy-annotated molecular conformations for property prediction and molecular generation. Sci Data 9(1):185
Zhang X, Zhang O, Shen C, Qu W, Chen S, Cao H, Kang Y, Wang Z, Wang E, Zhang J et al (2023) Efficient and accurate large library ligand docking with KarmaDock. Nat Comput Sci 3(9):789–804
Stärk H, Ganea O, Pattanaik L, Barzilay R, Jaakkola T. EquiBind: Geometric deep learning for drug binding structure prediction. In: ICML 2022. PMLR; 2022. p. 20503–20521.
Funding
H.C. gratefully acknowledges financial support from the Pearl River Recruitment Program of Talents (No. 2021CX020227). This study was supported by the R&D Program of Guangzhou National Laboratory (YWYWYM0205, GZNL2023A01008, GZNL2023A01005) and the Overseas Experts Supporting Programs under the National Research Platform (WGZJ22001).
Author information
Authors and Affiliations
Contributions
Zhiguang Fan completed the main experiments in the article, while Mingyuan Xu designed and implemented the algorithms and code presented in the paper, and drafted the initial version of the manuscript. Hongming Chen led the overall project, providing guidance throughout, with Yuedong Yang offering input on the manuscript. All authors reviewed the manuscript.
Corresponding authors
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
13321_2024_893_MOESM1_ESM.docx
Supplementary file 1 (DOCX 530 KB) 1. The algorithms for equivariant consistency training and sampling are shown in Algorithms S1 and S2. 2. The medians of the benchmark results and 7 conformation examples on GEOM-QM9 are shown in Table S1 and Figure S1. The median of the benchmark results on GEOM-Drugs is shown in Table S2. 3. The quality of generations on the GEOM-QM9 dataset is shown in Figure S2.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Fan, Z., Yang, Y., Xu, M. et al. ECConf: A ultrafast diffusion model for molecular conformation generation with equivariant consistency. J Cheminform 16, 107 (2024). https://doi.org/10.1186/s13321-024-00893-2
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s13321-024-00893-2