 Research article
 Open Access
ADMET evaluation in drug discovery: 15. Accurate prediction of rat oral acute toxicity using relevance vector machine and consensus modeling
 Tailong Lei^{1},
 Youyong Li^{3},
 Yunlong Song^{4},
 Dan Li^{1},
 Huiyong Sun^{1} and
 Tingjun Hou^{1, 2}
 Received: 5 November 2015
 Accepted: 20 January 2016
 Published: 1 February 2016
Abstract
Background
Determination of acute toxicity, expressed as median lethal dose (LD_{50}), is one of the most important steps in the drug discovery pipeline. Because in vivo assays for oral acute toxicity in mammals are time-consuming and costly, there is an urgent need to develop in silico prediction models of oral acute toxicity.
Results
In this study, based on a comprehensive data set containing 7314 diverse chemicals with rat oral LD_{50} values, the relevance vector machine (RVM) technique was employed to build regression models for the prediction of oral acute toxicity in rats. These models were compared with those built using six other machine learning approaches: k-nearest-neighbor regression, random forest (RF), support vector machine, local approximate Gaussian process, multilayer perceptron ensemble, and eXtreme gradient boosting. A subset of the original molecular descriptors and structural fingerprints (PubChem or SubFP) was chosen by Chi squared statistics. The prediction capabilities of the individual QSAR models, measured by q_{ext}^{2} for the test set containing 2376 molecules, ranged from 0.572 to 0.659.
Conclusion
Keywords
 Support Vector Machine
 Random Forest
 Consensus Model
 Random Forest Model
 Relevance Vector Machine
Background
Determination of acute toxicity in mammals (e.g. rats or mice) is one of the most important tasks for the safety evaluation of drug candidates. Acute toxicity is usually expressed as median lethal dose (LD_{50}), which is the dose of a tested molecule that kills 50 % of the treated animals within a given period. According to the regulations and guidelines for the toxicity testing of pharmaceutical substances established by the Organization for Economic Cooperation and Development (OECD), the U.S. Food and Drug Administration (FDA), the National Institutes of Health (NIH), the European Agency for the Evaluation of Medicinal Products (EMEA), etc., the use of alternative in vitro or in silico toxicity assessment methods that avoid the use of animals is strongly recommended [1–4]. Moreover, in vivo testing for acute toxicity is time-consuming and costly, and therefore extensive efforts have been devoted to the development of in silico methods for toxicity prediction.
Over the past decades, a number of quantitative structure–activity relationship (QSAR) models have been developed to predict rodent acute toxicity [5–7]. It is well known that acute toxic effects result from multiple potential modes of action (MOA), and it is quite difficult to develop a universal model with reliable prediction accuracy for an extensive data set. Therefore, most QSAR models were built from small data sets of congeneric compounds [8–10] and thus had limited application domains. Recently, several theoretical models were developed based on relatively large-scale data sets with diverse compounds [9–12]. For example, Zhu et al. [10] developed five QSAR models for 7385 compounds with rat oral acute toxicity data, and the two models developed by kNN and RF achieved performance on the test set (r^{2} = 0.66 and 0.70, respectively) comparable to that of TOPKAT. However, in Zhu’s study, 997 molecules were identified as outliers and eliminated from the training set. Another study, reported by Raevsky and coworkers [13], proposed the so-called Arithmetic Mean Toxicity (AMT) modelling approach, which produced local models based on a k-nearest-neighbors approach. This approach gave correlation coefficients (r^{2}) from 0.456 to 0.783 for 10,241 tested compounds, but the prediction accuracy for a molecule depended on the number and structural similarity of its neighbors with experimental data in the training set [13]. Recently, Lu et al. [14] employed the local lazy learning (LLL) method to develop LD_{50} prediction models, in which the rat acute toxicity of a molecule is predicted from the experimental data of its k nearest neighbors. A consensus model integrating the predictions of individual LLL models yielded a correlation coefficient r^{2} of 0.712 for the test set containing 2896 compounds.
Similar to Raevsky’s approach [13], Lu’s approach relied on prior knowledge of the experimental data of a query’s neighbors, and therefore the actual prediction capability of this method depended on the chemical diversity and structural coverage of the training set [15].
Owing to the complicated mechanisms involved in acute toxicity, it is difficult to build a single QSAR model with reliable prediction accuracy by using traditional statistical approaches, such as multiple linear regression (MLR), partial least squares (PLS), and principal components regression (PCR). Machine learning methods, however, have shown promising potential for establishing complex QSARs on data sets that cover diverse molecular structures and mechanisms. Each machine learning method has its own intrinsic advantages, shortcomings, and practical constraints, and its performance depends on the structural diversity and representativeness of the molecules in the data set. Therefore, it is important to choose the most suitable machine learning method when developing a prediction model for a specific toxicity data set.
Most existing machine learning methods share the problem of overtraining and overfitting when applied to high-dimensional, complex nonlinear problems, because they usually need to estimate and optimize many hyperparameters. It is well known that the complexity of a model often grows linearly with the dimension of the data, and thus some form of post-processing is required to reduce the computational complexity. To address this problem, the relevance vector machine (RVM) method introduces Bayesian criteria into the learning process: it employs a sparse prior to prune the irrelevant support vectors of the decision boundary in feature space and accordingly yields a sparser model. In contrast to the closely related support vector machine (SVM), in RVM regression the penalty parameter C and the insensitive-loss parameter ε are evaluated automatically, and error bars are obtained through the covariance function. Meanwhile, RVM has a generalization ability comparable to that of SVM, and the training samples that retain nonzero weights are more prototypical of the data than the support vectors of SVM. Therefore, RVM may be a good choice for QSAR modelling.
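For orientation, the sparse Bayesian model behind RVM regression can be written in its standard textbook form, as sketched below. This formulation is not taken from the present paper; the symbols \(w_i\), \(\alpha_i\), \(\sigma_{\epsilon}^{2}\), \(t_n\) and \(K\) are introduced here only for illustration, and \(\sigma_{\epsilon}^{2}\) denotes the noise variance, not the kernel width σ.

```latex
\[
y(\mathbf{x}; \mathbf{w}) = w_0 + \sum_{i=1}^{N} w_i \, K(\mathbf{x}, \mathbf{x}_i),
\qquad
t_n = y(\mathbf{x}_n; \mathbf{w}) + \epsilon_n,
\quad \epsilon_n \sim \mathcal{N}\left(0, \sigma_{\epsilon}^{2}\right)
\]
\[
p(\mathbf{w} \mid \boldsymbol{\alpha}) = \prod_{i=0}^{N} \mathcal{N}\left(w_i \mid 0, \alpha_i^{-1}\right)
\]
```

Maximizing the marginal likelihood with respect to the hyperparameters \(\alpha_i\) drives many of them to very large values, which forces the corresponding weights to zero; the training points that keep nonzero weights are the relevance vectors.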
In this study, based on a large public data set containing 7385 rat oral acute toxicity data compiled in a previous study [10], RVM was employed to establish regression models for the prediction of oral acute toxicity in rats, and was compared with six other machine learning methods: SVM, k-nearest-neighbor regression (kNN), random forest (RF), local approximate Gaussian process (laGP), multilayer perceptron ensemble (MPLE), and eXtreme gradient boosting (XGBoost). The performance of all seven machine learning methods was assessed and compared in terms of the predictive power and application domains of the models on the external test set. Moreover, the possibility of achieving better prediction of rat oral acute toxicity by combining the predictions from multiple QSAR models was explored.
Methods
Data set of rat oral acute toxicity
The rat oral LD_{50} data set with 7385 unique organic molecules reported by Zhu et al. [10] was used in our study. The quality of the data set, originally collected from different sources, was carefully verified. The acute toxicity of each molecule was expressed as log[1/(mol/kg)] (or pLD_{50}).
The SMILES of the 7385 structures in the data set were converted into 3D structures and optimized in the Discovery Studio 2.5 molecular simulation package (DS 2.5) [16]. Here, 68 molecules were eliminated because some of their molecular descriptors could not be generated by the Molecular Operating Environment (MOE) 2009 molecular simulation package [17], and 3 molecules with pLD_{50} values higher than 7.0 or lower than 0, which lay far from the rest of the data, were removed. The final data set contained 7314 molecules, which were randomly re-split into a training set with 4938 (67.5 %) molecules and an external test set with 2376 (32.5 %) molecules by weighing the distribution of their pLD_{50} values.
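A split that respects the pLD_{50} distribution can be sketched as a binned (stratified) random split. This is only an illustration under stated assumptions: the text does not say how the distribution was weighed, so the equal-width bins, the number of bins, and the function name `stratified_split` are all hypothetical choices.

```python
import random


def stratified_split(pld50, test_fraction=0.325, n_bins=10, seed=0):
    """Split sample indices so train and test follow a similar pLD50 distribution.

    Assumption: equal-width bins over the observed pLD50 range; within each
    bin, a random share of roughly `test_fraction` goes to the test set.
    """
    rng = random.Random(seed)
    lo, hi = min(pld50), max(pld50)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    bins = {}
    for idx, value in enumerate(pld50):
        b = min(int((value - lo) / width), n_bins - 1)  # clamp the maximum value
        bins.setdefault(b, []).append(idx)
    train, test = [], []
    for members in bins.values():
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)
```

Splitting within bins keeps rare, highly toxic compounds represented in both sets, which a plain random split does not guarantee.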
Calculation of molecular descriptors and molecular fingerprints
Originally, 334 descriptors characterizing the physicochemical properties, molecular representations, and drug-like properties of the studied molecules were calculated by using MOE. The descriptors that had all-zero values or zero variance were removed. Then, the correlations across all pairs of descriptors were calculated, and the redundant descriptors whose correlation (r) with any other descriptor was higher than the predefined threshold (0.95) were removed. Finally, 230 descriptors were chosen for QSAR modeling. In addition, molecular fingerprints, which characterize the substructure features of a molecule, were used. Two sets of fingerprints, the PubChem fingerprint (PubchemFP) with 881 substructure patterns and the substructure fingerprint (SubFP) with 307 substructure patterns, were generated by the PaDEL-Descriptor software [18].
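The descriptor pre-filtering described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the greedy keep-first ordering and the use of the absolute value of r are assumptions, and the names `pearson` and `filter_descriptors` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def filter_descriptors(columns, r_max=0.95):
    """Return the names of descriptors kept after both filters.

    columns: dict mapping descriptor name -> list of values (one per molecule).
    Assumption: descriptors are visited in dict order and a descriptor is
    dropped if |r| with any already-kept descriptor exceeds r_max.
    """
    kept = []
    for name, values in columns.items():
        if max(values) == min(values):  # constant column: zero variance or all zeros
            continue
        if any(abs(pearson(values, columns[k])) > r_max for k in kept):
            continue
        kept.append(name)
    return kept
```

Because the filter is greedy, which member of a highly correlated pair survives depends on the visiting order; a different order would keep a different but equally non-redundant set.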
Dimension reduction by Chi squared statistics
QSAR modeling by machine learning approaches
Some important parameters used in QSAR modeling

| Model | Hyperparameters |
| --- | --- |
| kNN | The number of nearest neighbors k = 1–10 |
| RF | The number of predictors at each node = 105, the number of trees = 230 |
| SVM (RBF) | The kernel width σ = 0.03125, the penalty parameter C = 2, and ε in the loss function = 0.05 |
| RVM (Laplace) | The kernel width σ = 0.044 |
| laGP | The initial value of lengthscale = 5, the initial value of nugget = 0.1 |
| MPLE | The number of individual perceptrons = 18, the number of units in the hidden layer = 5–8 |
| XGBoost | Step size shrinkage = 0.1, maximum depth of a tree = 7, the max number of iterations = 69 |

Relevance vector machine (RVM)
Support vector machine (SVM) regression
Support vector machine (SVM), developed within the framework of Vapnik–Chervonenkis theory [33, 34], is one of the most popular machine learning methods used in QSAR modeling [35]. Although SVM was originally developed for classification, it can also be used for regression (or function approximation). In the case of regression, the objective is to find a hyperplane with a small norm while simultaneously minimizing the sum of the distances from the data points to the hyperplane. In this study, the Gaussian radial basis function (RBF) was used as the kernel, and grid search was employed for the optimization of the kernel parameter σ [36]. The penalty parameter C of the error term was set to 2, and the insensitive parameter ε in the loss function was set to 0.05.
k-Nearest-neighbor (kNN) regression
kNN is a nonparametric learning approach for classification and regression based on the closest training examples in the feature space [37, 38]. The feature selection, the number k of nearest neighbors, and the shape of the distance weighting function determine the performance of a kNN model. Here, each molecule was eliminated from the training set in turn and its pLD_{50} value was predicted as the inverse-distance-weighted average activity of the k most similar molecules, where the value of k was optimized as well (k = 1–10).
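The inverse-distance-weighted kNN prediction and the leave-one-out scheme described above can be sketched as below. The Euclidean metric, the small constant `eps` that guards against division by zero, and the function names are assumptions made for illustration; the text does not specify the distance measure.

```python
import math


def knn_predict(query, train_x, train_y, k=5, eps=1e-9):
    """Predict a value as the inverse-distance-weighted mean of the k nearest neighbors."""
    # Pair each training point's distance to the query with its target value.
    nearest = sorted(
        (math.dist(query, x), y) for x, y in zip(train_x, train_y)
    )[:k]
    weights = [1.0 / (d + eps) for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


def leave_one_out(train_x, train_y, k=5):
    """Predict every training molecule from the remaining ones."""
    predictions = []
    for i in range(len(train_x)):
        rest_x = train_x[:i] + train_x[i + 1:]
        rest_y = train_y[:i] + train_y[i + 1:]
        predictions.append(knn_predict(train_x[i], rest_x, rest_y, k))
    return predictions
```

Repeating `leave_one_out` for k = 1 to 10 and keeping the k with the best cross-validated score is one simple way to carry out the optimization of k mentioned in the text.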
Random forest (RF)
Random forest (RF) is an ensemble learning method that combines multiple decision trees and yields consensus predictions from the individual trees [39, 40]. It randomly samples the data from the training set to construct individual trees. Each node of a tree is split using the best of a subset of descriptors randomly chosen at that node. Here, a 10-puzzle heuristic searching method was used to determine the optimal parameters in RF modelling. The number of predictors sampled for splitting at each node was set to 105, and the number of trees to grow was set to 230.
Local approximate Gaussian process regression (laGP)
laGP is a parallel approximate Gaussian Process (GP) regression algorithm for big data [41, 42]. The approximation is based on finding small local designs for independent prediction at particular inputs. A Gaussian process can be used as a prior probability distribution over functions in Bayesian inference, with finite-dimensional distributions defined by a mean µ(x) and a positive definite covariance \(K\left( {x,x^{\prime } } \right)\) for p-dimensional inputs x and \(x^{\prime }\). For smoothing noisy data, a nugget (η/g) can be added to \(K\left( {x,x^{\prime } } \right)\) of the isotropic process. The method involves approximating the predictive equations at the local designs X_{n}(x) close to a particular generic location x, and then calculating the local maximum-likelihood estimate. Two parameters, lengthscale (θ) and nugget (η), are quite important in Gaussian process predictive modeling. The optimum values of lengthscale and nugget are reached by looping over each x and collecting approximate predictive equations to maximize the posterior. In this study, the initial values of lengthscale and nugget were set to 5 and 0.1, respectively.
Multilayer-perceptron networks ensemble (MPLE)
Multilayer-perceptron network (MPL) is an artificial feed-forward neural network model in which information moves forward from the input nodes, through all hidden nodes, to the output nodes without loops. An MPLE model consists of multiple layers of neuron units, usually interconnected in a feed-forward way [43, 44]. Each neuron in one layer directly connects to the neurons of the subsequent layer, and each neuron is a perceptron with multiple layers of neuron units. To minimize the loss function, optimization is done via the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm. In this study, a softmax function (log-linear model) was used as the activation function. The number of individual perceptrons was 18, and the number of units in the hidden layer was 5–8.
eXtreme gradient boosting (XGBoost)
Gradient boosting is a machine learning technique that constructs an ensemble of decision trees, and XGBoost is an efficient and scalable implementation of the gradient boosting framework [45, 46]. It builds the model in a sequential stage-wise fashion, as other boosting methods do, and generalizes them by allowing optimization of an arbitrary differentiable loss function. In this study, the default parameters (step size shrinkage = 0.1, maximum depth of a tree = 7, and the max number of iterations = 69) were used.
Evaluation and validation of the regression models
The conventional coefficient of determination R^{2} (\(q_{ext}^{2}\)) was used to evaluate the predictive power of each model on the external test set. The acceptability thresholds of q^{2} for the training set and \(q_{ext}^{2}\) for the test set were both set to ≥0.5. A model was considered overfitted when the difference between \(R_{adj}^{2}\) and \(q_{ext}^{2}\) was higher than 0.3 [47, 48].
Moreover, two other parameters, mean absolute error (MAE) and root mean square error (RMSE), were used to evaluate the quality of each model.
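The validation statistics above can be computed as in the following sketch. It assumes that \(q_{ext}^{2}\) takes the conventional R^{2} form, 1 − SS_res/SS_tot, with the mean taken over the observed test-set values; some QSAR studies use the training-set mean instead, and the exact equation used here is not reproduced in this text. All function names are hypothetical.

```python
import math


def q2_ext(y_obs, y_pred):
    """Conventional R^2 on an external set: 1 - SS_res / SS_tot.

    Assumption: SS_tot uses the mean of the observed test-set values.
    """
    mean_obs = sum(y_obs) / len(y_obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in y_obs)
    return 1.0 - ss_res / ss_tot


def mae(y_obs, y_pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(y_obs, y_pred)) / len(y_obs)


def rmse(y_obs, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(y_obs, y_pred)) / len(y_obs))


def is_overfitted(r2_adj, q2_ext_value, max_gap=0.3):
    """Apply the overfitting rule from the text: R2_adj - q2_ext > 0.3."""
    return (r2_adj - q2_ext_value) > max_gap
```

MAE and RMSE are in pLD_{50} units, so they can be read directly as typical prediction errors on the log scale, whereas \(q_{ext}^{2}\) is dimensionless.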
Analysis of application domain (AD)
Scaffold analysis of molecules with large prediction errors
The scaffolds for the 249 molecules with large prediction errors (MAE > 1.0) were examined systematically. The scaffolds for each molecule were characterized by four representations: Murcko frameworks, ring assemblies, bridge assemblies, and the side chains attached to Murcko frameworks. Murcko frameworks, developed by Bemis [54], were primarily used to characterize the cyclic substructures of molecules. The definitions of these four scaffold representations have been described in previous studies [55, 56]. The scaffolds were generated by using the Generate Fragments component in Pipeline Pilot 7.5. The frequency of each scaffold architecture was counted, and the scaffolds were sorted by frequency. Finally, for each scaffold with a frequency equal to or larger than 2, its occurrences in the training and test sets were counted.
Results and discussions
Property distributions of rat oral acute toxicity data
SlogP and logS are both related to hydrophobicity. As shown in Fig. 3, the SlogP and logS values for 90 % of the compounds in the data set were less than 8 and 2, respectively, and neither showed any meaningful correlation with rat oral toxicity (R^{2} = 0.039 and 0.057). Meanwhile, 90 % of the compounds in the data set had a MW smaller than 500, and the correlation analysis showed that MW had a slightly higher, though still weak, correlation with rat oral toxicity (R^{2} = 0.108). a_acc and TPSA are commonly used to represent hydrophilicity, and as shown in Fig. 4, they had even weaker correlations with rat oral toxicity (R^{2} = 0.029 and 0.031) than the descriptors related to hydrophobicity. The parameter vdw_vol accounts for the size or bulk of a molecule, and it had a low correlation with rat oral toxicity (R^{2} = 0.045). KierFlex and b_rotN characterize the flexibility of a molecule, and neither of them correlated with rat oral toxicity (R^{2} = 0.022 and 0.005). Evidently, no single descriptor showed a high correlation with rat oral toxicity, and therefore rat oral toxicity could not be reliably predicted from one or a few molecular descriptors.
Comparison of various regression models for rat oral acute toxicity
Table 2 Statistical results for the QSAR models based on 120 descriptors and Pubchem fingerprints for the test set

| Model | R_{adj}^{2} | q^{2} | \(q_{ext}^{2}\) | RMSE_{train} | MAE_{train} | RMSE_{test} | MAE_{test} | AD coverage (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| kNN | 0.783 | 0.774 | 0.602 | 0.413 | 0.299 | 0.707 | 0.398 | 51.4 |
| RF | 0.949 | 0.922 | 0.639 | 0.242 | 0.171 | 0.707 | 0.544 | 81.7 |
| SVM | 0.923 | 0.915 | 0.627 | 0.253 | 0.119 | 0.688 | 0.507 | 58.6 |
| RVM | 0.936 | 0.935 | 0.644 | 0.221 | 0.172 | 0.680 | 0.511 | 62.9 |
| laGP | 0.775 | 0.756 | 0.614 | 0.430 | 0.322 | 0.713 | 0.550 | 72.2 |
| MPLE | 0.716 | 0.693 | 0.580 | 0.482 | 0.349 | 0.743 | 0.572 | 78.4 |
| XGBoost | 0.920 | 0.903 | 0.624 | 0.271 | 0.205 | 0.700 | 0.533 | 74.5 |
| Consensus | 0.923 | NA | 0.676 | 0.278 | 0.208 | 0.666 | 0.504 | 71.7 |
| Consensus (Except MPLE) | 0.933 | NA | 0.678 | 0.257 | 0.194 | 0.661 | 0.499 | 68.9 |

Table 3 Statistical results for the QSAR models based on 150 descriptors and Pubchem fingerprints for the test set

| Model | R_{adj}^{2} | q^{2} | \(q_{ext}^{2}\) | RMSE_{train} | MAE_{train} | RMSE_{test} | MAE_{test} | AD coverage (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| kNN | 0.885 | 0.878 | 0.585 | 0.303 | 0.217 | 0.718 | 0.415 | 41.0 |
| RF | 0.932 | 0.905 | 0.639 | 0.239 | 0.171 | 0.709 | 0.547 | 82.7 |
| SVM | 0.953 | 0.948 | 0.606 | 0.199 | 0.086 | 0.710 | 0.527 | 67.0 |
| RVM | 0.942 | 0.941 | 0.640 | 0.212 | 0.165 | 0.684 | 0.516 | 64.4 |
| laGP | 0.789 | 0.768 | 0.605 | 0.418 | 0.315 | 0.720 | 0.551 | 73.6 |
| MPLE | 0.654 | 0.633 | 0.572 | 0.527 | 0.382 | 0.754 | 0.580 | 83.4 |
| XGBoost | 0.920 | 0.907 | 0.622 | 0.271 | 0.205 | 0.707 | 0.538 | 74.7 |
| Consensus | 0.922 | NA | 0.669 | 0.284 | 0.215 | 0.676 | 0.515 | 75.7 |
| Consensus (Except MPLE) | 0.934 | NA | 0.669 | 0.258 | 0.197 | 0.671 | 0.509 | 72.9 |

Table 4 Statistical results for the QSAR models based on 120 descriptors and Substructural fingerprints for the test set

| Model | R_{adj}^{2} | q^{2} | \(q_{ext}^{2}\) | RMSE_{train} | MAE_{train} | RMSE_{test} | MAE_{test} | AD coverage (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| kNN | 0.815 | 0.805 | 0.636 | 0.383 | 0.277 | 0.674 | 0.364 | 46.1 |
| RF | 0.942 | 0.914 | 0.645 | 0.239 | 0.172 | 0.691 | 0.525 | 76.2 |
| SVM | 0.681 | 0.668 | 0.617 | 0.501 | 0.323 | 0.701 | 0.516 | 63.3 |
| RVM | 0.934 | 0.933 | 0.655 | 0.224 | 0.172 | 0.662 | 0.498 | 56.4 |
| laGP | 0.767 | 0.745 | 0.634 | 0.438 | 0.328 | 0.693 | 0.530 | 71.1 |
| MPLE | 0.679 | 0.656 | 0.596 | 0.509 | 0.374 | 0.729 | 0.558 | 77.0 |
| XGBoost | 0.920 | 0.902 | 0.644 | 0.272 | 0.205 | 0.681 | 0.516 | 67.7 |
| Consensus | 0.888 | NA | 0.687 | 0.330 | 0.249 | 0.654 | 0.495 | 69.9 |
| Consensus (Except MPLE) | 0.897 | NA | 0.689 | 0.314 | 0.237 | 0.652 | 0.493 | 68.5 |

Table 5 Statistical results for the QSAR models based on 150 descriptors and Substructural fingerprints for the test set

| Model | R_{adj}^{2} | q^{2} | \(q_{ext}^{2}\) | RMSE_{train} | MAE_{train} | RMSE_{test} | MAE_{test} | AD coverage (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| kNN | 0.859 | 0.851 | 0.642 | 0.335 | 0.241 | 0.667 | 0.358 | 41.8 |
| RF | 0.942 | 0.923 | 0.646 | 0.241 | 0.172 | 0.693 | 0.527 | 77.8 |
| SVM | 0.751 | 0.736 | 0.638 | 0.446 | 0.272 | 0.682 | 0.500 | 58.4 |
| RVM | 0.938 | 0.937 | 0.659 | 0.218 | 0.168 | 0.660 | 0.495 | 55.9 |
| laGP | 0.761 | 0.741 | 0.635 | 0.442 | 0.331 | 0.692 | 0.528 | 68.8 |
| MPLE | 0.651 | 0.630 | 0.591 | 0.528 | 0.384 | 0.735 | 0.563 | 79.2 |
| XGBoost | 0.922 | 0.904 | 0.635 | 0.269 | 0.203 | 0.687 | 0.521 | 67.4 |
| Consensus | 0.894 | NA | 0.689 | 0.323 | 0.242 | 0.652 | 0.493 | 68.8 |
| Consensus (Except MPLE) | 0.904 | NA | 0.690 | 0.303 | 0.228 | 0.646 | 0.487 | 65.8 |

As shown in Tables 2, 3, 4, and 5, the MPLE models gave the lowest \(q_{ext}^{2}\) (0.572–0.596) and the highest RMSE (0.729–0.754) and MAE (0.558–0.580) values for the test set, suggesting that they had the worst prediction capabilities. Meanwhile, their R_{adj}^{2} values (0.651–0.716) for the training set were always the lowest. As far as we know, our study is the first application of MPLE in QSAR modeling, and therefore we cannot judge the predictive power of MPLE for other QSAR problems. However, according to our results, MPLE was not a good choice for this specific toxicity data set.
laGP is a parallelized version of the approximate Gaussian Process algorithm. Based on the molecular descriptors and the PubchemFP fingerprint, the predictive power of the laGP models (\(q_{ext}^{2}\) = 0.605 or 0.614) was slightly better than that of the kNN models (\(q_{ext}^{2}\) = 0.585 or 0.602) and slightly worse than that of the SVM models (\(q_{ext}^{2}\) = 0.606 or 0.627). However, based on the molecular descriptors and the SubFP fingerprint, the predictive power of the laGP models (\(q_{ext}^{2}\) = 0.634 or 0.635) was slightly worse than that of the kNN models (\(q_{ext}^{2}\) = 0.636 or 0.642) and comparable to that of the SVM models (\(q_{ext}^{2}\) = 0.617 or 0.638). Therefore, overall, laGP, kNN and SVM performed similarly for this specific toxicity endpoint.
The RVM method is quite similar to the SVM algorithm in many aspects, but it can provide a fully probabilistic output. However, up to now, little information on RVM applications in QSAR modeling has been reported in the literature. According to the data shown in Tables 2, 3, 4, and 5, the RVM models (\(q_{ext}^{2}\) = 0.640–0.659) were consistently better than the SVM models (\(q_{ext}^{2}\) = 0.606–0.638). Moreover, we found that RVM modeling was more computationally efficient than SVM modeling, because RVM did not need to estimate the error/margin trade-off parameter C, which might reduce the computational cost. Given its better prediction accuracy and higher computational efficiency compared with SVM, we believe that RVM has promising potential for practical use in QSAR modeling in the future.
The AD coverages for the established models are summarized in Tables 2, 3, 4, and 5. The kNN models always showed the smallest AD coverage for the test set. Compared with the other models, the MPLE and RF models showed relatively larger AD coverages, but the RF models gave better predictions for the test set than the MPLE models. Therefore, according to the \(q_{ext}^{2}\) and AD coverage, the RF models would give the best predictions for this data set.
In this study, two well-defined substructural fingerprints (SubFP and PubchemFP) were used. According to the predictions for the test set, the models based on the SubFP fingerprint (Tables 4, 5) were better than those based on the PubchemFP fingerprint (Tables 2, 3). It is possible that some fragments in SubFP were more closely related to acute toxicity than those in PubchemFP.
Accurate prediction of rat oral acute toxicity by consensus modeling
The statistical results showed that the theoretical models built with different machine learning methods had different prediction capabilities and model uncertainties. A useful way to reduce model uncertainty is consensus modeling, in which the outputs from multiple models are averaged [69–71]. Since the consensus prediction is based on multiple different but comparable QSAR models, it may capture the relationship between the chemical structures of the molecules and the endpoint more effectively than a single model. Here, four consensus models were first developed by simply averaging the predictions for the test set given by the individual models shown in Tables 2, 3, 4, and 5. All individual models contributed equally, which avoided overemphasizing or underweighting any single machine learning approach. The statistical results clearly illustrated that the consensus models had higher predictive accuracy (\(q_{ext}^{2}\) = 0.669–0.689) than any individual model. In addition, by comparing the MAEs given by the consensus versus individual models using the Wilcoxon test, we found that the improvement of the consensus models over all individual models was statistically significant (p < 0.01).
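The equal-weight averaging described above can be illustrated with a short sketch. The function name `consensus` is hypothetical; the example values are the seven SubFP predictions listed for molecule 1 in the table of the 20 molecules with the largest prediction errors, whose reported consensus value is 2.558.

```python
def consensus(predictions):
    """Average the predictions of several models, molecule by molecule.

    predictions: dict mapping model name -> list of predicted pLD50 values,
    with all lists in the same molecule order. Every model has equal weight.
    """
    per_model = list(predictions.values())
    return [sum(values) / len(values) for values in zip(*per_model)]


# SubFP-based predictions for molecule 1 (one molecule, seven models).
molecule_1 = {
    "kNN": [2.286],
    "RF": [2.614],
    "SVM": [2.756],
    "RVM": [2.534],
    "laGP": [2.614],
    "MPLE": [2.566],
    "XGBoost": [2.533],
}
print(round(consensus(molecule_1)[0], 3))  # 2.558
```

Dropping one entry from the dictionary, for example "MPLE", gives the corresponding reduced consensus without any other change to the code.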
Analysis of molecules with large prediction errors
As mentioned above, most prediction models performed well on the test set, but some molecules in the test set could not be well predicted by any model or even by all models. If MAE > 1.0 was used as the criterion, the MAE of the chemicals with large prediction errors given by all individual models in Table 5 ranged from 1.002 to 3.486 for the test set. In total, 575 molecules could not be well predicted by any individual model in Table 5, and 249 molecules could not be well predicted by the best consensus model in Table 5. For these 249 molecules with large prediction errors, the average experimental pLD_{50} value was 3.321, which was clearly higher than that of the molecules in the training set (2.558). Therefore, the predictions for molecules with higher pLD_{50} values were worse than those for molecules with lower pLD_{50} values.
Experimental and predicted LD_{50} values for the 20 tested molecules with the largest prediction errors

| No. | Structure | Exp.^{a} | Fingerprints | kNN | RF | SVM | RVM | laGP | MPLE | XGBoost | Cons.^{b} |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 |  | 5.957 | PubchemFP | 3.433 | 3.349 | 2.772 | 3.177 | 2.962 | 2.889 | 2.780 | 3.052 |
|  |  |  | SubFP | 2.286 | 2.614 | 2.756 | 2.534 | 2.614 | 2.566 | 2.533 | 2.558 |
| 2 |  | 5.513 | PubchemFP | 2.590 | 2.529 | 2.461 | 2.694 | 2.514 | 2.743 | 2.638 | 2.596 |
|  |  |  | SubFP | 2.648 | 2.574 | 2.912 | 2.700 | 2.787 | 2.551 | 2.852 | 2.718 |
| 3 |  | 5.658 | PubchemFP | 3.807 | 2.838 | 3.307 | 3.301 | 3.742 | 2.948 | 2.869 | 3.259 |
|  |  |  | SubFP | 3.340 | 2.777 | 3.070 | 2.848 | 2.867 | 2.733 | 3.008 | 2.949 |
| 4 |  | 5.406 | PubchemFP | 1.819 | 2.830 | 2.371 | 2.727 | 2.481 | 2.801 | 3.047 | 2.582 |
|  |  |  | SubFP | 2.976 | 2.609 | 2.839 | 2.637 | 2.736 | 2.755 | 2.662 | 2.745 |
| 5 |  | 5.446 | PubchemFP | 3.057 | 2.833 | 2.683 | 2.789 | 2.764 | 2.828 | 2.991 | 2.849 |
|  |  |  | SubFP | 3.756 | 2.974 | 2.897 | 2.775 | 2.842 | 2.972 | 2.617 | 2.976 |
| 6 |  | 5.310 | PubchemFP | 2.588 | 2.595 | 2.589 | 2.861 | 1.843 | 2.817 | 2.564 | 2.551 |
|  |  |  | SubFP | 2.536 | 2.609 | 3.427 | 3.001 | 2.943 | 2.912 | 2.487 | 2.845 |
| 7 |  | 5.307 | PubchemFP | 4.251 | 3.110 | 2.705 | 2.941 | 3.333 | 2.797 | 2.692 | 3.118 |
|  |  |  | SubFP | 3.171 | 3.319 | 3.094 | 2.487 | 2.706 | 2.568 | 3.074 | 2.917 |
| 8 |  | 6.402 | PubchemFP | 3.136 | 3.148 | 2.672 | 2.915 | 3.189 | 3.726 | 2.868 | 3.093 |
|  |  |  | SubFP | 4.263 | 3.692 | 4.778 | 4.332 | 4.084 | 4.073 | 3.292 | 4.073 |
| 9 |  | 6.159 | PubchemFP | 3.510 | 3.481 | 2.882 | 3.666 | 3.407 | 3.075 | 3.528 | 3.364 |
|  |  |  | SubFP | 3.693 | 3.885 | 4.097 | 3.825 | 3.771 | 3.496 | 4.417 | 3.883 |
| 10 |  | 5.170 | PubchemFP | 2.641 | 2.758 | 3.202 | 3.004 | 3.119 | 2.816 | 2.821 | 2.909 |
|  |  |  | SubFP | 2.944 | 2.811 | 3.041 | 2.976 | 2.964 | 2.757 | 2.999 | 2.927 |
| 11 |  | 4.019 | PubchemFP | 1.302 | 2.003 | 2.101 | 2.018 | 1.876 | 2.123 | 1.960 | 1.912 |
|  |  |  | SubFP | 1.326 | 1.871 | 1.944 | 1.860 | 2.030 | 1.955 | 1.867 | 1.836 |
| 12 |  | 4.780 | PubchemFP | 2.685 | 2.642 | 2.686 | 2.598 | 2.793 | 2.529 | 2.517 | 2.636 |
|  |  |  | SubFP | 2.666 | 2.548 | 2.980 | 2.663 | 2.640 | 2.454 | 2.540 | 2.642 |
| 13 |  | 4.762 | PubchemFP | 2.221 | 2.610 | 2.278 | 2.255 | 2.129 | 2.719 | 2.606 | 2.403 |
|  |  |  | SubFP | 2.861 | 2.498 | 2.692 | 2.854 | 2.758 | 2.642 | 2.563 | 2.695 |
| 14 |  | 4.538 | PubchemFP | 2.284 | 2.576 | 2.483 | 2.533 | 2.575 | 2.921 | 2.165 | 2.505 |
|  |  |  | SubFP | 2.055 | 2.570 | 2.747 | 2.492 | 2.433 | 2.818 | 2.512 | 2.518 |
| 15 |  | 5.006 | PubchemFP | 2.336 | 3.003 | 2.825 | 2.895 | 3.046 | 3.151 | 2.701 | 2.851 |
|  |  |  | SubFP | 2.889 | 2.945 | 3.324 | 2.832 | 2.789 | 3.357 | 2.920 | 3.008 |
| 16 |  | 5.225 | PubchemFP | 3.292 | 3.740 | 3.101 | 3.310 | 3.369 | 3.088 | 3.369 | 3.324 |
|  |  |  | SubFP | 3.580 | 3.144 | 3.396 | 3.200 | 3.322 | 2.991 | 3.153 | 3.255 |
| 17 |  | 1.740 | PubchemFP | 3.764 | 3.228 | 2.744 | 3.348 | 3.435 | 2.901 | 3.180 | 3.229 |
|  |  |  | SubFP | 4.029 | 3.657 | 3.312 | 3.659 | 4.052 | 2.756 | 4.250 | 3.674 |
| 18 |  | 2.140 | PubchemFP | 4.421 | 3.509 | 3.397 | 3.555 | 3.886 | 3.466 | 3.482 | 3.674 |
|  |  |  | SubFP | 4.914 | 3.898 | 3.355 | 3.919 | 4.375 | 3.246 | 3.606 | 3.902 |
| 19 |  | 0.291 | PubchemFP | 2.156 | 2.179 | 1.847 | 1.606 | 2.095 | 2.113 | 2.216 | 2.030 |
|  |  |  | SubFP | 1.923 | 2.230 | 1.509 | 1.760 | 2.088 | 2.053 | 2.192 | 1.965 |
| 20 |  | 1.163 | PubchemFP | 3.159 | 2.766 | 2.636 | 2.635 | 3.018 | 2.793 | 1.973 | 2.711 |
|  |  |  | SubFP | 3.743 | 2.757 | 2.547 | 2.829 | 3.033 | 2.669 | 2.047 | 2.804 |

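The representative scaffolds found in the tested molecules with large prediction errors (MAE > 1.0)

| No. | Scaffold | Training set N | Training set pLD_{50}^{a} | Training set MAE | Test set N | Test set pLD_{50}^{a} | Test set MAE | Large-error molecules N | Large-error molecules pLD_{50}^{a} | Large-error molecules MAE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 |  | 3 | 3.421 | 0.579 | 3 | 2.689 | 1.941 | 3 | 2.689 | 1.941 |
| 2 |  | 1 | 3.977 | 1.253 | 5 | 3.070 | 1.170 | 2 | 3.008 | 1.892 |
| 3 |  | 4 | 4.140 | 0.444 | 3 | 4.005 | 1.498 | 2 | 3.972 | 1.836 |
| 4 |  | 16 | 3.053 | 0.414 | 6 | 3.270 | 0.896 | 2 | 3.867 | 1.103 |
| 5 |  | 8 | 3.122 | 0.354 | 4 | 2.919 | 1.268 | 2 | 3.067 | 1.890 |
| 6 |  | 3 | 2.831 | 0.287 | 3 | 2.617 | 1.089 | 2 | 2.674 | 1.340 |
| 7 |  | 13 | 2.916 | 0.360 | 12 | 3.098 | 0.954 | 7 | 3.293 | 1.364 |
| 8 |  | 2 | 3.162 | 0.517 | 3 | 2.689 | 1.941 | 3 | 2.689 | 1.941 |
| 9 |  | 61 | 2.706 | 0.165 | 12 | 2.720 | 0.917 | 4 | 2.801 | 1.829 |
| 10 |  | 0 | – | – | 3 | 2.540 | 1.114 | 2 | 2.456 | 1.396 |
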
Analysis of important descriptors and fragments given by RVM regression models
Statistical results for the descriptors and fingerprints used in QSAR modelling

| Molecular descriptors | 120 (Descriptor + PubchemFP) | 120 (Descriptor + SubFP) | 150 (Descriptor + PubchemFP) | 150 (Descriptor + SubFP) |
| --- | --- | --- | --- | --- |
| 2D |  |  |  |  |
| Physical properties | 6 | 7 | 7 | 7 |
| Subdivided surface areas | 8 | 9 | 10 | 11 |
| Atom counts and bond counts | 10 | 10 | 10 | 11 |
| Kier&Hall connectivity and kappa shape indices | 7 | 7 | 8 | 8 |
| Adjacency and distance matrix descriptors | 11 | 10 | 13 | 14 |
| Pharmacophore feature descriptors | 4 | 4 | 5 | 5 |
| Partial charge descriptors | 19 | 20 | 25 | 27 |
| 3D |  |  |  |  |
| Potential energy descriptors | 2 | 1 | 5 | 4 |
| Mopac descriptors | 15 | 15 | 15 | 15 |
| Surface area, volume and shape descriptors | 30 | 30 | 37 | 38 |
| Conformation dependent charge descriptors | 4 | 5 | 6 | 6 |
| Fingerprints (PubchemFP) | 4 | – | 9 | – |
| Fingerprints (SubFP) | – | 2 | – | 4 |

Nine PubchemFP fragment alerts and representative structures

| No. | Fingerprint | Fragment | Description | Bit substructure | R_{adj}^{2} change | Cramer’s V | Representative structure |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Positive fragment alerts |  |  |  |  |  |  |  |
| 1 | PubchemFP400 |  | Detailed atom neighborhoods | N(~H)(:C)(:C) | 0.00086 | 0.15810 |  |
| 2 | PubchemFP359 |  | Simple atom nearest neighbors | C(~C)(:N)(:N) | 0.00162 | 0.15641 |  |
| 4 | PubchemFP770 |  | Complex SMARTS patterns | Nc1c(N)cccc1 | 0.00167 | 0.14722 |  |
| 5 | PubchemFP833 |  | Complex SMARTS patterns | NC1C(N)CCCC1 | 0.00100 | 0.14504 |  |
| 6 | PubchemFP527 |  | Simple SMARTS patterns | C:C:N[#1] | 0.00036 | 0.13989 |  |
| Negative fragment alerts |  |  |  |  |  |  |  |
| 1 | PubchemFP15 | Counts of N ≥ 2 | Hierarchic element counts | ≥2 N | 0.00174 | 0.16108 |  |
| 2 | PubchemFP442 |  | Detailed atom neighborhoods | C(–C)(=N) | 0.00024 | 0.15474 |  |
| 3 | PubchemFP418 |  | Simple SMARTS patterns | C=N | 0.00094 | 0.13903 |  |
|  | PubchemFP14 | Counts of N ≥ 1 | Hierarchic element counts | ≥1 N | 0.00002 | 0.13650 |  |

Four SubFP fragment alerts and representative structures

| No. | Fingerprint | Fragment | Description | SMILES | R_{adj}^{2} change | Cramer’s V | Representative structure |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | SubFP294 |  | Trifluoromethyl | [FX1][CX4;!$([H0][Cl,Br,I]);!$([F][C]([F])([F])[F])]([FX1])([FX1]) | 0.00173 | 0.15737 |  |
| 2 | SubFP9 |  | Alkylfluoride | [FX1][CX4] | 0.00024 | 0.15386 |  |
| 3 | SubFP179 |  | Hetero N basic H | [nX3H1+0] | 0.00161 | 0.14669 |  |
| 4 | SubFP275 |  | Heterocyclic | [!#6;!R0] | 0.00002 | 0.14306 |  |

The PubchemFP fragments found in the models are relatively small, but they may be important components of toxicophores that are not defined in the fingerprint dictionary. Among the SubFP fragment alerts, trifluoromethyl and alkylfluoride are frequent constituents of toxic substances, whereas hetero N and heterocycle may be only background noise in the models, or they may be parts of toxic substructures not defined in the fingerprint dictionary. As mentioned in the previous literature [14, 74, 75], some toxic chemicals contain trifluoromethyl- and alkylfluoride-bearing fragments such as 2-(trifluoromethyl)benzimidazole, which are not defined in the fingerprint dictionary and are substructures of many antitumor drugs, antibiotics, antiparasitics and ionic liquids [76–80]. In addition, some important substructures in toxicophores, such as organophosphates, organochlorines and norbornene derivatives, do not exist in the PubchemFP dictionary. Phosphonic groups can be found in the SubFP dictionary, but they occur in only a limited number of molecules and were therefore eliminated during dimension reduction. Our calculations suggest that more specific and diverse fingerprints are essential for toxicity QSAR modeling.
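The Cramer’s V values reported for the fragment alerts can be illustrated with a minimal sketch. It assumes that V is computed from a contingency table of fragment presence versus a categorical toxicity label; the exact table used in this study is not stated in this section, and the function name `cramers_v` and the counts below are hypothetical and not taken from the paper.

```python
import math


def cramers_v(table):
    """Cramer's V for an r x c contingency table given as a list of rows.

    Assumes every row and every column has a non-zero total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-squared statistic: sum of (observed - expected)^2 / expected
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Normalise by the sample size and the smaller table dimension minus one
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Hypothetical counts (NOT from the paper):
# rows = fragment present / absent, columns = toxic / non-toxic
example = [[30, 70],
           [150, 750]]
print(round(cramers_v(example), 4))
```

V ranges from 0 (no association) to 1 (perfect association), so values around 0.14 to 0.16, as in the tables above, would indicate a weak association between a single fragment and the label under this assumption.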
Conclusions
In this study, on the basis of a comprehensive data set of rat oral acute toxicity, we examined the relationships between eight important molecular properties and acute toxicity. We observed that rat oral toxicity cannot be reliably predicted from a single molecular property or from a small number of properties. Seven machine learning approaches were then used to establish QSAR models for oral acute toxicity. In terms of overall prediction accuracy for the test set, the RF and RVM methods outperformed the others. The consensus models, which integrate the outputs of multiple individual models, showed better predictivity (\(q_{ext}^{2}\) = 0.669–0.689) than any individual model for the test set. Our study also demonstrated that QSAR modeling based on structural fingerprints can identify potentially important substructural fragments as toxicity alerts, provided that an appropriate and sufficiently large fingerprint dictionary is adopted. Scaffold analysis showed that scaffolds represented by only a few molecules in the training set reduce the prediction accuracy of the models. On the basis of these results, we believe that the modeling methods used here could be employed for other toxicity endpoints.
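The idea of consensus modeling can be sketched as follows. This is only an illustration under stated assumptions: the consensus is taken as an unweighted average of the individual predictions, and \(q_{ext}^{2}\) is computed against the training-set mean. Both choices, the function names and all numbers are hypothetical and may differ from the scheme actually used in this study.

```python
def consensus_predict(predictions_by_model):
    """Average the per-molecule predictions of several models (equal weights)."""
    n_models = len(predictions_by_model)
    return [sum(values) / n_models for values in zip(*predictions_by_model)]


def q2_ext(y_test, y_pred, y_train_mean):
    """External q^2: 1 - SS_residual / SS_total, with SS_total taken
    around the training-set mean (an assumed definition)."""
    ss_res = sum((y - p) ** 2 for y, p in zip(y_test, y_pred))
    ss_tot = sum((y - y_train_mean) ** 2 for y in y_test)
    return 1.0 - ss_res / ss_tot


# Hypothetical test-set values and predictions (NOT from the paper)
y_test = [2.0, 3.0, 4.0, 5.0]
y_train_mean = 3.5
model_a = [2.4, 2.8, 4.6, 4.8]
model_b = [1.8, 3.4, 3.6, 5.4]

consensus = consensus_predict([model_a, model_b])
for name, pred in [("model_a", model_a), ("model_b", model_b), ("consensus", consensus)]:
    print(name, round(q2_ext(y_test, pred, y_train_mean), 3))
```

In this toy case the errors of the two models partly cancel, so the averaged prediction scores higher than either model alone. The same mechanism is one plausible reason why a consensus model can outperform its individual members.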
Declarations
Authors’ contributions
TL and TH conceived and designed the experiments. TL and DL performed the simulations. TL, YS, DL, HS and YL analyzed the data. TL, HS, YL and TH wrote the manuscript. All authors read and approved the manuscript.
Acknowledgements
This study was supported by the National Science Foundation of China (21575128, 81302679), the “Construction of Shanghai Municipal NCB medical rescue system” Project of Shanghai Municipal Commission of Health and Family Planning, and the Special Program for National Basic Work on Science and Technology (2015FY111400) of the Ministry of Science and Technology of China. We would like to thank Dr. Hao Zhu and Alexander Tropsha for providing the valuable dataset of rat oral LD_{50} values.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
References
1. Parasuraman S (2011) Toxicological screening. J Pharmacol Pharmacother 2(2):74
2. Nicolotti O, Benfenati E, Carotti A, Gadaleta D, Gissi A, Mangiatordi GF, Novellino E (2014) REACH and in silico methods: an attractive opportunity for medicinal chemists. Drug Discov Today 19(11):1757–1768
3. Benz RD (2007) Toxicological and clinical computational analysis and the US FDA/CDER. Expert Opin Drug Metab Toxicol 3(1):109–124
4. Creton S, Dewhurst IC, Earl LK, Gehen SC, Guest RL, Hotchkiss JA, Indans I, Woolhiser MR, Billington R (2009) Acute toxicity testing of chemicals—opportunities to avoid redundant testing and use alternative approaches. Crit Rev Toxicol 40(1):50–83
5. Cheng F, Li W, Liu G, Tang Y (2013) In silico ADMET prediction: recent advances, current challenges and future trends. Curr Top Med Chem 13(11):1273–1289
6. Merlot C (2010) Computational toxicology—a tool for early safety evaluation. Drug Discov Today 15(1–2):16–22
7. Kruhlak NL, Benz RD, Zhou H, Colatsky TJ (2012) (Q)SAR modeling and safety assessment in regulatory review. Clin Pharmacol Ther 91(3):529–534
8. Zhu H, Zhang J, Kim MT, Boison A, Sedykh A, Moran K (2014) Big data in chemical toxicity research: the use of high-throughput screening assays to identify potential toxicants. Chem Res Toxicol 27(10):1643–1651
9. Diaza RG, Manganelli S, Esposito A, Roncaglioni A, Manganaro A, Benfenati E (2015) Comparison of in silico tools for evaluating rat oral acute toxicity. SAR QSAR Environ Res 26(1):1–27
10. Zhu H, Martin TM, Ye L, Sedykh A, Young DM, Tropsha A (2009) Quantitative structure-activity relationship modeling of rat acute toxicity by oral exposure. Chem Res Toxicol 22(12):1913–1921
11. Xu C, Cheng F, Chen L, Du Z, Li W, Liu G, Lee PW, Tang Y (2012) In silico prediction of chemical Ames mutagenicity. J Chem Inf Model 52(11):2840–2847
12. Zang Q, Rotroff DM, Judson RS (2013) Binary classification of a large collection of environmental chemicals from estrogen receptor assays by quantitative structure-activity relationship and machine learning methods. J Chem Inf Model 53(12):3244–3261
13. Raevsky OA, Grigor’Ev VJ, Modina EA, Worth AP (2010) Prediction of acute toxicity to mice by the Arithmetic Mean Toxicity (AMT) modelling approach. SAR QSAR Environ Res 21(3–4):265–275
14. Lu J, Peng J, Wang J, Shen Q, Bi Y, Gong L, Zheng M, Luo X, Zhu W, Jiang H et al (2014) Estimation of acute oral toxicity in rat using local lazy learning. J Cheminform 6:26
15. Sheridan RP, Feuston BP, Maiorov VN, Kearsley SK (2004) Similarity to molecules in the training set is a good discriminator for prediction accuracy in QSAR. J Chem Inf Model 44(6):1912–1928
16. Discovery Studio 2.5 Guide. Accelrys Inc., San Diego, CA, USA. http://www.accelrys.com
17. MOE molecular simulation package. Chemical Computing Group Inc., Montreal, Canada. http://www.chemcomp.com
18. Yap CW (2011) PaDEL-descriptor: an open source software to calculate molecular descriptors and fingerprints. J Comput Chem 32(7):1466–1474
19. Bura E, Cook RD (2001) Extending sliced inverse regression. J Am Stat Assoc 96(455):996–1003
20. Dittman DJ, Khoshgoftaar TM, Wald R, Napolitano A (2012) Comparing two new gene selection ensemble approaches with the commonly-used approach. In: 2012 11th International conference on machine learning and applications (ICMLA), vol 2. Boca Raton, FL, pp 184–191
21. Varma M, Zisserman A (2009) A statistical approach to material classification using image patch exemplars. IEEE Trans Pattern Anal Mach Intell 31(11):2032–2047
22. Chan CH, Tahir MA, Kittler J, Pietikainen M (2013) Multiscale local phase quantization for robust component-based face recognition using kernel fusion of multiple descriptors. IEEE Trans Pattern Anal Mach Intell 35(5):1164–1177
23. Gao YF, Li BQ, Cai YD, Feng KY, Li ZD, Jiang Y (2013) Prediction of active sites of enzymes by maximum relevance minimum redundancy (mRMR) feature selection. Mol BioSyst 9(1):61–69
24. Martin TM, Harten P, Young DM, Muratov EN, Golbraikh A, Zhu H, Tropsha A (2012) Does rational selection of training and test sets improve the outcome of QSAR modeling? J Chem Inf Model 52(10):2570–2578
25. Eklund M, Norinder U, Boyer S, Carlsson L (2014) Choosing feature selection and learning algorithms in QSAR. J Chem Inf Model 54(3):837–843
26. Tian S, Wang J, Li Y, Xu X, Hou T (2012) Drug-likeness analysis of traditional Chinese medicines: prediction of drug-likeness using machine learning approaches. Mol Pharmaceut 9(10):2875–2886
27. Chen L, Li Y, Yu H, Zhang L, Hou T (2012) Computational models for predicting substrates or inhibitors of P-glycoprotein. Drug Discov Today 17(7–8):343–351
28. Hou T, Wang J (2008) Structure–ADME relationship: still a long way to go? Expert Opin Drug Metab Toxicol 4(6):759–770
29. Cortez P (2010) Data mining with neural networks and support vector machines using the R/rminer tool. In: Petra Perner (ed) Advances in data mining—applications and theoretical aspects, vol 6171. Springer, Berlin, pp 572–583
30. Bischl B (2015) The mlr package: machine learning in R. https://github.com/berndbischl/mlr
31. Tipping ME (2001) Sparse Bayesian learning and the relevance vector machine. J Mach Learn Res 1(3):211–244
32. Burden FR, Winkler DA (2015) Relevance vector machines: sparse classification methods for QSAR. J Chem Inf Model 55(8):1529–1534
33. Hou T, Wang J, Li Y (2007) ADME evaluation in drug discovery. 8. The prediction of human intestinal absorption by a support vector machine. J Chem Inf Model 47(6):2408–2415
34. Zhou S, Li GB, Huang LY, Xie HZ, Zhao YL, Chen YZ, Li LL, Yang SY (2014) A prediction model of drug-induced ototoxicity developed by an optimal support vector machine (SVM) method. Comput Biol Med 51:122–127
35. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297
36. Cortez P (2014) Modern optimization with R. Springer, New York
37. Itskowitz P, Tropsha A (2005) kappa Nearest neighbors QSAR modeling as a variational problem: theory and applications. J Chem Inf Model 45(3):777–785
38. Solimeo R, Zhang J, Kim M, Sedykh A, Zhu H (2012) Predicting chemical ocular toxicity using a combinatorial QSAR approach. Chem Res Toxicol 25(12):2763–2769
39. Svetnik V, Liaw A, Tong C, Culberson JC, Sheridan RP, Feuston BP (2003) Random forest: a classification and regression tool for compound classification and QSAR modeling. J Chem Inf Comput Sci 43(6):1947–1958
40. Sheridan RP (2013) Using random forest to model the domain applicability of another random forest model. J Chem Inf Model 53(11):2837–2850
41. Obrezanova O, Csanyi G, Gola JM, Segall MD (2007) Gaussian processes: a method for automatic QSAR modeling of ADME properties. J Chem Inf Model 47(5):1847–1857
42. Gramacy RB, Apley DW (2015) Local Gaussian process approximation for large computer experiments. J Comput Graph Stat 24(2):561–578
43. Gonzalez-Arjona D, Lopez-Perez G, Gustavo GA (2002) Nonlinear QSAR modeling by using multilayer perceptron feedforward neural networks trained by backpropagation. Talanta 56(1):79–90
44. Speck-Planche A, Kleandrova VV, Cordeiro MN (2013) Chemoinformatics for rational discovery of safe antibacterial drugs: simultaneous predictions of biological activity against streptococci and toxicological profiles in laboratory animals. Bioorg Med Chem 21(10):2727–2732
45. Friedman JH (2001) Greedy function approximation: a gradient boosting machine. Ann Stat 29(5):1189–1232
46. Singh KP, Gupta S (2014) In silico prediction of toxicity of non-congeneric industrial chemicals using ensemble learning based modeling approaches. Toxicol Appl Pharmacol 275(3):198–212
47. Eriksson L, Jaworska J, Worth AP, Cronin MT, McDowell RM, Gramatica P (2003) Methods for reliability and uncertainty assessment and for applicability evaluations of classification and regression-based QSARs. Environ Health Perspect 111(10):1361–1375
48. Tropsha A (2010) Best practices for QSAR model development, validation, and exploitation. Mol Inform 29(6–7):476–488
49. Weaver S, Gleeson MP (2008) The importance of the domain of applicability in QSAR modeling. J Mol Graph Model 26(8):1315–1326
50. Kaneko H, Funatsu K (2014) Applicability domain based on ensemble learning in classification and regression analysis. J Chem Inf Model 54(9):2469–2482
51. Sushko I, Novotarskyi S, Korner R, Pandey AK, Cherkasov A, Li J, Gramatica P, Hansen K, Schroeter T, Muller KR et al (2010) Applicability domains for classification problems: benchmarking of distance to models for Ames mutagenicity set. J Chem Inf Model 50(12):2094–2111
52. Sushko I, Novotarskyi S, Korner R, Pandey AK, Kovalishyn VV, Prokopenko VV, Tetko IV (2010) Applicability domain for in silico models to achieve accuracy of experimental measurements. J Chemom 24(3–4):202–208
53. Tetko IV, Sushko I, Pandey AK, Zhu H, Tropsha A, Papa E, Oberg T, Todeschini R, Fourches D, Varnek A (2008) Critical assessment of QSAR models of environmental toxicity against Tetrahymena pyriformis: focusing on applicability domain and overfitting by variable selection. J Chem Inf Model 48(9):1733–1746
54. Bemis GW, Murcko MA (1996) The properties of known drugs. 1. Molecular frameworks. J Med Chem 39(15):2887–2893
55. Tian S, Wang J, Li Y, Li D, Xu L, Hou T (2015) The application of in silico drug-likeness predictions in pharmaceutical research. Adv Drug Delivery Rev 86:2–10
56. Tian S, Li Y, Wang J, Xu X, Xu L, Wang X, Chen L, Hou T (2013) Drug-likeness analysis of traditional Chinese medicines: 2. Characterization of scaffold architectures for drug-like compounds, non-drug-like compounds, and natural compounds from traditional Chinese medicines. J Cheminform 5(1):5
57. Wildman SA, Crippen GM (1999) Prediction of physicochemical parameters by atomic contributions. J Chem Inf Comput Sci 39(5):868–873
58. Serafimova R, Todorov M, Pavlov T, Kotov S, Jacob E, Aptula A, Mekenyan O (2007) Identification of the structural requirements for mutagenicity, by incorporating molecular flexibility and metabolic activation of chemicals. II. General Ames mutagenicity model. Chem Res Toxicol 20(4):662–676
59. Narayana Moorthy NSH, Sousa SF, Ramos MJ, Fernandes PA (2011) In silico-based structural analysis of arylthiophene derivatives for FTase inhibitory activity, hERG, and other toxic effects. J Biomol Screen 16(9):1037–1046
60. Moore DR, Breton RL, MacDonald DB (2003) A comparison of model performance for six quantitative structure-activity relationship packages that predict acute toxicity to fish. Environ Toxicol Chem 22(8):1799–1809
61. Wang S, Li Y, Wang J, Chen L, Zhang L, Yu H, Hou T (2012) ADMET evaluation in drug discovery. 12. Development of binary classification models for prediction of hERG potassium channel blockage. Mol Pharm 9(4):996–1010
62. Wang Y, Zhao C, Ma W, Liu H, Wang T, Jiang G (2006) Quantitative structure-activity relationship for prediction of the toxicity of polybrominated diphenyl ether (PBDE) congeners. Chemosphere 64(4):515–524
63. Funar-Timofei S, Ionescu D, Suzuki T (2010) A tentative quantitative structure-toxicity relationship study of benzodiazepine drugs. Toxicol In Vitro 24(1):184–200
64. Zhu J, Wang J, Yu H, Li Y, Hou T (2011) Recent developments of in silico predictions of oral bioavailability. Comb Chem High Throughput Screen 14(5):362–374
65. Hou T, Li Y, Zhang W, Wang J (2009) Recent developments of in silico predictions of intestinal absorption and oral bioavailability. Comb Chem High Throughput Screen 12(5):497–506
66. Chen B, Sheridan RP, Hornak V, Voigt JH (2012) Comparison of random forest and Pipeline Pilot Naive Bayes in prospective QSAR predictions. J Chem Inf Model 52(3):792–803
67. Eklund M, Norinder U, Boyer S, Carlsson L (2014) Choosing feature selection and learning algorithms in QSAR. J Chem Inf Model 54(3):837–843
68. Lavecchia A (2015) Machine-learning approaches in drug discovery: methods and applications. Drug Discov Today 20(3):318–331
69. Lei B, Li J, Yao X (2013) A novel strategy of structural similarity based consensus modeling. Mol Inform 32(7):599–608
70. Lei B, Xi L, Li J, Liu H, Yao X (2009) Global, local and novel consensus quantitative structure-activity relationship studies of 4-(phenylaminomethylene)isoquinoline-1,3(2H,4H)-diones as potent inhibitors of the cyclin-dependent kinase 4. Anal Chim Acta 644(1):17–24
71. Li J, Lei B, Liu H, Li S, Yao X, Liu M, Gramatica P (2008) QSAR study of malonyl-CoA decarboxylase inhibitors using GA-MLR and a new strategy of consensus modeling. J Comput Chem 29(16):2636–2647
72. Cortez P, Embrechts MJ (2013) Using sensitivity analysis and visualization techniques to open black box data mining models. Inform Sci (N Y) 225:1–17
73. Oh DS, Troester MA, Usary J, Hu Z, He X, Fan C, Wu J, Carey LA, Perou CM (2006) Estrogen-regulated genes predict survival in hormone receptor-positive breast cancers. J Clin Oncol 24(11):1656–1664
74. Li X, Chen L, Cheng F, Wu Z, Bian H, Xu C, Li W, Liu G, Shen X, Tang Y (2014) In silico prediction of chemical acute oral toxicity using multi-classification methods. J Chem Inf Model 54(4):1061–1069
75. Bhhatarai B, Gramatica P (2011) Oral LD50 toxicity modeling and prediction of per- and polyfluorinated chemicals on rat and mouse. Mol Divers 15(2):467–476
76. Andrzejewska M, Yepez-Mulia L, Cedillo-Rivera R, Tapia A, Vilpo L, Vilpo J, Kazimierczuk Z (2002) Synthesis, antiprotozoal and anticancer activity of substituted 2-trifluoromethyl and 2-pentafluoroethylbenzimidazoles. Eur J Med Chem 37(12):973–978
77. Kazimierczuk Z, Andrzejewska M, Kaustova J, Klimesova V (2005) Synthesis and antimycobacterial activity of 2-substituted halogenobenzimidazoles. Eur J Med Chem 40(2):203–208
78. Navarrete-Vazquez G, Rojano-Vilchis MM, Yepez-Mulia L, Melendez V, Gerena L, Hernandez-Campos A, Castillo R, Hernandez-Luis F (2006) Synthesis and antiprotozoal activity of some 2-(trifluoromethyl)-1H-benzimidazole bioisosteres. Eur J Med Chem 41(1):135–141
79. Perez-Villanueva J, Santos R, Hernandez-Campos A, Giulianotti MA, Castillo R, Medina-Franco JL (2011) Structure–activity relationships of benzimidazole derivatives as antiparasitic agents: dual activity-difference (DAD) maps. MedChemComm 2(1):44–49
80. Paterno A, D’Anna F, Musumarra G, Noto R, Scire S (2014) A multivariate insight into ionic liquids toxicities. RSC Adv 4(46):23985–24000