Table 5 The reported classification models for BCRP inhibitors and non-inhibitors

From: ADMET evaluation in drug discovery. 20. Prediction of breast cancer resistance protein inhibition through machine learning

| Year | Data size | Training set | Test set | Method | Descriptors | Model validation | Statistical results | Refs. |
|---|---|---|---|---|---|---|---|---|
| 2007 | 123 | 80 | 43 | OPLS-DA | Descriptors from SELMA software package | Y-rand | GA_TE = 0.79 | Matsson et al. [16] |
| 2009 | 122 | 83 | 39 | PLS-DA | Descriptors from DragonX version 3.0 | Y-randᵃ | NA | Matsson et al. [115] |
| 2013 | 109 | 30 | 79 | Pharmacophore modeling | NA | NA | MCC_TE = 0.29, GA_TE = 0.66 | Pan et al. [11] |
| 2013 | 203 | 124 | 79 | NB | ECFP_6, FCFP_6 fingerprints | LOO CV | AUC_TR(LOO CV) = 0.795, MCC_TE = 0.69 | Pan et al. [11] |
| 2013 | 382 | 382 | NA | SVM, k-NN, RF, and consensus modeling | Dragon, MOE descriptors | Fivefold CV, Y-rand | BA_TR(fivefold CV) = 0.83 ± 0.04 (Consensus) | Sedykh et al. [121] |
| 2014 | 275 | 96 | Test: 32, external set: 147 | Ensembles of ANN, ensembles of SVM | Descriptors from ADMET Modeler | NA | GA_TE = 0.87, GA_External = 0.67 (ensembles of ANN) | Eric et al. [122] |
| 2014 | 780 | 780 | NA | NB | ECFP_6 fingerprints | Tenfold CV | GA_TR(tenfold CV) = 0.919, AUC_TR(tenfold CV) = 0.854 | Montanari et al. [20] |
| 2015 | 394 | 197 | Test: 99, external set: 98 | SVM, k-NN, ANN, and consensus modeling | Dragon descriptors | NA | GA_TE = 0.878, MCC_TE = 0.73; GA_External = 0.745, MCC_External = 0.46 (ANN) | Belekar et al. [21] |
| 2016ᵃ | NA | NA | NA | GTM-kNNd, GTM-Bayes, RF, SVM, and k-NN | MOE descriptors | Fivefold CV with five repetitions | NA | Gimadiev et al. [123] |
| 2017 | 978 | 978 | NA | NB, LR, SVM, and RF | MACCS, Morgan, ECFP8 fingerprints, VolSurf descriptors | Tenfold CV, leave-sources-out validation | MCC_TR(tenfold CV) = 0.65, AUC_TR(tenfold CV) = 0.90 (LR) | Montanari et al. [22] |
| 2019 | 2799 | 2240 | 559 | NB, LR, SVM, k-NN, XGBoost, SGB, DNN, and consensus modeling | MOE descriptors and PubChem fingerprints | Fivefold CV | MCC_TE = 0.812, AUC_TE = 0.958, GA_TE = 0.911, BA_TE = 0.905 (SVM) | This study |
  1. Mean ± standard deviation across fivefold CV
  2. TR: training set; TE: test set; OPLS-DA: orthogonal partial least-squares projection to latent structures discriminant analysis; NA: not available; GA: global accuracy; Y-rand: Y-randomization test; PLS-DA: partial least-squares projection to latent structures discriminant analysis; NB: Naive Bayes; LOO CV: leave-one-out cross-validation; AUC: area under the receiver operating characteristic curve; MCC: Matthews correlation coefficient; SVM: support vector machine; k-NN: k-nearest neighbors; RF: random forest; CV: cross-validation; BA: balanced accuracy; ANN: artificial neural networks; GTM: generative topographic mapping; LR: logistic regression
  3. Many of the cited studies developed several models based on different methods or descriptors; only the best statistical results for the test set or cross-validation were extracted
  4. ᵃ The exact values are not available in the publication
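
The table reports GA, BA, and MCC without restating how they are computed. As a minimal sketch, assuming the standard binary-classification definitions (the cited studies may differ in details such as class weighting), the three metrics can be derived from confusion-matrix counts. The function name `classification_metrics` and the example counts are hypothetical and are not taken from any of the cited publications.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Compute GA, BA, and MCC from binary confusion-matrix counts.

    tp/fn: inhibitors predicted correctly/incorrectly,
    tn/fp: non-inhibitors predicted correctly/incorrectly.
    Assumes standard textbook definitions (an assumption, not the
    exact procedure of any study listed in the table).
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Global accuracy: fraction of all compounds classified correctly.
    ga = (tp + tn) / total

    # Balanced accuracy: mean of sensitivity and specificity.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    ba = (sensitivity + specificity) / 2

    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"GA": ga, "BA": ba, "MCC": mcc}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
print({name: round(value, 3) for name, value in metrics.items()})
```

GA and BA coincide when the two classes are the same size, as in the hypothetical counts above; on imbalanced data sets BA and MCC are usually the more informative figures.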