From: ChemSAR: an online pipelining platform for molecular SAR modeling
Algorithms | Parameters | Recommended parameters |
---|---|---|
RandomForest | n_estimators: the number of trees in the forest; max_features: the number of features to consider when looking for the best split (start_feature, end_feature and step define the range of max_features values tried); cv: cross-validation fold | n_estimators: 500; max_features: sqrt(N), where N is the number of features; cv: 5 |
SVM | kernel type: rbf, sigmoid, poly, linear; C: penalty parameter of the error term; gamma: kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’; degree: degree of the polynomial kernel function; cv: cross-validation fold | C: 2^−5, 2^15, 2^2 (format: start, end, step); gamma: 2^−15, 2^3, 2^2; degree: 1, 7, 2; cv: 5 |
Naïve Bayes | Bayes classifier type; cv: cross-validation fold | BernoulliNB for binary-valued variables; GaussianNB for continuous variables; cv: 5 |
K Neighbors | n_neighbors: number of neighbors to use; cv: cross validation fold | n_neighbors: 1–10; cv: 5 |
DecisionTree | Algorithm: algorithm used to compute the nearest neighbors (‘ball_tree’, ‘kd_tree’, ‘brute’); cv: cross validation fold | Automatic decision; cv: 5 |
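
The "start, end, step" entries recommended for SVM can be read as a search grid. The sketch below expands them into explicit candidate values. It assumes that a step of 2^2 means the exponent increases by 2 at each step (the usual LIBSVM-style grid) and that the end value is included. Neither assumption is stated in the table, and the helper names `expand_pow2`, `expand_range` and `default_max_features` are hypothetical, not part of ChemSAR.

```python
import math


def expand_pow2(start_exp, end_exp, step_exp):
    """Return 2**e for e = start_exp, start_exp + step_exp, ..., end_exp.

    Assumption: the end exponent is inclusive.
    """
    return [2.0 ** e for e in range(start_exp, end_exp + 1, step_exp)]


def expand_range(start, end, step):
    """Return the integers start, start + step, ..., end (end assumed inclusive)."""
    return list(range(start, end + 1, step))


def default_max_features(n_features):
    """Recommended RandomForest setting: floor(sqrt(N)), never less than 1."""
    return max(1, math.isqrt(n_features))


# SVM grids from the "Recommended parameters" column.
c_grid = expand_pow2(-5, 15, 2)       # exponents -5, -3, ..., 15
gamma_grid = expand_pow2(-15, 3, 2)   # exponents -15, -13, ..., 3
degree_grid = expand_range(1, 7, 2)   # polynomial degrees

if __name__ == "__main__":
    print("C candidates:", len(c_grid))
    print("gamma candidates:", len(gamma_grid))
    print("degree candidates:", degree_grid)
    print("max_features for N = 100:", default_max_features(100))
```

Under these assumptions, an exhaustive search over C and gamma for the rbf kernel evaluates every pair of candidates, each with 5-fold cross-validation, so the cost grows with the product of the grid sizes.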