Table 3 The supported algorithms and related parameters

From: ChemSAR: an online pipelining platform for molecular SAR modeling

| Algorithms | Parameters | Recommended parameters |
| --- | --- | --- |
| RandomForest | n_estimators: the number of trees in the forest; max_features: the number of features to consider when looking for the best split (start_feature, end_feature and step define the max_features values to try); cv: cross-validation fold | n_estimators: 500; max_features: sqrt(N), where N is the number of features; cv: 5 |
| SVM | kernel type: rbf, sigmoid, poly, linear; C: penalty parameter C of the error term; gamma: kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’; degree: degree of the polynomial kernel function; cv: cross-validation fold | C: 2^−5, 2^15, 2^2 (format: start, end, step); gamma: 2^−15, 2^3, 2^2; degree: 1, 7, 2; cv: 5 |
| Naïve Bayes | Bayes classifier type; cv: cross-validation fold | BernoulliNB for binary-valued variables; GaussianNB for continuous variables; cv: 5 |
| K Neighbors | n_neighbors: number of neighbors to use; cv: cross-validation fold | n_neighbors: 1–10; cv: 5 |
| DecisionTree | Algorithm: algorithm used to compute the nearest neighbors (‘ball_tree’, ‘kd_tree’, ‘brute’); cv: cross-validation fold | Automatic decision; cv: 5 |
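
The "start, end, step" entries in the table can be read as grids of candidate values. The following is a minimal sketch of that reading, not ChemSAR's own code. It assumes that the end point is inclusive and that the step for C and gamma is multiplicative (so 2^2 means the exponent grows by 2), neither of which the table states outright. The helper names `power_of_two_grid` and `linear_grid`, the `recommended` dictionary, and the feature count of 100 are hypothetical and used only for illustration.

```python
import math


def power_of_two_grid(start_exp, end_exp, step_exp):
    """Return 2**e for e = start_exp, start_exp + step_exp, ... up to end_exp.

    Assumption: the end point is inclusive and the step is applied to the exponent.
    """
    return [2.0 ** e for e in range(start_exp, end_exp + 1, step_exp)]


def linear_grid(start, end, step):
    """Return start, start + step, ... up to end (assumed inclusive)."""
    return list(range(start, end + 1, step))


# Hypothetical number of molecular descriptors, used only to evaluate sqrt(N).
n_features = 100

# Candidate values derived from the "Recommended parameters" column.
recommended = {
    "RandomForest": {
        "n_estimators": [500],
        "max_features": [max(1, round(math.sqrt(n_features)))],
    },
    "SVM": {
        "C": power_of_two_grid(-5, 15, 2),       # 2^-5, 2^-3, ..., 2^15
        "gamma": power_of_two_grid(-15, 3, 2),   # 2^-15, 2^-13, ..., 2^3
        "degree": linear_grid(1, 7, 2),          # 1, 3, 5, 7
    },
    "KNeighbors": {
        "n_neighbors": linear_grid(1, 10, 1),    # 1 through 10
    },
}

# Cross-validation fold count recommended for every algorithm in the table.
CV_FOLDS = 5

if __name__ == "__main__":
    for algorithm, grid in recommended.items():
        sizes = {name: len(values) for name, values in grid.items()}
        print(algorithm, sizes)
```

Each inner dictionary maps a parameter name to a list of candidate values, which is the shape that scikit-learn's `GridSearchCV` accepts as `param_grid`; pairing it with `cv=CV_FOLDS` would mirror the recommended 5-fold setting, assuming a scikit-learn estimator with matching parameter names.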