Table 1 Machine learning methods used in the experiments, with the abbreviations used in the remainder of the work

From: The influence of the inactives subset generation on the performance of machine learning methods

| Classifier | Classification scheme | Parameters |
| --- | --- | --- |
| Naïve Bayes (NB) | bayes | - |
| Sequential Minimal Optimization (SMO) | functions | Complexity parameter: 1; epsilon for round-off error: 1.0E-12; training data normalized. Kernels: (1) normalized polynomial kernel, (2) polynomial kernel, (3) RBF kernel |
| Instance-Based Learning (IBk) | lazy | Brute-force search algorithm for nearest-neighbour search, with the Euclidean distance function. Number of neighbours used: (1) 1, (2) 5, (3) 10, (4) 20 |
| Decorate | meta | One artificial example used during training; number of member classifiers in the Decorate ensemble: 10; maximum number of iterations: 10. Base classifiers: (1) NaïveBayes, (2) J48 |
| Hyperpipes | misc | - |
| J48 | trees | (1) With reduced-error pruning, (2) with C4.5 pruning |
| Random Forest (RF) | trees | Trees with unlimited depth; seed number: 1. Number of generated trees: (1) 5, (2) 10, (3) 50, (4) 100 |

  1. Bolded parameters correspond to those providing the best results for a particular machine learning method (see the Results section and Additional file 1: Figure S1).
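
The classification scheme labels in the table (bayes, functions, lazy, meta, misc, trees) match the package layout of the WEKA toolkit, so these configurations can be reproduced directly through its Java API. The sketch below is a minimal illustration for three of the rows, assuming a WEKA 3.6-era API (the version current when this work appeared); the parameter values are copied from the table, while the class and method names `Table1Classifiers`, `configureSmo`, `configureIbk`, and `configureRandomForest` are our own illustrative wrappers, not part of the original study.

```java
// Illustrative WEKA (3.6-era API) configuration of three classifiers
// from Table 1; parameter values are copied from the table.
import weka.classifiers.functions.SMO;
import weka.classifiers.functions.supportVector.NormalizedPolyKernel;
import weka.classifiers.lazy.IBk;
import weka.classifiers.trees.RandomForest;
import weka.core.EuclideanDistance;
import weka.core.SelectedTag;
import weka.core.neighboursearch.LinearNNSearch;

public class Table1Classifiers {

    public static SMO configureSmo() {
        SMO smo = new SMO();
        smo.setC(1.0);                        // complexity parameter
        smo.setEpsilon(1.0e-12);              // epsilon for round-off error
        // normalize the training data
        smo.setFilterType(new SelectedTag(SMO.FILTER_NORMALIZE, SMO.TAGS_FILTER));
        smo.setKernel(new NormalizedPolyKernel()); // kernel variant 1 of 3
        return smo;
    }

    public static IBk configureIbk(int k) throws Exception {
        IBk ibk = new IBk();
        ibk.setKNN(k);                        // 1, 5, 10 or 20 in the experiments
        LinearNNSearch bruteForce = new LinearNNSearch(); // brute-force (linear) search
        bruteForce.setDistanceFunction(new EuclideanDistance());
        ibk.setNearestNeighbourSearchAlgorithm(bruteForce);
        return ibk;
    }

    public static RandomForest configureRandomForest(int numTrees) {
        RandomForest rf = new RandomForest();
        rf.setNumTrees(numTrees);             // 5, 10, 50 or 100 in the experiments
        rf.setMaxDepth(0);                    // 0 = unlimited depth
        rf.setSeed(1);                        // seed number: 1
        return rf;
    }
}
```

Each configured classifier is then trained with `buildClassifier(Instances)`; the remaining methods (NaïveBayes, Decorate, Hyperpipes, J48) are set up the same way through their corresponding classes under `weka.classifiers`.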