Table 2 Costs chosen by the cross-validation for linear SVM using LIBLINEAR for the different training set sizes

From: Large-scale ligand-based predictive modelling using support vector machines

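| Solubility: training set size | Solubility: found cost | logD: training set size | logD: found cost |
|---|---|---|---|
| 100 | 100,000 | 100 | 10,000 |
| | 2 | | 0.01 |
| | 0.1 | | 0.005 |
| 1000 | 0.05 | 1000 | 10,000,000 |
| | 0.05 | | 1000 |
| | 0.05 | | 0.1 |
| 5000 | 0.05 | 5000 | 0.5 |
| | 0.05 | | 0.75 |
| | 0.05 | | 0.1 |
| 10,000 | 0.05 | 10,000 | 100 |
| | 0.05 | | 2 |
| | 0.05 | | 0.25 |
| 20,000 | 0.1 | 20,000 | 0.5 |
| | 0.1 | | 0.5 |
| | 0.1 | | 1 |
| 32,096 | 0.1 | 80,000 | 0.5 |
| | 0.1 | | 0.5 |
| | 0.1 | | 0.75 |
| | | 160,000 | 0.75 |
| | | | 0.5 |
| | | | 0.5 |
| | | 320,000 | 0.75 |
| | | | 0.5 |
| | | | 0.5 |
| | | 1,188,343 | 0.5 |
| | | | 0.5 |
| | | | 1 |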

Note the highly variable found costs among the three replicates for the small training set sizes and the low variation among the replicates for the larger training set sizes.
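
One way to put a number on the variation among replicates is the ratio between the largest and smallest found cost at each training set size. The sketch below is illustrative only: the grouping of the values into three replicates per size is our reading of the table layout, and the helper names `cost_spread` and `spread_by_size`, as well as the choice of max/min as a spread measure, are assumptions rather than anything taken from the paper.

```python
# Found costs as read from Table 2: training set size -> costs of the three
# replicates. The per-replicate grouping is an assumption based on the layout.
SOLUBILITY = {
    100: [100_000, 2, 0.1],
    1000: [0.05, 0.05, 0.05],
    5000: [0.05, 0.05, 0.05],
    10_000: [0.05, 0.05, 0.05],
    20_000: [0.1, 0.1, 0.1],
    32_096: [0.1, 0.1, 0.1],
}
LOGD = {
    100: [10_000, 0.01, 0.005],
    1000: [10_000_000, 1000, 0.1],
    5000: [0.5, 0.75, 0.1],
    10_000: [100, 2, 0.25],
    20_000: [0.5, 0.5, 1],
    80_000: [0.5, 0.5, 0.75],
    160_000: [0.75, 0.5, 0.5],
    320_000: [0.75, 0.5, 0.5],
    1_188_343: [0.5, 0.5, 1],
}


def cost_spread(costs):
    """Return max/min of the replicate costs (1.0 means identical replicates)."""
    return max(costs) / min(costs)


def spread_by_size(table):
    """Map each training set size to the spread of its replicate costs."""
    return {size: cost_spread(costs) for size, costs in table.items()}


if __name__ == "__main__":
    for name, table in (("Solubility", SOLUBILITY), ("logD", LOGD)):
        print(name)
        for size, spread in spread_by_size(table).items():
            print(f"  {size:>9,d}  max/min = {spread:g}")
```

With this measure, the replicates at size 100 differ by about six orders of magnitude for both datasets, whereas from 20,000 training examples upward the largest cost is at most twice the smallest.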