Table 1 Major differences between QSAR-Co and QSAR-Co-X

From: QSAR-Co-X: an open source toolkit for multitarget QSAR modelling

| No | Utility | QSAR-Co | QSAR-Co-X | Remarks |
|---|---|---|---|---|
| 1 | Feature selection | One (GA) | Two (FS and SFS) | |
| 2 | Reproducibility of linear modelling | Low | High | Given the same sample size and number of descriptors, GA produces different LDA models on different runs, whereas FS and SFS always yield the same model |
| 3 | Diagnosis of intercollinearity among variables | Not available | Available and automatically performed | Very helpful for ascertaining the robustness of the derived linear models |
| 4 | Dataset division options | Random, Kennard-Stone, Euclidean-based | Random, pre-defined, k-MCA | Since only the random division option is fast, the other QSAR-Co options were replaced to reduce computational time |
| 5 | Automatic generation of the validation set | Not available | Available | Unlike QSAR-Co, QSAR-Co-X allows generating both the screening and validation sets |
| 6 | Statistical parameters for the validation set | Manual calculations are required | Automatic calculation | Automatic calculation allows fast selection of the models |
| 7 | Number of Box-Jenkins operators available | One (pre-defined) | Four (three pre-defined and one user-specific) | Additional and more flexible operators were added to QSAR-Co-X |
| 8 | Yc randomisation | Not available | Available | A modified form of the Y-randomisation technique that incorporates the influence of experimental elements |
| 9 | Machine-learning tools | One (RF only) | Six (kNN, SVM, RF, NB, GB, and MLP) | QSAR-Co-X affords several non-linear modelling tools |
| 10 | Number of parameters that may be altered in RF modelling | 5 | 8 | QSAR-Co-X offers more flexibility for setting up RF models |
| 11 | Comparative analysis of multiple ML methods | Not possible | Possible | Useful for deciding which ML method performs best |
| 12 | Hyperparameter tuning options for ML methods | Not available | Available | Extremely useful for finding optimised non-linear models |
| 13 | User-specific parameter settings for building non-linear models | For RF only | For kNN, SVM, RF, NB, GB, and MLP | |
| 14 | Display of ROC plots (linear modelling) | For sub-training and test sets | For sub-training, test and validation sets | |
| 15 | Condition-wise prediction | Not available | Available | Useful for understanding how the developed model performs against individual experimental conditions, particularly for large datasets |
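
Row 2 contrasts the run-to-run variability of GA with the reproducibility of FS and SFS. The sketch below illustrates why a greedy forward search is reproducible: it has no random component, so the same data always give the same descriptor subset. It is a minimal illustration only and not the QSAR-Co-X implementation. The function names `centroid_accuracy` and `forward_selection`, the toy data, and the use of a nearest-centroid classifier as a stand-in for LDA are all assumptions made for this example.

```python
from statistics import mean


def centroid_accuracy(X, y, features):
    """Training accuracy of a nearest-centroid classifier using only `features`.

    Hypothetical scoring function: a simple stand-in for the LDA-based
    criteria a real tool would use.
    """
    classes = sorted(set(y))
    # Per-class mean of each selected descriptor.
    centroids = {
        c: [mean(row[f] for row, label in zip(X, y) if label == c) for f in features]
        for c in classes
    }
    correct = 0
    for row, label in zip(X, y):
        point = [row[f] for f in features]
        # Squared Euclidean distance from the sample to each class centroid.
        dist = {
            c: sum((p - m) ** 2 for p, m in zip(point, centroids[c]))
            for c in classes
        }
        predicted = min(classes, key=lambda c: dist[c])
        correct += predicted == label
    return correct / len(y)


def forward_selection(X, y, n_select):
    """Greedily add the descriptor that most improves the score.

    There is no randomness: ties are broken by the lowest column index,
    because max() returns the first maximal element it encounters.
    """
    selected = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < n_select:
        best = max(remaining, key=lambda f: centroid_accuracy(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (invented for illustration): column 1 separates the two classes.
X = [
    [0.5, 1.0, 0.2],
    [0.1, 1.2, 0.9],
    [0.9, 0.9, 0.1],
    [0.4, 3.0, 0.8],
    [0.6, 3.1, 0.3],
    [0.2, 2.9, 1.0],
]
y = [0, 0, 0, 1, 1, 1]

first_run = forward_selection(X, y, 2)
second_run = forward_selection(X, y, 2)
print(first_run, first_run == second_run)
```

A GA-based search, by contrast, depends on a random initial population and random mutation and crossover steps, so two runs can end on different descriptor subsets unless the random seed is fixed.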