Table 2 Hyperparameter optimization search spaces for all the tested models

From: POSEIDON: Peptidic Objects SEquence-based Interaction with cellular DOmaiNs: a new database and predictor

| Model | Parameters | Package |
| --- | --- | --- |
| Support vector machine | Kernel: ["linear", "poly", "rbf", "sigmoid"]; C: [0.5, 1.0, 1.5]; Gamma: ["scale", "auto"] | scikit-learn |
| Stochastic gradient descent | Loss: "squared_error"; Penalty: ["l2", "l1", "elasticnet"]; Alpha: [0.00001, 0.0001, 0.001]; Learning rate: ["invscaling", "optimal", "constant", "adaptive"] | scikit-learn |
| k-nearest neighbors | N neighbors: [2, 3, 5, 7]; P: [1, 2]; Algorithm: ["auto", "ball_tree", "kd_tree", "brute"] | scikit-learn |
| Decision tree | Splitter: ["best", "random"]; Criterion: ["squared_error", "friedman_mse", "absolute_error"]; Maximum depth: [None, 3, 5, 10, 50, 100]; Minimum samples split: [2, 3, 5, 7, 10]; Minimum samples leaf: [2, 3, 5, 7, 10]; Minimum weight fraction leaf: [0.0, 0.25, 0.50]; Maximum features: ["auto", "sqrt", "log2", None] | scikit-learn |
| Random forest | Number of estimators: [10, 50, 100, 250]; Criterion: ["squared_error", "friedman_mse", "absolute_error"]; Maximum depth: [3, 5, 10, 50, 100]; Minimum samples split: [2, 3, 5, 7, 10]; Minimum samples leaf: [2, 3, 5, 7, 10]; Minimum weight fraction leaf: [0.0, 0.25, 0.50] | scikit-learn |
| Extremely randomized trees | Number of estimators: [10, 50, 100, 250]; Criterion: ["squared_error", "friedman_mse", "absolute_error"]; Maximum depth: [None, 3, 5, 10, 50, 100]; Minimum samples split: [2, 3, 5, 7, 10]; Minimum samples leaf: [2, 3, 5, 7, 10]; Minimum weight fraction leaf: [0.0, 0.25, 0.50] | scikit-learn |
| Extreme gradient boosting | Number of estimators: [10, 50, 100, 250]; Maximum depth: [None, 3, 5, 10, 50, 100]; Maximum leaves: [None, 1, 3, 5, 10, 25]; Learning rate: [None, 0.15, 0.3, 0.46, 0.60, 0.76, 0.90]; Booster: [None, "gbtree", "gblinear", "dart"]; Alpha: [0, 1, 3, 5]; Lambda: [1, 3, 5]; Gamma: [0, 1, 3, 5] | xgboost |
| Deep neural network | Depth: tune.qrandint(1, 10); Layer size: tune.qrandint(100, 1500, 100); Use dropout: tune.grid_search([True, False]); Dropout rate: tune.quniform(0.1, 0.9, q = 0.1); Epochs: tune.qrandint(100, 1000, 10); Learning rate: tune.quniform(0.00001, 0.001, q = 0.00001) | tensorflow |
| Forked neural network | Depth: tune.qrandint(1, 10); Dropout rate: tune.quniform(0.1, 0.9, q = 0.1); Use dropout: tune.grid_search([True, False]); Learning rate: tune.quniform(0.00001, 0.001, q = 0.00001); Experimental layer size: tune.qrandint(5, 50); Cargo layer size: tune.randint(25, 250); Sequence anomalies layer size: tune.qrandint(5, 200); Whole-peptide features layer size: tune.qrandint(10, 300); Sequence encoding layer size: tune.qrandint(100, 1000); Genomics layer size: tune.qrandint(100, 750); Anomalous position layer size: tune.qrandint(5, 50); Epochs: tune.qrandint(100, 1000, 10) | tensorflow |
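
The six scikit-learn grids translate directly into `GridSearchCV` dictionaries. Below is a minimal sketch for the support vector machine row; the estimator choice (`SVR`, since the squared-error criteria elsewhere in the table suggest a regression task), the 5-fold cross-validation, and the `X_train`/`y_train` names are assumptions, not part of the table. The other scikit-learn rows follow the same pattern with their respective estimators.

```python
# Minimal sketch: grid search over the SVM row of Table 2.
# Assumes a regression setting (SVR) and 5-fold CV, which the table
# itself does not specify.
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

param_grid = {
    "kernel": ["linear", "poly", "rbf", "sigmoid"],
    "C": [0.5, 1.0, 1.5],
    "gamma": ["scale", "auto"],
}

search = GridSearchCV(SVR(), param_grid, cv=5, n_jobs=-1)
# search.fit(X_train, y_train)   # X_train / y_train are placeholders
# print(search.best_params_)
```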
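The extreme gradient boosting row can be searched the same way through xgboost's scikit-learn wrapper. In this sketch the estimator (`XGBRegressor`) and the CV settings are assumptions; the table's Alpha and Lambda map to `reg_alpha` and `reg_lambda` in the wrapper, and `None` entries fall back to the library defaults.

```python
# Hedged sketch: the XGBoost grid from Table 2 via the sklearn wrapper.
from sklearn.model_selection import GridSearchCV
from xgboost import XGBRegressor

param_grid = {
    "n_estimators": [10, 50, 100, 250],
    "max_depth": [None, 3, 5, 10, 50, 100],
    "max_leaves": [None, 1, 3, 5, 10, 25],
    "learning_rate": [None, 0.15, 0.3, 0.46, 0.60, 0.76, 0.90],
    "booster": [None, "gbtree", "gblinear", "dart"],
    "reg_alpha": [0, 1, 3, 5],   # "Alpha" in the table
    "reg_lambda": [1, 3, 5],     # "Lambda" in the table
    "gamma": [0, 1, 3, 5],
}

search = GridSearchCV(XGBRegressor(), param_grid, cv=5, n_jobs=-1)
# search.fit(X_train, y_train)   # placeholders for the actual data
```

Note that the full Cartesian grid here runs to tens of thousands of fits; `RandomizedSearchCV` accepts the same dictionary as a drop-in alternative if an exhaustive search is too expensive.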
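The `tune.*` entries for the two neural networks are Ray Tune search-space primitives rather than TensorFlow ones. A minimal sketch of the deep neural network space follows; the trainable `train_dnn` and the `Tuner` invocation are assumptions added for illustration.

```python
# Sketch of the deep neural network search space in Ray Tune,
# mirroring the Table 2 row. The trainable is hypothetical.
from ray import tune

search_space = {
    "depth": tune.qrandint(1, 10),                # number of hidden layers
    "layer_size": tune.qrandint(100, 1500, 100),  # units, in steps of 100
    "use_dropout": tune.grid_search([True, False]),
    "dropout_rate": tune.quniform(0.1, 0.9, q=0.1),
    "epochs": tune.qrandint(100, 1000, 10),
    "learning_rate": tune.quniform(0.00001, 0.001, q=0.00001),
}

def train_dnn(config):
    # Hypothetical trainable: build a TensorFlow model with config["depth"]
    # dense layers of config["layer_size"] units (optional dropout), train
    # for config["epochs"], and report a validation metric back to Tune.
    ...

# tuner = tune.Tuner(train_dnn, param_space=search_space)
# results = tuner.fit()
```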
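The per-branch layer sizes in the forked neural network row imply a multi-input architecture with one dense branch per feature block (experimental conditions, cargo, sequence anomalies, whole-peptide features, sequence encoding, genomics, anomalous positions) merged before a shared trunk. The sketch below shows one plausible wiring in the Keras functional API; the input dimensions, the concatenation merge, the trunk width, and the single regression head are all assumptions not given in the table.

```python
# Hedged sketch of a "forked" multi-input network in the Keras
# functional API. Branch wiring and output head are assumptions.
import tensorflow as tf

def build_forked_model(branch_sizes, input_dims, depth=2, dropout_rate=0.2):
    inputs, branches = [], []
    for name, units in branch_sizes.items():
        # One input and one dense "fork" per feature block.
        inp = tf.keras.Input(shape=(input_dims[name],), name=name)
        inputs.append(inp)
        branches.append(tf.keras.layers.Dense(units, activation="relu")(inp))
    x = tf.keras.layers.Concatenate()(branches)   # merge the forks
    for _ in range(depth):                        # shared trunk
        x = tf.keras.layers.Dense(128, activation="relu")(x)
        x = tf.keras.layers.Dropout(dropout_rate)(x)
    out = tf.keras.layers.Dense(1)(x)             # regression output (assumed)
    return tf.keras.Model(inputs=inputs, outputs=out)

# Usage with placeholder branch sizes drawn from the Table 2 ranges;
# the per-block input dimensions are not given in the table:
# model = build_forked_model(
#     branch_sizes={"experimental": 25, "cargo": 100,
#                   "sequence_anomalies": 50, "whole_peptide": 150,
#                   "sequence_encoding": 500, "genomics": 400,
#                   "anomalous_position": 25},
#     input_dims={...},
# )
```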