Table 3 Comparisons of different methods on the eSOL test dataset

From: Structure-aware protein solubility prediction from sequence through graph convolutional network and predicted contact map

| Models | RMSE | \({\text{R}}^{2}\) | Accuracy | Precision | Recall | F1 | AUC |
|---|---|---|---|---|---|---|---|
| K-nearest neighbor | 0.284 | 0.214 | 0.691 | 0.737 | 0.486 | 0.586 | 0.776 |
| Linear regression | 0.280 | 0.240 | 0.707 | 0.685 | 0.642 | 0.663 | 0.777 |
| Random forest | 0.255 | 0.370 | 0.760 | 0.750 | 0.690 | 0.729 | 0.825 |
| Protein-Sol | 0.253 | 0.376 | 0.714 | 0.689 | 0.688 | 0.693 | 0.808 |
| XGboost | 0.252 | 0.385 | 0.756 | 0.748 | 0.690 | 0.718 | 0.829 |
| Support vector machine | 0.246 | 0.411 | 0.761 | 0.763 | 0.684 | 0.721 | 0.842 |
| DeepSol | 0.241 | 0.434 | 0.763 | 0.771 | 0.738 | 0.695 | 0.845 |
| ProGAN | 0.237 | 0.442 | 0.763 | 0.770 | 0.676 | 0.720 | 0.853 |
| SeqVec | 0.236 | 0.458 | 0.767 | 0.754 | 0.715 | 0.734 | 0.858 |
| TAPE | 0.235 | 0.461 | 0.764 | 0.774 | 0.710 | 0.730 | 0.856 |
| LSTM (All node features) | 0.236 | 0.458 | 0.765 | 0.748 | 0.677 | 0.730 | 0.855 |
| GraphSol (No contact) | 0.235 | 0.462 | 0.763 | 0.710 | 0.676 | 0.729 | 0.853 |
| GraphSol | *0.231* | *0.483* | *0.779* | *0.775* | *0.693* | *0.732* | *0.866* |
| GraphSol (Ensemble) | ***0.227*** | ***0.501*** | ***0.782*** | ***0.790*** | ***0.702*** | ***0.743*** | ***0.873*** |

  1. Italic values indicate the performance of our proposed model
  2. Bold italic values indicate the performance of our ensemble model, which uses the models from all folds to make the final prediction
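
For readers who want to compute the same kinds of metrics on their own predictions, the following is a minimal sketch and not the authors' evaluation code. It assumes that solubility is a continuous value in [0, 1], that the classification metrics come from binarizing both the true and the predicted values at a threshold of 0.5, and that \({\text{R}}^{2}\) is the coefficient of determination. None of these choices are stated in the table itself, and all function names are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, R^2); R^2 is taken as the coefficient of determination (assumption)."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return math.sqrt(ss_res / n), 1.0 - ss_res / ss_tot


def classification_metrics(y_true, y_pred, threshold=0.5):
    """Binarize both inputs at `threshold` (an assumed cutoff) and score the result."""
    labels = [t >= threshold for t in y_true]
    preds = [p >= threshold for p in y_pred]
    tp = sum(l and p for l, p in zip(labels, preds))
    fp = sum((not l) and p for l, p in zip(labels, preds))
    fn = sum(l and (not p) for l, p in zip(labels, preds))
    tn = sum((not l) and (not p) for l, p in zip(labels, preds))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(labels),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def auc(y_true, y_pred, threshold=0.5):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [p for t, p in zip(y_true, y_pred) if t >= threshold]
    neg = [p for t, p in zip(y_true, y_pred) if t < threshold]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy values for illustration only; they are not taken from the eSOL dataset.
y_true = [0.1, 0.4, 0.6, 0.9]
y_pred = [0.2, 0.6, 0.5, 0.8]
rmse, r2 = regression_metrics(y_true, y_pred)
scores = classification_metrics(y_true, y_pred)
area = auc(y_true, y_pred)
```

The pairwise AUC is quadratic in the number of samples, which is fine for a sketch but would normally be replaced by a rank-based implementation on a full test set.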