Table 5 Summary statistics for the five best-performing DNN models

From: Open-source QSAR models for pKa prediction using multiple machine learning approaches

| Data option | Dataset | Feature sets | Number of features | Train R² | Train RMSE | Fivefold CV Q² | Fivefold CV RMSE | Test R² | Test RMSE |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Acidic | Continuous + MACCS | 1408 | 0.98 | 0.43 | 0.75 | 1.71 | 0.80 | 1.51 |
| 2 | Acidic | Continuous + MACCS | 1408 | 0.98 | 0.52 | 0.74 | 1.73 | 0.79 | 1.54 |
| 2 | Acidic | Fingerprints | 1190 | 0.98 | 0.48 | 0.71 | 1.82 | 0.79 | 1.55 |
| 1 | Acidic | Fingerprints | 1190 | 0.99 | 0.39 | 0.71 | 1.81 | 0.78 | 1.59 |
| 2 | Acidic | MACCS | 166 | 0.96 | 0.64 | 0.71 | 1.82 | 0.77 | 1.61 |
| 2 | Basic | Fingerprints | 1190 | 0.98 | 0.48 | 0.75 | 1.63 | 0.77 | 1.57 |
| 1 | Basic | MACCS | 166 | 0.97 | 0.53 | 0.74 | 1.69 | 0.77 | 1.59 |
| 1 | Basic | Continuous + MACCS | 1481 | 0.98 | 0.45 | 0.75 | 1.64 | 0.76 | 1.59 |
| 2 | Basic | Continuous + MACCS | 1481 | 0.97 | 0.56 | 0.73 | 1.71 | 0.76 | 1.60 |
| 2 | Basic | MACCS | 166 | 0.97 | 0.58 | 0.75 | 1.65 | 0.74 | 1.65 |
| 1 | Combined | Continuous + MACCS | 1408 | 0.97 | 0.52 | 0.65 | 1.90 | 0.75 | 1.61 |
| 1 | Combined | Fingerprints | 1190 | 0.97 | 0.55 | 0.62 | 1.98 | 0.73 | 1.68 |
| 2 | Combined | Continuous + MACCS | 1408 | 0.97 | 0.55 | 0.67 | 1.84 | 0.72 | 1.69 |
| 1 | Combined | MACCS | 166 | 0.97 | 0.57 | 0.62 | 1.99 | 0.72 | 1.70 |
| 2 | Combined | MACCS | 166 | 0.97 | 0.52 | 0.63 | 1.94 | 0.70 | 1.76 |

  1. Statistics are presented for the acidic-only, basic-only and combined (acidic and basic) data sets. Within each data set, models are ordered by test set RMSE, with the best-performing (lowest-RMSE) models listed first.
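
The ordering described in the note can be checked with a short Python sketch. It is only an illustration, not part of the published work: the tuple layout and the names `TABLE5` and `rank_by_test_rmse` are assumptions made here, and only the data option, dataset, feature set and test set RMSE columns are copied from Table 5.

```python
# Rows copied from Table 5: (data option, dataset, feature set, test set RMSE).
# The tuple layout and all names below are illustrative assumptions.
TABLE5 = [
    (1, "Acidic", "Continuous + MACCS", 1.51),
    (2, "Acidic", "Continuous + MACCS", 1.54),
    (2, "Acidic", "Fingerprints", 1.55),
    (1, "Acidic", "Fingerprints", 1.59),
    (2, "Acidic", "MACCS", 1.61),
    (2, "Basic", "Fingerprints", 1.57),
    (1, "Basic", "MACCS", 1.59),
    (1, "Basic", "Continuous + MACCS", 1.59),
    (2, "Basic", "Continuous + MACCS", 1.60),
    (2, "Basic", "MACCS", 1.65),
    (1, "Combined", "Continuous + MACCS", 1.61),
    (1, "Combined", "Fingerprints", 1.68),
    (2, "Combined", "Continuous + MACCS", 1.69),
    (1, "Combined", "MACCS", 1.70),
    (2, "Combined", "MACCS", 1.76),
]


def rank_by_test_rmse(rows):
    """Group rows by dataset and sort each group by ascending test RMSE."""
    ranked = {}
    for option, dataset, features, rmse in rows:
        ranked.setdefault(dataset, []).append((option, features, rmse))
    for models in ranked.values():
        # list.sort is stable, so tied RMSE values keep their table order.
        models.sort(key=lambda model: model[2])
    return ranked


ranked = rank_by_test_rmse(TABLE5)
for dataset, models in ranked.items():
    option, features, rmse = models[0]
    print(f"{dataset}: data option {option}, {features}, test RMSE {rmse:.2f}")
```

Running the sketch prints the first-listed model for each data set, which by the ordering rule in the note is the one with the lowest test set RMSE.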