Table 3 Coefficient of determination, r², calculated for regression sets (higher values are better)

From: Transformer-CNN: Swiss knife for QSAR modeling and interpretation

| Dataset | Descriptor based methods² | SMILES based (augm = 10)ᵃ | Transformer-CNN, no augm | Transformer-CNN, augm = 10 | CDDD descriptorsᵇ |
|---|---|---|---|---|---|
| MP | 0.83 | 0.85 | 0.83 | 0.86 | 0.85 |
| BP | 0.98 | 0.98 | 0.97 | 0.98 | 0.98 |
| BCF | 0.85 | 0.85 | 0.71 ± 0.02 | 0.85 | 0.81 |
| FreeSolv | 0.94 | 0.93 | 0.72 ± 0.02 | 0.91 | 0.93 |
| LogS | 0.92 | 0.92 | 0.85 | 0.91 | 0.91 |
| Lipo | 0.7 | 0.72 | 0.6 | 0.73 | 0.74 |
| BACE | 0.73 | 0.72 | 0.66 | 0.76 | 0.75 |
| DHFR | 0.62 ± 0.03 | 0.63 ± 0.03 | 0.46 ± 0.03 | 0.67 ± 0.03 | 0.61 ± 0.03 |
| LEL | 0.19 ± 0.04 | 0.25 ± 0.03 | 0.2 ± 0.03 | 0.27 ± 0.04 | 0.23 ± 0.04 |

  1. We omitted the standard mean errors, which are 0.01 or less, for the reported values.
  2. ᵃ Results from our previous study [22].
  3. ᵇ Best performance calculated with CDDD descriptors obtained using the autoencoder Sml2canSml from [27].
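
For readers who want to reproduce a score of this kind, the following is a minimal sketch of one common definition of the coefficient of determination, r² = 1 − SS_res / SS_tot. It is an assumption that this matches the calculation behind the table; some QSAR tools report the squared Pearson correlation under the same name. The function name `r_squared` and the sample values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: squared error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared deviation of observations from their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("r^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = r_squared(observed, predicted)
print(f"r2 = {score:.2f}")
```

Under this definition a model that always predicts the mean of the observed values scores 0 and a perfect model scores 1, which is why higher values are better.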