Table 4 AUC calculated for classification sets (higher values are better)

From: Transformer-CNN: Swiss knife for QSAR modeling and interpretation

| Dataset | Descriptor based methodsᵃ | SMILES based (augm = 10)² | Transformer-CNN, no augm | Transformer-CNN, augm = 10 | CDDD descriptorsᵇ |
|---|---|---|---|---|---|
| HIV | 0.82 | 0.78 | 0.81 | 0.83 | 0.74 |
| AMES | 0.86 | 0.88 | 0.86 | 0.89 | 0.86 |
| BACE | 0.88 | 0.89 | 0.89 | 0.91 | 0.9 |
| Clintox | 0.77 ± 0.03 | 0.76 ± 0.03 | 0.71 ± 0.02 | 0.77 ± 0.02 | 0.73 ± 0.02 |
| Tox21 | 0.79 | 0.83 | 0.81 | 0.82 | 0.82 |
| BBBP | 0.90 | 0.91 | 0.9 | 0.92 | 0.89 |
| JAK3 | 0.79 ± 0.02 | 0.8 ± 0.02 | 0.70 ± 0.02 | 0.78 ± 0.02 | 0.76 ± 0.02 |
| BioDeg | 0.92 | 0.93 | 0.91 | 0.93 | 0.92 |
| RP AR | 0.85 | 0.87 | 0.83 | 0.87 | 0.86 |

  1. Standard errors of the mean that are 0.01 or less are omitted from the reported values
  2. ᵃ Results from our previous study [22]. ᵇ Best performance calculated with CDDD descriptors obtained using the Sml2canSml autoencoder from [27]
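
For readers unfamiliar with the metric in this table, the following is a minimal sketch of how an AUC value can be computed from binary labels and predicted scores. It uses the pairwise-ranking definition (the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as one half). The function name `roc_auc` and the toy data are assumptions made for illustration; this is not the evaluation code used to produce the values above.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positives against negatives.

    labels: iterable of 0/1 class labels (1 = active/positive).
    scores: iterable of predicted scores, higher means more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration (not taken from any dataset in the table).
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.8]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```

The double loop is quadratic in the number of examples, which is fine for a sketch; for large sets a rank-based formulation or a library routine would be the more practical choice.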