Table 3. Coefficient of determination, r², calculated for regression sets (higher values are better)

From: Transformer-CNN: Swiss knife for QSAR modeling and interpretation

| Dataset | Descriptor based methods2 | SMILES based (augm = 10)^a | Transformer-CNN, no augm | Transformer-CNN, augm = 10 | CDDD descriptors^b |
|---|---|---|---|---|---|
| MP | 0.83 | 0.85 | 0.83 | 0.86 | 0.85 |
| BP | 0.98 | 0.98 | 0.97 | 0.98 | 0.98 |
| BCF | 0.85 | 0.85 | 0.71 ± 0.02 | 0.85 | 0.81 |
| FreeSolv | 0.94 | 0.93 | 0.72 ± 0.02 | 0.91 | 0.93 |
| LogS | 0.92 | 0.92 | 0.85 | 0.91 | 0.91 |
| Lipo | 0.7 | 0.72 | 0.6 | 0.73 | 0.74 |
| BACE | 0.73 | 0.72 | 0.66 | 0.76 | 0.75 |
| DHFR | 0.62 ± 0.03 | 0.63 ± 0.03 | 0.46 ± 0.03 | 0.67 ± 0.03 | 0.61 ± 0.03 |
| LEL | 0.19 ± 0.04 | 0.25 ± 0.03 | 0.2 ± 0.03 | 0.27 ± 0.04 | 0.23 ± 0.04 |
  1. Standard errors of the mean of 0.01 or less are omitted from the reported values.
  2. ^a Results from our previous study [22]. ^b Best performance calculated with CDDD descriptors obtained using the autoencoder Sml2canSml from [27].
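
For readers who want to reproduce a score of this kind, the following is a minimal sketch of a coefficient of determination. It assumes the common definition r² = 1 − SS_res / SS_tot; the evaluation code behind the table is not shown here, so it may differ in detail. The function name `r_squared` and the sample values are hypothetical and are not taken from the datasets above.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination, assuming r2 = 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: variance left unexplained by the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: variance of the observed values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("observed values are constant; r2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted property values (illustration only).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(observed, predicted), 2))
```

Under this definition a perfect model scores 1, and a model that always predicts the mean of the observed values scores 0, which is why higher values in the table are better.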