Table 9 AUROC, accuracy, F1, MCC, precision and recall scores of the TransformerCNN models transfer-learned on Ames data

From: Using test-time augmentation to investigate explainable AI: inconsistencies between method, model and human intuition

| Architecture | Training | AUROC \(\uparrow\) | Accuracy \(\uparrow\) | F1 \(\uparrow\) | MCC \(\uparrow\) | Precision \(\uparrow\) | Recall \(\uparrow\) |
|---|---|---|---|---|---|---|---|
| No training | Untrained | 0.564 | 0.507 | 0.668 | 0.055 | 0.991 | 0.504 |
| | Native | 0.698 | 0.653 | 0.665 | 0.308 | 0.689 | 0.643 |
| Encoder only | C2C | 0.687 | 0.639 | 0.638 | 0.279 | 0.635 | 0.641 |
| | R2C | 0.706 | 0.643 | 0.652 | 0.287 | 0.668 | 0.636 |
| | E2C | 0.685 | 0.644 | 0.656 | 0.288 | 0.679 | 0.634 |
| | MC2C | 0.711 | 0.656 | 0.658 | 0.311 | 0.663 | 0.653 |
| | MR2C | 0.693 | 0.641 | 0.647 | 0.283 | 0.656 | 0.637 |
| | ME2C | 0.738 | 0.676 | 0.681 | 0.352 | 0.692 | 0.670 |
| Encoder-decoder | C2C | 0.711 | 0.663 | 0.662 | 0.327 | 0.658 | 0.665 |
| | R2C | 0.680 | 0.625 | 0.623 | 0.249 | 0.621 | 0.626 |
| | E2C | 0.719 | 0.650 | 0.650 | 0.301 | 0.649 | 0.651 |
| | MC2C | 0.715 | 0.643 | 0.647 | 0.287 | 0.655 | 0.640 |
| | MR2C | 0.652 | 0.583 | 0.583 | 0.167 | 0.582 | 0.583 |
| | ME2C | 0.709 | 0.650 | 0.657 | 0.300 | 0.671 | 0.644 |
  1. Values are based on the scaffold split
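
As a reading aid, the F1 column can be cross-checked against the Precision and Recall columns, since F1 is by definition the harmonic mean of the two. The sketch below is not taken from the article: the function name `f1_from_precision_recall` and the choice of rows are illustrative assumptions, and the numbers are copied from three rows of Table 9. Because the reported values are rounded to three decimals, the recomputed F1 can only be expected to agree to within about 0.001.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both precision and recall are 0
    return 2 * precision * recall / (precision + recall)


# (row label, precision, recall, reported F1), copied from Table 9.
# The row selection is arbitrary and only for illustration.
rows = [
    ("No training, Untrained", 0.991, 0.504, 0.668),
    ("No training, Native", 0.689, 0.643, 0.665),
    ("Encoder only, ME2C", 0.692, 0.670, 0.681),
]

for label, precision, recall, reported_f1 in rows:
    recomputed = f1_from_precision_recall(precision, recall)
    # Inputs are rounded to three decimals, so compare with a small tolerance.
    agrees = abs(recomputed - reported_f1) < 1e-3
    print(f"{label}: recomputed F1 = {recomputed:.3f}, "
          f"reported = {reported_f1:.3f}, agrees = {agrees}")
```

The same check makes the Untrained row easier to interpret: its F1 is comparable to the trained models only because a very high precision offsets a recall near 0.5, while its MCC and accuracy stay close to chance level.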