
Table 1 Model training details

From: Using test-time augmentation to investigate explainable AI: inconsistencies between method, model and human intuition

 

| Parameter | Training: pre-training | Training: transfer learning |
|---|---|---|
| Batch size | 128 | 128 |
| Learning rate | \(10^{-4}\) | \(5\times 10^{-5}\) |
| Weight decay | 0.01 | 0.01 |
| Dropout | 0.1 | 0.3 |
| Initialization | Xavier | Xavier |
| Optimizer | AdamW | AdamW |
| Scheduler | None | None |
| Frozen encoder | No | Yes |
| Max sequence length | 175 | 175 |
| Token space | 68 | 68 |
| Embedding dimension | 512 | 512 |
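
The settings in Table 1 can be collected in a small configuration object, which makes it easy to see that the two training stages differ only in learning rate, dropout, and whether the encoder is frozen. The following is a minimal sketch, not the authors' code: the class name `TrainingConfig`, its field names, the constants `PRETRAINING` and `TRANSFER_LEARNING`, and the helper `changed_fields` are all hypothetical and chosen only for illustration. The values are taken from the table.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters from Table 1; field names are illustrative assumptions."""
    batch_size: int = 128
    learning_rate: float = 1e-4
    weight_decay: float = 0.01
    dropout: float = 0.1
    initialization: str = "xavier"
    optimizer: str = "adamw"
    scheduler: Optional[str] = None   # Table 1 lists no scheduler
    freeze_encoder: bool = False
    max_sequence_length: int = 175
    token_space: int = 68
    embedding_dim: int = 512


# Pre-training column of Table 1 (the defaults above).
PRETRAINING = TrainingConfig()

# Transfer-learning column: only three entries differ from pre-training.
TRANSFER_LEARNING = replace(
    PRETRAINING,
    learning_rate=5e-5,
    dropout=0.3,
    freeze_encoder=True,
)


def changed_fields(a: TrainingConfig, b: TrainingConfig) -> dict:
    """Return {field name: (value in a, value in b)} for fields that differ."""
    return {
        f.name: (getattr(a, f.name), getattr(b, f.name))
        for f in fields(a)
        if getattr(a, f.name) != getattr(b, f.name)
    }


if __name__ == "__main__":
    for name, (before, after) in changed_fields(PRETRAINING, TRANSFER_LEARNING).items():
        print(f"{name}: {before} -> {after}")
```

Deriving the transfer-learning configuration from the pre-training one with `dataclasses.replace` keeps the shared values in a single place, so the two columns cannot drift apart by accident.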