Table 4 Comparison of the encoder–decoder- and transformer-based approaches on the 1 million image test dataset

From: DECIMER 1.0: deep learning for chemical image recognition using transformers

| Metrics | Encoder–decoder, InceptionV3 | Encoder–decoder, EfficientNet-B3 | Transformer, InceptionV3 | Transformer, EfficientNet-B3 |
|---|---|---|---|---|
| Average training time per epoch | 7 min 34 s | 8 min 57 s | 8 min 33 s | 9 min 27 s |
| Tanimoto | 0.5459 | 0.6345 | 0.8764 | 0.9371 |
| Tanimoto 1.0 | 1.41% | 7.03% | 55.29% | 74.57% |
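
The Tanimoto and Tanimoto 1.0 metrics in Table 4 can be read as, respectively, the mean Tanimoto similarity between predicted and reference structures and the share of predictions whose similarity is exactly 1.0. The following is a minimal sketch of that aggregation, not the authors' evaluation code. It assumes each molecule has already been reduced to a set of "on" fingerprint bit indices; the function names `tanimoto` and `summarize` and the toy inputs are hypothetical, and treating two empty fingerprints as identical is a convention chosen here for illustration.

```python
from typing import Iterable, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        # Assumed convention: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / len(a | b)


def summarize(pairs: Iterable[Tuple[Set[int], Set[int]]]) -> Tuple[float, float]:
    """Return (mean Tanimoto, fraction of pairs with Tanimoto exactly 1.0)."""
    scores = [tanimoto(pred, ref) for pred, ref in pairs]
    if not scores:
        raise ValueError("no prediction/reference pairs given")
    mean_score = sum(scores) / len(scores)
    exact_fraction = sum(1 for s in scores if s == 1.0) / len(scores)
    return mean_score, exact_fraction


if __name__ == "__main__":
    # Toy (predicted, reference) fingerprints, not data from the paper.
    toy_pairs = [
        ({1, 2, 3}, {1, 2, 3}),  # identical fingerprints
        ({1, 2}, {2, 3}),        # one shared bit out of three
        ({1}, {2}),              # no shared bits
    ]
    mean_score, exact_fraction = summarize(toy_pairs)
    print(f"Tanimoto: {mean_score:.4f}")
    print(f"Tanimoto 1.0: {exact_fraction:.2%}")
```

A Tanimoto similarity of 1.0 between fingerprints does not by itself guarantee that two structures are identical, so the second number is best read as an upper bound on exact matches under this sketch.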