Table 15 Results on dataset 3 and dataset 2 + 3

From: DECIMER 1.0: deep learning for chemical image recognition using transformers

| Metrics | Augmented dataset (3): non-augmented test set | Augmented dataset (3): augmented test set | Pre-trained model + augmented dataset (2 + 3): non-augmented test set | Pre-trained model + augmented dataset (2 + 3): augmented test set |
| --- | --- | --- | --- | --- |
| Train data size | 33,304,320 | 33,304,320 | 33,304,320 | 33,304,320 |
| Test data size | 2,000,000 | 2,000,000 | 2,000,000 | 2,000,000 |
| Tanimoto | 0.9663 | 0.9501 | 0.9708 | 0.9521 |
| Tanimoto 1.0 | 86.43% | 80.26% | 88.04% | 80.87% |
| Isomorphic predictions | 97.89% | 97.46% | 98.15% | 97.61% |
| Non-isomorphic predictions | 2.11% | 2.54% | 1.85% | 2.39% |
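
As a rough illustration of how the "Tanimoto" and "Tanimoto 1.0" rows could be derived from per-molecule scores, the sketch below computes the Tanimoto coefficient between two fingerprints represented as sets of "on" bit positions, then reports the mean score and the fraction of exact (1.0) matches. The table does not state which fingerprint was used, so the set-of-bits representation, the function names `tanimoto` and `summarize`, and the empty-fingerprint convention are assumptions made for this example only, not the authors' implementation.

```python
from typing import Iterable, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient between two sets of 'on' fingerprint bits."""
    if not a and not b:
        # Assumed convention for this sketch: two empty fingerprints are identical.
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def summarize(pairs: Iterable[Tuple[Set[int], Set[int]]]) -> Tuple[float, float]:
    """Return (mean Tanimoto, fraction of pairs scoring exactly 1.0)."""
    scores = [tanimoto(predicted, reference) for predicted, reference in pairs]
    if not scores:
        raise ValueError("at least one (predicted, reference) pair is required")
    mean_score = sum(scores) / len(scores)
    exact_fraction = sum(1 for s in scores if s == 1.0) / len(scores)
    return mean_score, exact_fraction


# Hypothetical toy data: one perfect prediction and one partial match.
example_pairs = [
    ({1, 2, 3}, {1, 2, 3}),
    ({1, 2}, {2, 3}),
]
mean_score, exact_fraction = summarize(example_pairs)
```

Under this reading, a "Tanimoto" value of 0.9663 is the average score over the test set, while "Tanimoto 1.0" of 86.43% is the share of predictions whose fingerprint matches the reference exactly.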