Table 15 Results on dataset 3 and dataset 2 + 3

From: DECIMER 1.0: deep learning for chemical image recognition using transformers

| Metric | Augmented dataset (3): non-augmented test set | Augmented dataset (3): augmented test set | Pre-trained model + augmented dataset (2 + 3): non-augmented test set | Pre-trained model + augmented dataset (2 + 3): augmented test set |
| --- | --- | --- | --- | --- |
| Train data size | 33,304,320 | 33,304,320 | 33,304,320 | 33,304,320 |
| Test data size | 2,000,000 | 2,000,000 | 2,000,000 | 2,000,000 |
| Tanimoto | 0.9663 | 0.9501 | 0.9708 | 0.9521 |
| Tanimoto 1.0 | 86.43% | 80.26% | 88.04% | 80.87% |
| Isomorphic predictions | 97.89% | 97.46% | 98.15% | 97.61% |
| Non-isomorphic predictions | 2.11% | 2.54% | 1.85% | 2.39% |
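
The table itself does not define its similarity metrics. As a rough illustration only, the sketch below assumes that the "Tanimoto" row is the mean Tanimoto coefficient between fingerprints of predicted and reference structures, and that "Tanimoto 1.0" is the share of predictions scoring exactly 1.0. The function names and the toy bit sets are hypothetical; they do not reproduce the fingerprinting pipeline used in the paper.

```python
def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto (Jaccard) coefficient of two sets of 'on' fingerprint bits."""
    union = len(a | b)
    if union == 0:
        # Assumed convention: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / union


def summarize(pairs: list[tuple[set[int], set[int]]]) -> tuple[float, float]:
    """Return (mean Tanimoto, fraction of pairs with Tanimoto exactly 1.0)."""
    scores = [tanimoto(predicted, reference) for predicted, reference in pairs]
    average = sum(scores) / len(scores)
    exact = sum(1 for s in scores if s == 1.0) / len(scores)
    return average, exact


# Toy (predicted, reference) fingerprints: hypothetical bit positions,
# not derived from real molecules.
pairs = [
    ({1, 2, 3, 4}, {1, 2, 3, 4}),  # identical bit sets
    ({1, 2, 3}, {1, 2, 3, 4}),     # 3 shared bits out of 4 in the union
    ({1, 2}, {3, 4}),              # no shared bits
]
average, exact = summarize(pairs)
print(f"mean Tanimoto: {average:.4f}, Tanimoto 1.0: {exact:.2%}")
```

Under these assumptions, a Tanimoto of 1.0 only means the two fingerprints match bit for bit, which is a weaker condition than the structures being isomorphic. The table reports the two as separate rows.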