From: DECIMER 1.0: deep learning for chemical image recognition using transformers
Metrics | Augmented dataset (3): non-augmented test set | Augmented dataset (3): augmented test set | Pre-trained model + augmented dataset (2 + 3): non-augmented test set | Pre-trained model + augmented dataset (2 + 3): augmented test set |
---|---|---|---|---|
Train data size | 33,304,320 | 33,304,320 | 33,304,320 | 33,304,320 |
Test data size | 2,000,000 | 2,000,000 | 2,000,000 | 2,000,000 |
Tanimoto | 0.9663 | 0.9501 | 0.9708 | 0.9521 |
Tanimoto 1.0 | 86.43% | 80.26% | 88.04% | 80.87% |
Isomorphic predictions | 97.89% | 97.46% | 98.15% | 97.61% |
Non-isomorphic predictions | 2.11% | 2.54% | 1.85% | 2.39% |
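
The "Tanimoto" and "Tanimoto 1.0" rows can be read as the mean Tanimoto similarity between predicted and reference structures and the share of predictions whose similarity is exactly 1.0. The following is a minimal sketch of how such summary values could be computed, assuming each molecule has already been converted to a fingerprint represented as a set of on-bit indices. The fingerprint type, the helper names `tanimoto` and `summarize`, and the toy data are illustrative assumptions and are not taken from the paper.

```python
from statistics import mean


def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        # Convention assumed here: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / len(a | b)


def summarize(pairs):
    """Return (mean similarity, fraction of pairs with similarity exactly 1.0)."""
    scores = [tanimoto(predicted, reference) for predicted, reference in pairs]
    exact_fraction = sum(score == 1.0 for score in scores) / len(scores)
    return mean(scores), exact_fraction


# Hypothetical toy fingerprints (made-up data, not from the paper).
pairs = [
    ({1, 2, 3}, {1, 2, 3}),     # identical fingerprints
    ({1, 2, 3, 4}, {1, 2}),     # 2 shared bits out of 4 in the union
    ({5, 6}, {6, 7}),           # 1 shared bit out of 3 in the union
]

average, exact = summarize(pairs)
print(f"Average Tanimoto: {average:.4f}")
print(f"Tanimoto 1.0: {exact:.2%}")
```

Note that a Tanimoto similarity of 1.0 only means the fingerprints match; it does not by itself establish that the two structures are isomorphic, which is why the table reports those percentages separately.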