From: DECIMER 1.0: deep learning for chemical image recognition using transformers
| Metrics | Augmented dataset (3): non-augmented test set | Augmented dataset (3): augmented test set | Pre-trained model + augmented dataset (2 + 3): non-augmented test set | Pre-trained model + augmented dataset (2 + 3): augmented test set |
|---|---|---|---|---|
| Train data size | 33,304,320 | 33,304,320 | 33,304,320 | 33,304,320 |
| Test data size | 2,000,000 | 2,000,000 | 2,000,000 | 2,000,000 |
| Tanimoto | 0.9663 | 0.9501 | 0.9708 | 0.9521 |
| Tanimoto 1.0 | 86.43% | 80.26% | 88.04% | 80.87% |
| Isomorphic predictions | 97.89% | 97.46% | 98.15% | 97.61% |
| Non-isomorphic predictions | 2.11% | 2.54% | 1.85% | 2.39% |
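
The two Tanimoto rows can be illustrated with a minimal sketch. It assumes that each structure is represented as a set of on-bit fingerprint indices, that "Tanimoto" is the mean similarity over all predictions, and that "Tanimoto 1.0" is the percentage of predictions whose similarity is exactly 1.0. The function names and the toy fingerprints are hypothetical and are not taken from the paper; only the coefficient itself (shared bits divided by the union of bits) is the standard definition.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient between two sets of on-bit indices."""
    if not a and not b:
        # Convention assumed here: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / len(a | b)


def summarize(pairs):
    """Return (mean Tanimoto, percentage of pairs with Tanimoto exactly 1.0)."""
    scores = [tanimoto(pred, ref) for pred, ref in pairs]
    mean = sum(scores) / len(scores)
    perfect_pct = 100.0 * sum(s == 1.0 for s in scores) / len(scores)
    return mean, perfect_pct


# Toy, made-up fingerprints (predicted, reference); not data from the paper.
pairs = [
    ({1, 2, 3}, {1, 2, 3}),  # identical bit sets
    ({1, 2, 3}, {1, 2, 4}),  # 2 shared bits out of 4 distinct bits
]
mean_score, perfect_pct = summarize(pairs)
print(f"mean Tanimoto = {mean_score:.4f}, Tanimoto 1.0 = {perfect_pct:.2f}%")
```

In practice the bit sets would come from a cheminformatics toolkit's fingerprint of the predicted and reference structures. The sketch covers only the aggregation step that turns per-prediction similarities into the two summary figures.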