Table 9 Results of isomorphism calculations for the subsets of dataset 1

From: DECIMER 1.0: deep learning for chemical image recognition using transformers

| Metrics | Subset 1 | Subset 2 | Subset 3 | Subset 4 |
| --- | --- | --- | --- | --- |
| Train data size | 921,600 | 10,240,000 | 15,360,000 | 35,002,240 |
| Test data size | 102,400 | 1,024,000 | 1,536,000 | 3,929,093 |
| Predictions with Tanimoto 1.0 | 74,176 | 899,941 | 1,398,028 | 3,790,273 |
| Isomorphic predictions | 98.63% | 99.45% | 99.59% | 99.75% |
| Non-isomorphic predictions | 1.37% | 0.55% | 0.41% | 0.25% |
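
The table reports the Tanimoto 1.0 predictions as raw counts and the isomorphism results as percentages. The following minimal Python sketch derives two figures from those values: the share of each test set predicted with Tanimoto 1.0, and an approximate count of isomorphic predictions. Neither derived figure appears in the table. The variable and function names are illustrative, and the approximate counts assume that the isomorphism percentages are taken relative to the Tanimoto 1.0 predictions.

```python
# Values copied from Table 9 (Subsets 1-4). Names are illustrative only.
test_size = [102_400, 1_024_000, 1_536_000, 3_929_093]
tanimoto_1 = [74_176, 899_941, 1_398_028, 3_790_273]
isomorphic_pct = [98.63, 99.45, 99.59, 99.75]
non_isomorphic_pct = [1.37, 0.55, 0.41, 0.25]


def tanimoto_share(hits: int, total: int) -> float:
    """Percentage of test images whose prediction has Tanimoto 1.0."""
    return 100.0 * hits / total


# Derived value, not reported in the table.
shares = [tanimoto_share(h, t) for h, t in zip(tanimoto_1, test_size)]

# ASSUMPTION: the isomorphism percentages are relative to the
# Tanimoto 1.0 predictions, so the counts below are approximations.
approx_isomorphic = [
    round(h * p / 100) for h, p in zip(tanimoto_1, isomorphic_pct)
]

for i, (share, count) in enumerate(zip(shares, approx_isomorphic), start=1):
    print(f"Subset {i}: {share:.2f}% Tanimoto 1.0, ~{count:,} isomorphic")
```

Read this way, the share of predictions with Tanimoto 1.0 grows with the training set size, in line with the rising percentage of isomorphic predictions in the table.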