Table 1 Dataset sizes used in this work with corresponding computing times

From: DECIMER: towards deep learning for chemical image recognition

| Dataset index | Train data size | Test data size | Avg. time per epoch (s) | Time for 25 epochs (s) |
|---|---|---|---|---|
| 1 | 54,000 | 6000 | 94.32 | 2358 |
| 2 | 90,000 | 10,000 | 159.88 | 3997 |
| 3 | 450,000 | 50,000 | 880.6 | 22,015 |
| 4 | 900,000 | 100,000 | 2831.8 | 70,795 |
| 5 | 1,800,000 | 200,000 | 7239.28 | 180,982 |
| 6 | 2,700,000 | 300,000 | 11,964.72 | 299,118 |
| 7 | 4,050,000 | 450,000 | 17,495.12 | 437,378 |
| 8 | 5,850,000 | 650,000 | 25,702 | 642,550 |
| 9 | 7,200,000 | 800,000 | 32,926.8 | 823,170 |
| 10 | 8,969,751 | 996,639 | 41,652.24 | 1,041,306 |
| 11 | 12,600,000 | 1,400,000 | 64,909.28 | 1,622,732 |
| 12 | 15,102,000 | 1,678,000 | 91,880.84 | 2,297,021 |
  1. Training the model on the 15 million structure dataset (dataset index 12) for 25 epochs takes approximately one month on a single Tesla V100 GPU.
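
The last column of the table is the average epoch time multiplied by 25, and the footnote's "approximately a month" follows from converting the final row to days. The sketch below reproduces that arithmetic. The names `total_training_seconds` and `seconds_to_days` are illustrative helpers, not part of the DECIMER code, and rounding to whole seconds is an assumption made to match the table's integer totals.

```python
# Average time per epoch in seconds, copied from Table 1 (dataset index -> value).
AVG_EPOCH_SECONDS = {
    1: 94.32, 2: 159.88, 3: 880.6, 4: 2831.8,
    5: 7239.28, 6: 11964.72, 7: 17495.12, 8: 25702.0,
    9: 32926.8, 10: 41652.24, 11: 64909.28, 12: 91880.84,
}

EPOCHS = 25               # number of epochs reported in the table
SECONDS_PER_DAY = 86_400  # 24 h * 60 min * 60 s


def total_training_seconds(avg_epoch_seconds, epochs=EPOCHS):
    """Total training time, rounded to whole seconds (assumed rounding)."""
    return round(avg_epoch_seconds * epochs)


def seconds_to_days(seconds):
    """Convert a duration in seconds to days."""
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # Print the total time in seconds and days for every dataset in the table.
    for index, avg in AVG_EPOCH_SECONDS.items():
        total = total_training_seconds(avg)
        days = seconds_to_days(total)
        print(f"dataset {index:>2}: {total:>9,d} s = {days:6.2f} days")
```

For dataset 12 this gives 2,297,021 s, which is between 26 and 27 days and so is consistent with the footnote.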