Table 1 Dataset sizes used in this work with corresponding computing times

From: DECIMER: towards deep learning for chemical image recognition

| Dataset index | Train data size | Test data size | Avg. time/epoch (s) | Time for 25 epochs (s) |
|---|---|---|---|---|
| 1 | 54,000 | 6,000 | 94.32 | 2,358 |
| 2 | 90,000 | 10,000 | 159.88 | 3,997 |
| 3 | 450,000 | 50,000 | 880.6 | 22,015 |
| 4 | 900,000 | 100,000 | 2,831.8 | 70,795 |
| 5 | 1,800,000 | 200,000 | 7,239.28 | 180,982 |
| 6 | 2,700,000 | 300,000 | 11,964.72 | 299,118 |
| 7 | 4,050,000 | 450,000 | 17,495.12 | 437,378 |
| 8 | 5,850,000 | 650,000 | 25,702 | 642,550 |
| 9 | 7,200,000 | 800,000 | 32,926.8 | 823,170 |
| 10 | 8,969,751 | 996,639 | 41,652.24 | 1,041,306 |
| 11 | 12,600,000 | 1,400,000 | 64,909.28 | 1,622,732 |
| 12 | 15,102,000 | 1,678,000 | 91,880.84 | 2,297,021 |

  1. Training the model on 15 million structures (dataset 12; 2,297,021 s for 25 epochs, about 26.6 days) takes approximately a month on a single Tesla V100 GPU.
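
The last column of the table is the average time per epoch multiplied by 25, and the footnote's "approximately a month" follows from converting that total into days. The minimal Python sketch below reproduces this arithmetic. The per-epoch values are copied from the table; the names `total_training_seconds` and `seconds_to_days` are illustrative helpers introduced here, not part of the DECIMER code.

```python
SECONDS_PER_DAY = 86_400
EPOCHS = 25  # number of epochs reported in the table

# Average time per epoch in seconds, keyed by dataset index (copied from Table 1).
avg_epoch_seconds = {
    1: 94.32,
    2: 159.88,
    3: 880.6,
    4: 2831.8,
    5: 7239.28,
    6: 11964.72,
    7: 17495.12,
    8: 25702.0,
    9: 32926.8,
    10: 41652.24,
    11: 64909.28,
    12: 91880.84,
}


def total_training_seconds(avg_seconds_per_epoch, epochs=EPOCHS):
    """Total training time, assuming every epoch takes the average time."""
    return avg_seconds_per_epoch * epochs


def seconds_to_days(seconds):
    """Convert a duration in seconds to days."""
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    for index, avg in avg_epoch_seconds.items():
        total = total_training_seconds(avg)
        print(f"dataset {index:>2}: {total:>12,.0f} s = {seconds_to_days(total):6.2f} days")
```

For dataset 12 this gives a total between 26 and 27 days, which matches the footnote's estimate of about a month on a single GPU.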