
Table 2 Datasets used for training, with the number of molecules per class and the overall dataset size. Each conformation is counted as a separate molecule.

From: COVER: conformational oversampling as data augmentation for molecules

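| Dataset | No. of conformations per inactive molecule | No. of conformations per active molecule | No. of molecules (inactive) | No. of molecules (active) | No. of molecules (overall) |
|---|---|---|---|---|---|
| 1-1 | 1 | 1 | 5502 | 341 | 5843 |
| 1-16 | 1 | 16 | 5502 | 5428 | 10,930 |
| 2-2 | 2 | 2 | 11,001 | 680 | 11,681 |
| 2-32 | 2 | 32 | 11,001 | 10,865 | 21,866 |
| 5-5 | 5 | 5 | 27,504 | 1698 | 29,202 |
| 5-80 | 5 | 80 | 27,504 | 27,145 | 54,649 |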
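
The effect of oversampling on class balance can be read off the table with a little arithmetic. The sketch below is illustrative only and is not part of the original paper: the names `TABLE_2`, `active_fraction`, and `summarize` are hypothetical. It copies the counts from Table 2, checks that each row's inactive and active counts add up to the stated overall size, and computes the share of active entries per dataset.

```python
# Counts copied from Table 2: (dataset, inactive, active, overall).
# Each conformation is counted as a separate molecule, as in the caption.
TABLE_2 = [
    ("1-1", 5502, 341, 5843),
    ("1-16", 5502, 5428, 10930),
    ("2-2", 11001, 680, 11681),
    ("2-32", 11001, 10865, 21866),
    ("5-5", 27504, 1698, 29202),
    ("5-80", 27504, 27145, 54649),
]


def active_fraction(inactive: int, active: int) -> float:
    """Return the share of active entries in a dataset."""
    return active / (inactive + active)


def summarize(rows):
    """Map each dataset name to its active fraction, rounded to 3 decimals.

    Also verifies that inactive + active equals the stated overall size.
    """
    summary = {}
    for name, inactive, active, overall in rows:
        if inactive + active != overall:
            raise ValueError(f"row {name} does not sum to its overall size")
        summary[name] = round(active_fraction(inactive, active), 3)
    return summary


if __name__ == "__main__":
    for name, fraction in summarize(TABLE_2).items():
        print(f"{name}: {fraction:.1%} active")
```

In the datasets with equal conformation counts (1-1, 2-2, 5-5) the active class makes up roughly 6% of the entries, while in the oversampled datasets (1-16, 2-32, 5-80) it is close to 50%.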