Table 1 Training and validation set sizes for the different benchmarks

From: Randomized SMILES strings improve the quality of molecular generative models

Model         Training set size   Validation set size
GDB-13 1M     1,000,000           10,000
GDB-13 10K    10,000              1,000
GDB-13 1K     1,000               1,000
ChEMBL        1,483,943           78,102
  1. Note that different training-to-validation ratios were used, depending on the expected size of the target chemical space and the total number of molecules available
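
The ratios mentioned in the note are not listed in the table itself, but they can be derived from the set sizes. The following is a minimal sketch that computes them from the values in Table 1; the dictionary layout and the helper name `train_to_validation_ratio` are illustrative assumptions, not part of the original work.

```python
# Set sizes copied from Table 1 as (training, validation) pairs.
set_sizes = {
    "GDB-13 1M": (1_000_000, 10_000),
    "GDB-13 10K": (10_000, 1_000),
    "GDB-13 1K": (1_000, 1_000),
    "ChEMBL": (1_483_943, 78_102),
}


def train_to_validation_ratio(training: int, validation: int) -> float:
    """Hypothetical helper: number of training molecules per validation molecule."""
    return training / validation


# Ratio for each benchmark, keyed by the model name used in the table.
ratios = {
    name: train_to_validation_ratio(training, validation)
    for name, (training, validation) in set_sizes.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}:1")
```

Under these figures the three GDB-13 benchmarks come out at roughly 100:1, 10:1 and 1:1, while ChEMBL is close to 19:1, which corresponds to a validation set of about 5% of the combined total.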