Table 2 Dataset

From: Transformer-based molecular optimization beyond matched molecular pairs

| Datasets | Training (2000-2017) | Validation (2018) | Test (2019-2020) |
| --- | --- | --- | --- |
| MMPs | 2,287,588 | 143,978 | 166,582 |
| Similarity (\(\ge\)0.5) | 6,543,684 | 418,180 | 475,070 |
| Similarity ([0.5,0.7)) | 4,543,472 | 286,682 | 327,606 |
| Similarity (\(\ge\)0.7) | 2,000,212 | 131,498 | 147,464 |
| Scaffold | 2,850,180 | 171,914 | 199,786 |
| Scaffold generic | 4,127,058 | 255,580 | 289,034 |
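
The interval notation suggests that the Similarity ([0.5,0.7)) and Similarity (\(\ge\)0.7) rows together make up the Similarity (\(\ge\)0.5) row. The following minimal Python sketch checks that reading against the counts in Table 2 and computes the share of each split. The variable names are illustrative only, and the partition is an assumption inferred from the row labels, not something the table states outright.

```python
# Pair counts copied from Table 2, ordered as (training, validation, test).
similarity_ge_05 = (6_543_684, 418_180, 475_070)
similarity_05_to_07 = (4_543_472, 286_682, 327_606)
similarity_ge_07 = (2_000_212, 131_498, 147_464)

# Assumption: the [0.5, 0.7) and >= 0.7 bins partition the >= 0.5 dataset,
# so their counts should add up to the >= 0.5 counts in every split.
bin_sums = tuple(a + b for a, b in zip(similarity_05_to_07, similarity_ge_07))
bins_match = bin_sums == similarity_ge_05

# Fraction of all >= 0.5 pairs that falls into each time-based split.
total_pairs = sum(similarity_ge_05)
split_fractions = tuple(n / total_pairs for n in similarity_ge_05)

print("bins add up:", bins_match)
print("train/val/test fractions:", [round(f, 3) for f in split_fractions])
```

With the counts above, the two bins do add up to the \(\ge\)0.5 row in all three splits, which supports reading them as disjoint subsets of that dataset.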