Table 3 Performance comparison of the GMT models trained with 20%, 50%, and 100% training samples

From: Probabilistic generative transformer language models for generative design of molecules

| Training samples | | 20% | 50% | 100% |
| --- | --- | --- | --- | --- |
| Valid | \(\uparrow\) | 1.0000 | 1.0000 | 1.0000 |
| Unique@1000 | \(\uparrow\) | 1.0000 | 1.0000 | 1.0000 |
| Unique@10000 | \(\uparrow\) | 1.0000 | 0.9998 | 1.0000 |
| FCD/Test | \(\downarrow\) | 4.3961 | 3.9164 | 3.7750 |
| SNN/Test | \(\uparrow\) | 0.4526 | 0.4573 | 0.4673 |
| Frag/Test | \(\uparrow\) | 0.9840 | 0.9850 | 0.9869 |
| Scaf/Test | \(\uparrow\) | 0.8225 | 0.8049 | 0.8431 |
| FCD/TestSF | \(\downarrow\) | 5.2401 | 4.7000 | 4.5698 |
| SNN/TestSF | \(\uparrow\) | 0.4362 | 0.4395 | 0.4485 |
| Frag/TestSF | \(\uparrow\) | 0.9792 | 0.9802 | 0.9831 |
| Scaf/TestSF | \(\uparrow\) | 0.1340 | 0.1461 | 0.1096 |
| IntDiv | \(\uparrow\) | 0.8707 | 0.8704 | 0.8701 |
| IntDiv2 | \(\uparrow\) | 0.8653 | 0.8650 | 0.8646 |
| Filters | \(\uparrow\) | 0.7858 | 0.7913 | 0.7961 |
| Novelty | \(\uparrow\) | 0.9790 | 0.9751 | 0.9683 |

Bold values indicate the best performance among samples generated by the different models under the same evaluation metric.
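
For readers unfamiliar with the Unique@k and Novelty rows, the following is a minimal sketch of how such scores are commonly computed. It assumes the table follows the usual MOSES-style definitions, which the table itself does not state, and that every input string is already a valid, canonicalized SMILES string. The function names `unique_at_k` and `novelty` are hypothetical and are not taken from the paper.

```python
def unique_at_k(generated, k):
    """Fraction of distinct strings among the first k generated samples.

    Assumption: `generated` holds valid, already-canonicalized SMILES,
    so string equality is treated as molecule equality.
    """
    head = generated[:k]
    if not head:
        raise ValueError("no generated samples to evaluate")
    return len(set(head)) / len(head)


def novelty(generated, training):
    """Fraction of distinct generated strings that are absent from the training set.

    Assumption: both inputs use the same canonical SMILES form.
    """
    distinct = set(generated)
    if not distinct:
        raise ValueError("no generated samples to evaluate")
    return len(distinct - set(training)) / len(distinct)


if __name__ == "__main__":
    # Toy data for illustration only; these are not values from the table.
    samples = ["CCO", "CCO", "c1ccccc1", "CCN"]
    print(unique_at_k(samples, 4))
    print(novelty(samples, ["CCO"]))
```

Under these definitions, Novelty can fall as the training set grows simply because more of the plausible molecules are already present in the training data, which is one possible reading of the downward trend in the Novelty row.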