Table 7 Comparison of the different molecule representations: SMILES, SELFIES, and DeepSMILES

From: Probabilistic generative transformer language models for generative design of molecules

Tokenizer   Representation  Tokenized example
Atom-level  SMILES          C O c 1 c c c c c 1 O C ( = O ) O c 1 c c c c c 1 O C
Atom-level  DeepSMILES      C O c c c c c c 6 O C = O ) O c c c c c c 6 O C
Atom-level  SELFIES         [C] [N] [C] [Branch1] [C] [P] [C] [C] [Ring1] [=Branch1]
SmilesPE    SMILES          COc1ccccc1 O C(=O)O c1ccccc1 OC
SmilesPE    DeepSMILES      CO cccc cc 6 OC =O) O cccc cc 6 OC

  1. Bold value indicates the best performance of samples generated by different models under the same evaluation metric
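The atom-level rows of the table can be reproduced with a short regular-expression tokenizer. The following is a minimal sketch, not the tokenizer used in the paper: the function name `atom_level_tokenize` and the pattern are assumptions chosen for illustration. The pattern keeps bracketed tokens (such as SELFIES symbols), the two-letter halogens `Cl` and `Br`, and two-digit ring closures (`%NN`) together, and treats every other character as its own token. The input strings are the table's own examples with the separating spaces removed.

```python
import re

# Illustrative pattern (an assumption, not taken from the paper):
#   \[[^\]]+\]  bracketed token, e.g. [Branch1] or [nH]
#   Br|Cl       two-letter halogens
#   %\d{2}      two-digit ring-closure label
#   .           any other single character
TOKEN_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def atom_level_tokenize(text):
    """Split a SMILES, DeepSMILES, or SELFIES string into atom-level tokens."""
    return TOKEN_PATTERN.findall(text)


smiles = "COc1ccccc1OC(=O)Oc1ccccc1OC"
deepsmiles = "COcccccc6OC=O)Occcccc6OC"

print(" ".join(atom_level_tokenize(smiles)))
# C O c 1 c c c c c 1 O C ( = O ) O c 1 c c c c c 1 O C
print(" ".join(atom_level_tokenize(deepsmiles)))
# C O c c c c c c 6 O C = O ) O c c c c c c 6 O C
```

The SmilesPE rows cannot be reproduced this way, because that tokenizer merges frequent substrings using a learned vocabulary rather than a fixed pattern.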