Table 8 Two examples that are correctly extracted in both the test set from the literature and the test set generated by CDK

From: SwinOCSR: end-to-end optical chemical structure recognition using a Swin Transformer

| Items | Molecule 1 | Molecule 2 |
| --- | --- | --- |
| The real-world image derived from the literature | (image) | (image) |
| Manual-labeled SMILES | CNC1=CC(=NC(=N1)C2=CC=CC=C2)N3CCC(CC3)C(=O)NCC4=CC=CC=C4C(F)(F)F | C2=CC1=CC(=CC=C1N=C2)CN4C3=NC(=NC=C3N=N4)C5=CN(CCO)N=C5 |
| Predicted SMILES from the real-world image | CNC1=CC(=NC(=N1)C2=CC=CC=C2)N3CCC(CC3)C(=O)NCC4=CC=CC=C4C(F)(F)F | C2=CC1=CC(=CC=C1N=C2)CN4C3=NC(=NC=C3N=N4)C5=CN(CCO)N=C5 |
| Generated image from manual-labeled SMILES by CDK | (image) | (image) |
| Predicted SMILES from the generated image | CNC1=CC(=NC(=N1)C2=CC=CC=C2)N3CCC(CC3)C(=O)NCC4=CC=CC=C4C(F)(F)F | C2=CC1=CC(=CC=C1N=C2)CN4C3=NC(=NC=C3N=N4)C5=CN(CCO)N=C5 |
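
Both predicted rows in Table 8 are judged against the manually labeled SMILES. A minimal sketch of that comparison follows, assuming plain string equality after stripping whitespace (valid SMILES strings contain no whitespace, so any spaces can only come from typesetting or text extraction). The function names are hypothetical and are not taken from the SwinOCSR code; this is not presented as the authors' evaluation procedure.

```python
# Illustrative only: hypothetical helper names, not the authors' evaluation code.

def normalize_smiles_text(raw: str) -> str:
    """Remove whitespace that typesetting or extraction inserted into a SMILES string."""
    return "".join(raw.split())


def exact_match(label: str, predicted: str) -> bool:
    """True when the two SMILES strings are identical once whitespace is removed."""
    return normalize_smiles_text(label) == normalize_smiles_text(predicted)


# Molecule 1 from Table 8: the label as it may appear with stray spaces,
# and the prediction written without them.
label_1 = "CNC1 = CC(= NC(= N1)C2 = CC = CC = C2)N3CCC(CC3)C(= O)NCC4 = CC = CC = C4C(F)(F)F"
predicted_1 = "CNC1=CC(=NC(=N1)C2=CC=CC=C2)N3CCC(CC3)C(=O)NCC4=CC=CC=C4C(F)(F)F"

# Molecule 2 from Table 8.
label_2 = "C2 = CC1 = CC(= CC = C1N = C2)CN4C3 = NC(= NC = C3N = N4)C5 = CN(CCO)N = C5"
predicted_2 = "C2=CC1=CC(=CC=C1N=C2)CN4C3=NC(=NC=C3N=N4)C5=CN(CCO)N=C5"

results = [exact_match(label_1, predicted_1), exact_match(label_2, predicted_2)]
print(results)
```

String equality is a deliberately strict simplification: two different SMILES strings can describe the same molecule, so a comparison meant to capture chemical identity would first canonicalize both strings with a cheminformatics toolkit.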