From: STOUT: SMILES to IUPAC names using neural machine translation
No. | Original SELFIES | Predicted SELFIES | BLEU score | Original SELFIES decoded back into SMILES | Predicted SELFIES decoded back into SMILES | Tanimoto similarity index |
---|---|---|---|---|---|---|
1. | [I][C][C][Branch1_2][Branch1_3][=C][N][C][Expl=Ring1][Branch1_1][C][C] | [I][C][=C][Branch1_1][Branch1_3][N][C][=C][Ring1][Branch1_1][C][C]: | 0.00 | IC=1C(=CNC1C)C | IC=1C(=CNC1C)C | 1.0 |
2. | [O][C][C][=C][C][=C][Branch1_1][Ring2][C][Expl=Ring1][Branch1_2][C][N][=N][C][=C][Branch1_1][Ring2][C][Expl=Ring1][Branch1_2][C] | [O][C][=C][C][=C][C][Branch1_2][Ring2][=C][Ring1][Branch1_2][C][=N][N][=C][C][Branch1_2][Ring2][=C][Ring1][Branch1_2][C]: | 0.18 | OC=1C=CC=C(C1)C=2N=NC=C(C2)C | OC=1C=CC=C(C1)C=2N=NC=C(C2)C | 1.0 |
3. | [C][Branch1_2][=C][=C][C][C][Branch1_2][Branch2_1][=C][C][Branch1_2][Ring1][=C][C][C][C][C] | [C][Branch1_1][=N][C][=C][Branch1_1][Branch1_3][C][=C][Branch1_1][C][C][C][C][C][=C][C]: | 0.21 | C(=CCC(=CC(=CC)C)C)C | C(C=C(C=C(C)C)CC)=CC | 1.0 |
4. | [N][=C][C][Branch1_2][N][=C][C][=C][Ring1][Branch1_2][O][C][Branch1_1][C][C][C][C][=N][N][C][C][=C][C][Branch1_1][Ring2][N][=C][N][=C][C][Ring1][N][Expl=Ring1][Branch2_2] | [N][Branch1_2][Ring1][=C][N][C][C][=C][C][N][N][=C][Branch1_1][P][C][C][=N][C][Branch1_1][Branch1_3][O][C][Branch1_1][C][C][C][=C][C][Expl=Ring1][Branch2_3][C][Expl=Ring1][#C][C][Expl=Ring2][Ring1][Ring1]: | 0.32 | N1=CC(=CC=C1OC(C)C)C2=NNC=3C=CC(N=CN)=CC23 | N1=CC(=CC=C1OC(C)C)C2=NNC=3C=CC(N=CN)=CC23 | 1.0 |
5. | [O][=C][N][C][=C][Branch1_1][Branch1_2][N][=C][Ring1][Branch1_2][C][C][=C][C][=C][C][Ring1][Branch2_3] | [O][=C][N][C][C][=C][C][=C][C][C][Expl=Ring1][Branch1_3][N][=C][Ring1][O][C]: | 0.45 | O=C1NC2=C(N=C1C)C=CC=CC2 | O=C1NC=2C=CC=CCC2N=C1C | 1.0 |
6. | [O][=N][C][Branch1_2][C][=O][C][C][=C][C][=C][C][Branch1_2][Branch2_2][=C][C][=C][Ring1][Branch1_2][C][Expl=Ring1][Branch2_3][C] | [O][=N][C][Branch1_2][C][=O][C][=C][C][=C][C][=C][Branch1_1][Branch2_2][C][=C][C][Ring1][Branch1_2][=C][Ring1][Branch2_3][C]: | 0.53 | O=NC(=O)C=1C=CC2=CC(=CC=C2C1)C | O=NC(=O)C=1C=CC2=CC(=CC=C2C1)C | 1.0 |
7. | [O][B][Branch1_1][C][O][C][=C][C][Branch1_2][=C][=C][C][=C][Ring1][Branch1_2][C][=C][C][=C][C][=C][Ring1][Branch1_2][C][=C][N][=C][C][=C][Ring1][Branch1_2] | [O][B][Branch1_1][C][O][C][C][=C][Branch1_1][=C][C][=C][C][Expl=Ring1][Branch1_2][C][C][=C][C][=C][C][Expl=Ring1][Branch1_2][C][=C][N][=C][C][=C][Ring1][Branch1_2]: | 0.60 | OB(O)C1=CC(=CC=C1C2=CC=CC=C2)C3=CN=CC=C3 | OB(O)C1=CC(=CC=C1C2=CC=CC=C2)C3=CN=CC=C3 | 1.0 |
8. | [O][=C][N][C][Branch2_1][Ring1][C][C][C][=C][C][Branch1_1][Ring1][O][C][=C][C][Expl=Ring1][Branch2_1][N][Branch1_1][C][C][C][=C][Branch1_1][Branch1_2][C][=C][Ring1][P][C][C][C] | [O][=C][N][C][Branch2_1][Ring1][C][C][=C][C][=C][Branch1_1][Ring1][O][C][C][=C][Ring1][Branch2_1][N][Branch1_1][C][C][C][=C][Branch1_1][Branch1_3][C][=C][Ring1][P][C][C][C]: | 0.71 | O=C1NC(C=2C=CC(OC)=CC2N(C)C)=C(C=C1C)CC | O=C1NC(C=2C=CC(OC)=CC2N(C)C)=C(C=C1CC)C | 1.0 |
9. | [O][=P][Branch2_1][Ring1][Branch1_2][C][=N][N][C][Branch1_2][Ring2][=C][Ring1][Branch1_1][C][Branch1_1][C][F][Branch1_1][C][F][C][Branch1_1][C][F][F][Branch1_1][Branch2_2][C][C][=C][C][=C][C][Expl=Ring1][Branch1_2][C][C][=C][C][=C][C][Expl=Ring1][Branch1_2] | [O][=P][Branch1_1][Branch2_2][C][C][=C][C][=C][C][Expl=Ring1][Branch1_2][Branch1_1][Branch2_2][C][C][=C][C][=C][C][Expl=Ring1][Branch1_2][C][=N][N][C][Branch1_2][Ring2][=C][Ring1][Branch1_1][C][Branch1_1][C][F][Branch1_1][C][F][C][Branch1_1][C][F][F]: | 0.86 | O=P(C1=NNC(=C1)C(F)(F)C(F)F)(C=2C=CC=CC2)C=3C=CC=CC3 | O=P(C1=NNC(=C1)C(F)(F)C(F)F)(C=2C=CC=CC2)C=3C=CC=CC3 | 1.0 |
10. | [O][=C][Branch2_1][Ring1][=N][O][C][=C][C][=C][C][Branch1_2][N][=C][Ring1][Branch1_2][O][C][Branch1_2][C][=O][C][C][C][C][C][C][C][C][C][C][C][C][C][C][C] | [O][=C][Branch2_1][Ring1][=C][O][C][=C][C][=C][C][Branch1_2][N][=C][Ring1][Branch1_2][O][C][Branch1_2][C][=O][C][C][C][C][C][C][C][C][C][C][C][C][C][C][C]: | 0.93 | O=C(OC1=CC=CC(=C1OC(=O)CCC)CCCCCCCCC)CCC | O=C(OC1=CC=CC(=C1OC(=O)CCC)CCCCCCCCCC)CC | 1.0 |
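
The rows above pair a low string-level score with a Tanimoto index of 1.0. The following minimal sketch illustrates how that can happen, using row 1. It is not the paper's evaluation code: the helper names `selfies_tokens` and `tanimoto` are hypothetical, the token comparison is a plain position-wise match rather than BLEU, and the Tanimoto function is the generic set formula applied to toy bit sets, since the fingerprinting step is not given in the table.

```python
import re

def selfies_tokens(s):
    """Split a SELFIES string into its bracketed symbols (hypothetical helper)."""
    return re.findall(r"\[[^\]]*\]", s)

def tanimoto(a, b):
    """Generic Tanimoto (Jaccard) index of two sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention assumed here for two empty sets
    return len(a & b) / len(a | b)

# Row 1 of the table: SELFIES strings and their decoded SMILES.
orig_selfies = "[I][C][C][Branch1_2][Branch1_3][=C][N][C][Expl=Ring1][Branch1_1][C][C]"
pred_selfies = "[I][C][=C][Branch1_1][Branch1_3][N][C][=C][Ring1][Branch1_1][C][C]"
orig_smiles = "IC=1C(=CNC1C)C"
pred_smiles = "IC=1C(=CNC1C)C"

orig_tok = selfies_tokens(orig_selfies)
pred_tok = selfies_tokens(pred_selfies)

# Position-wise agreement between the two token sequences (not BLEU).
position_matches = sum(o == p for o, p in zip(orig_tok, pred_tok))
print(f"{position_matches} of {len(orig_tok)} tokens agree by position")

# The decoded SMILES strings are identical, so any fingerprint computed
# from them is identical too, and the Tanimoto index of a set with itself is 1.
print(orig_smiles == pred_smiles)

# Toy bit sets standing in for fingerprints (assumed values, for illustration).
print(tanimoto({1, 2, 3}, {1, 2, 3}))
print(tanimoto({1, 2, 3}, {2, 3, 4}))
```

Only half of the SELFIES tokens in row 1 agree by position, yet both strings decode to the same SMILES, which is why a sequence-overlap metric can be low while the structural similarity is perfect.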