Table 1 Monomers correctly suggested by rBAN

From: rBAN: retro-biosynthetic analysis of nonribosomal peptides

| Norine code | PubChemID | IUPAC name | Structure | Compounds | Reason of the missing monomer | Refs. |
| --- | --- | --- | --- | --- | --- | --- |
| NFo-Lys | 12679627 | 6-amino-2-formamidohexanoic acid | | NOR00261, NOR00262, NOR00263, NOR00264, NOR00266, NOR00267, NOR00269, NOR00270, NOR00271, NOR00272, NOR00274, NOR00275, NOR00276, NOR00277, NOR00278, NOR00580 | “CO” monomer in graphs | [32] |
| D-3OMe-Ala | 97963 | 2-amino-3-methoxypropanoic acid | | NOR00422, NOR00423, NOR00424, NOR00425, NOR00588 | Wrong SMILES of D-3OMe-Ala monomer | [33] |
| C5:1(4)-OH(2) | 172026 | 2-hydroxypent-4-enoic acid | | NOR00064, NOR00066, NOR00068, NOR00071, NOR00073 | Wrong monomer in graphs: C4:1(3)-OH(2) -> C5:1(4)-OH(2) | [34] |
| N-Suc | 12522 | 4-amino-4-oxobutanoic acid | | NOR00160, NOR00166, NOR00903 | Missing monomer in graphs | [35, 36] |
| C5:0-OH(2)-Ep(4) | 54305979 | 2-hydroxy-3-(oxiran-2-yl)propanoic acid | | NOR00086, NOR00087 | Wrong monomer in graphs: C4:0-OH(2)-Ep(3) -> C5:0-OH(2)-Ep(4) | [34] |
| Gen | 3469 | 2,5-dihydroxybenzoic acid | | NOR00489, NOR00598 | Wrong monomer in graphs: 2,3-diOH-Bz -> Gen | [37, 38] |
| C10:0-OH(2)-NH2(3) | 57484230 | 3-amino-2-hydroxydecanoic acid | | NOR01134, NOR01135 | Wrong monomer in graphs: Adda -> C10:0-OH(2)-NH2(3) | [39] |
| iC6:0-OH(2.4) | 55300467 | 2,4-dihydroxy-4-methylpentanoic acid | | NOR00078, NOR00077 | Wrong monomer in graphs: iC5:0-OH(2.3) -> iC6:0-OH(2.4) | [34] |
| Isovaleric_acid | 10430 | 3-methylbutanoic acid | | NOR00477 | Wrong monomer in graph: Hiv -> Isovaleric_acid | [40] |
| D-Cl-Trp | 65259 | 2-amino-3-(6-chloro-1H-indol-3-yl)propanoic acid | | NOR00554 | Wrong SMILES of D-Cl-Trp monomer | [41] |
  1. Among the suggested monomers, N-formyl-lysine (NFo-Lys) is the most abundant. rBAN interprets CO as a formylation and therefore suggests a new formylated monomer instead of using the “CO” monomer currently present in Norine. The second new entity, D-3OMe-Ala, is present in five compounds; in this case the monomer name is correct, but the SMILES associated with it is not. Most of the other suggestions arise from monomers that are wrongly annotated in the graph and should be replaced with a new substructure. In one case (N-Suc), the monomer was missing from the graph altogether. All these corrections were manually evaluated to confirm agreement with the literature.
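
The "Wrong monomer in graphs" rows of Table 1 amount to per-compound substitutions, which can be sketched in a few lines of Python. This is only an illustration and not rBAN's implementation or the Norine data format. The function name `correct_monomers` and the representation of a graph as a plain list of monomer names are assumptions made here. The substitution pairs and compound identifiers are copied from the table.

```python
# Substitutions copied from the "Wrong monomer in graphs" rows of Table 1:
# (annotated monomer, suggested monomer, affected Norine compounds).
WRONG_MONOMER_ROWS = [
    ("C4:1(3)-OH(2)", "C5:1(4)-OH(2)",
     ["NOR00064", "NOR00066", "NOR00068", "NOR00071", "NOR00073"]),
    ("C4:0-OH(2)-Ep(3)", "C5:0-OH(2)-Ep(4)", ["NOR00086", "NOR00087"]),
    ("2,3-diOH-Bz", "Gen", ["NOR00489", "NOR00598"]),
    ("Adda", "C10:0-OH(2)-NH2(3)", ["NOR01134", "NOR01135"]),
    ("iC5:0-OH(2.3)", "iC6:0-OH(2.4)", ["NOR00078", "NOR00077"]),
    ("Hiv", "Isovaleric_acid", ["NOR00477"]),
]

# Index the rows by compound ID so that a substitution is applied only
# to the compounds for which the table lists it.
CORRECTIONS = {
    compound_id: (old, new)
    for old, new, compound_ids in WRONG_MONOMER_ROWS
    for compound_id in compound_ids
}


def correct_monomers(compound_id, monomers):
    """Return a copy of `monomers` with the table's substitution applied.

    `monomers` is a hypothetical flat list of monomer names; compounds
    without an entry in CORRECTIONS are returned unchanged.
    """
    if compound_id not in CORRECTIONS:
        return list(monomers)
    old, new = CORRECTIONS[compound_id]
    return [new if name == old else name for name in monomers]


# The monomer list below is made up for illustration; it is not the
# real composition of NOR00477.
print(correct_monomers("NOR00477", ["Hiv", "Val"]))
```

Keying the corrections by compound rather than by monomer name keeps a substitution such as Adda -> C10:0-OH(2)-NH2(3) from being applied to compounds where the original annotation is correct.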