
Table 1 Helper functions for SIME. The description of each function is provided to help readers follow the simplified workflow of the SIME algorithm shown in Fig. 11

From: SIME: synthetic insight-based macrolide enumerator to generate the V1B library of 1 billion macrolides

Helper functions

Description

ENUMERATE_sugar_stereocenters (FSG_RS)

Takes in sugar strings that start and end with [*R*] and returns a list of sugars with both stereocenter configurations at the joining carbon

enumerate_SM_stereocenters (FSM_RS)

Takes a list of SMs. For SMs with identified stereocenters at the joining point, both the R and S configurations are generated and added to the all_possible list. SMs with undefined stereocenters at the joining point, or without stereocenters, are left unchanged and added to the all_possible list. Returns the all_possible list

remove_SM_digits (FRSMD)

Takes a given SMILES string and locates the SM points of interest, indicated with [1*], [2*], etc. Returns the SMILES string with the digits removed from all SM points of interest. For example:

input -> '1[1*]234[2*]5[3*]6'

output -> '1[*]234[*]5[*]6'
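The digit removal described above can be sketched with a single regular expression. This is a minimal illustration, not the authors' implementation; the function name `remove_sm_digits` and the assumption that a point of interest is always written as `[<digits>*]` are ours.

```python
import re


def remove_sm_digits(smiles: str) -> str:
    """Replace numbered attachment points such as [1*] or [12*] with [*].

    Assumption: a point of interest is a bracket holding only digits and '*'.
    Digits elsewhere in the string (e.g. ring-closure numbers) are untouched.
    """
    return re.sub(r"\[\d+\*\]", "[*]", smiles)


print(remove_sm_digits("1[1*]234[2*]5[3*]6"))
```

Run on the table's own input, this prints `1[*]234[*]5[*]6`.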

string_splitter (FSS)

Splits a given string into fragments based on a symbol provided and returns a list containing the fragments. For example:

input -> string = '1[*]234[*]5[*]6', symbol = '[*]'

output -> ['1', '[*]', '234', '[*]', '5', '[*]', '6']
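Because the example output keeps the symbol as its own fragment, a plain `str.split` is not enough. One way to get this behavior, shown here as a sketch rather than as the paper's code, is `re.split` with a capture group; dropping empty fragments at the edges is our assumption.

```python
import re
from typing import List


def string_splitter(string: str, symbol: str) -> List[str]:
    """Split `string` at every `symbol`, keeping each symbol as a fragment."""
    if not symbol:
        raise ValueError("symbol must be a non-empty string")
    # A capture group makes re.split return the delimiter between the pieces.
    parts = re.split(f"({re.escape(symbol)})", string)
    # Assumption: empty fragments (symbol at the start or end) are discarded.
    return [part for part in parts if part]


print(string_splitter("1[*]234[*]5[*]6", "[*]"))
```

With the table's input this prints `['1', '[*]', '234', '[*]', '5', '[*]', '6']`.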

insert_SMs (FiSM)

Takes in a SMILES template produced by string_splitter and replaces each '[*]' symbol with a list of SMs
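The table does not say what the result looks like after insertion. One plausible reading, sketched below, is that each '[*]' fragment becomes a list of candidate SMs so that all combinations can be enumerated later. The helper `expand` and the toy fragment strings are hypothetical and only serve to show the effect.

```python
from itertools import product
from typing import List, Sequence, Union

Slot = Union[str, List[str]]


def insert_sms(fragments: Sequence[str], sms: Sequence[str]) -> List[Slot]:
    """Swap every '[*]' fragment for the list of candidate SMs."""
    return [list(sms) if frag == "[*]" else frag for frag in fragments]


def expand(template: Sequence[Slot]) -> List[str]:
    """Hypothetical helper: join every combination of slot choices."""
    choices = [slot if isinstance(slot, list) else [slot] for slot in template]
    return ["".join(combo) for combo in product(*choices)]


# Toy placeholders, not real macrolide fragments.
template = insert_sms(["1", "[*]", "2", "[*]"], ["C", "N"])
print(template)
print(expand(template))
```

The second print lists four strings, one for each pairing of the two SMs across the two positions.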

generate_dummy_sugar_templates (FGDST)

This function takes two parameters: the SMILES template and the minimum number of sugars in each macrolide (default is one sugar). For simplicity, it generates a list of all possible sugar dummies, written as 'SUGARS' (sugars only) and 'Full_List' (sugars + hydroxyl), for the number of sugar positions in the given SMILES template. For example, if two sugar positions are identified in the given core and at least one sugar is required, the function outputs [('SUGARS', 'Full_List'), ('Full_List', 'SUGARS')]. In the first template, the first and second sugar positions take the 'SUGARS' list and the 'Full_List' (sugars + hydroxyl), respectively. In the second template, the first and second sugar positions take the 'Full_List' and the 'SUGARS' list, respectively
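The exact rule behind this output is not spelled out in the table. The sketch below uses one rule that reproduces the stated example: pick which positions are forced to hold a sugar and let every other position take the full list. Treat this rule, the `symbol` parameter, and the parameter name `min_sugars` as assumptions; the '[*sugar*]' marker itself is taken from the description of replace_SYMBOLsugars_with_dummies.

```python
from itertools import combinations
from typing import List, Tuple


def generate_dummy_sugar_templates(
    smile_template: str,
    min_sugars: int = 1,
    symbol: str = "[*sugar*]",
) -> List[Tuple[str, ...]]:
    """Return one dummy tuple per way of choosing the forced-sugar positions."""
    n_positions = smile_template.count(symbol)
    if not 0 <= min_sugars <= n_positions:
        raise ValueError("min_sugars must lie between 0 and the number of sugar positions")
    templates = []
    # Assumed rule: exactly `min_sugars` positions are restricted to sugars.
    for forced in combinations(range(n_positions), min_sugars):
        templates.append(
            tuple("SUGARS" if i in forced else "Full_List" for i in range(n_positions))
        )
    return templates


# Toy core with two sugar positions.
print(generate_dummy_sugar_templates("A[*sugar*]B[*sugar*]"))
```

For two positions and a minimum of one sugar this yields `[('SUGARS', 'Full_List'), ('Full_List', 'SUGARS')]`, matching the example in the description. Since 'Full_List' also contains sugars, macrolides with two sugars appear under both templates, so a real pipeline would likely need to remove duplicates.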

replace_SYMBOLsugars_with_dummies (FRSSD)

This function takes two inputs: sugar_dummy_order and smile_template_with_sugar_symbols. It splits the given template at the [*sugar*] positions and inserts the corresponding dummies ('SUGARS' or 'Full_List') at those positions
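A minimal sketch of this step follows, assuming the result is a list of fragments with the dummies placed in order. The return type, the `symbol` parameter, and the length check are our assumptions.

```python
from typing import List, Sequence


def replace_symbol_sugars_with_dummies(
    sugar_dummy_order: Sequence[str],
    smile_template_with_sugar_symbols: str,
    symbol: str = "[*sugar*]",
) -> List[str]:
    """Split the template at each sugar symbol and insert the dummies in order."""
    pieces = smile_template_with_sugar_symbols.split(symbol)
    if len(pieces) - 1 != len(sugar_dummy_order):
        raise ValueError("number of dummies must match number of sugar positions")
    fragments = [pieces[0]]
    for dummy, piece in zip(sugar_dummy_order, pieces[1:]):
        fragments.append(dummy)  # the dummy takes the place of the symbol
        fragments.append(piece)  # followed by the text up to the next symbol
    # Assumption: empty pieces (symbol at either end) are dropped.
    return [frag for frag in fragments if frag]


print(replace_symbol_sugars_with_dummies(("SUGARS", "Full_List"), "A[*sugar*]B[*sugar*]"))
```

For the toy template this gives `['A', 'SUGARS', 'B', 'Full_List']`.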

insert_sugars_to_dummies (FIStD)

This function takes the SMILES template in which 'SUGARS' and 'Full_List' have been specified after *** function. It then replaces 'SUGARS' with the actual list of sugars, and 'Full_List' with the list of sugars plus a hydroxyl group
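The final substitution could look like the sketch below. It assumes the template arrives as a list of fragments and that the hydroxyl group is written as the SMILES fragment "O"; both choices, and the placeholder sugar names, are illustrative rather than taken from the paper.

```python
from typing import List, Sequence, Union

Slot = Union[str, List[str]]


def insert_sugars_to_dummies(
    template: Sequence[str],
    sugars: Sequence[str],
    hydroxyl: str = "O",  # assumed SMILES for the hydroxyl option
) -> List[Slot]:
    """Replace 'SUGARS' and 'Full_List' dummies with the actual option lists."""
    result: List[Slot] = []
    for frag in template:
        if frag == "SUGARS":
            result.append(list(sugars))
        elif frag == "Full_List":
            result.append(list(sugars) + [hydroxyl])
        else:
            result.append(frag)
    return result


# 'S1' and 'S2' stand in for real sugar SMILES strings.
print(insert_sugars_to_dummies(["A", "SUGARS", "B", "Full_List"], ["S1", "S2"]))
```

Here the 'SUGARS' slot becomes `['S1', 'S2']` and the 'Full_List' slot becomes `['S1', 'S2', 'O']`, while the core fragments stay unchanged.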