Fig. 1
From: Substructure-based neural machine translation for retrosynthetic prediction

Descriptor curation based on the rate of occurrences. The filtered US patent reaction dataset is compared with 1 million randomly sampled drug-like small molecules, a subset of the enumerated database GDB-13, to investigate the probability distribution profiles of the MACCS keys.