Fig. 3 | Journal of Cheminformatics

Fig. 3

From: Adaptive language model training for molecular design

Optimization of molecules for drug-likeness and synthesizability using a fixed language model, an adaptive language model, or a fixed language model without a genetic-algorithm-based optimization scheme. Two datasets (GDB9 and a custom dataset of the top-scoring molecules for drug-likeness and synthesizability) are used as initial data. In the Fitness vs Generations subplots, the y-axis is the average fitness of the population over six runs. The corresponding standard deviations are small relative to the mean values, on the order of 0.1% to 0.2%. For the GDB9 dataset, the fixed approach (blue) results in a faster increase in fitness, along with more valid and accepted molecules. For the top dataset, however, the adaptive approach leads to a faster increase in fitness along with more accepted molecules. Both the adaptive and fixed approaches outperform the baseline of a fixed language model without the genetic algorithm. The histograms show the synthesizability and drug-likeness of the final population after six generations for each approach.
