Fig. 6 | Journal of Cheminformatics

Fig. 6

From: DockStream: a docking wrapper to enhance de novo molecular design


Average linkage similarity (Tanimoto) between epochs, sampled every 5 epochs, for REINVENT-DockStream using GOLD with RDKit and TautEnum against HDAC2 (PDB ID: 3MAX). a GOLD fitness score training plot (same as Fig. 4c). The vertical black line at around epoch 275 marks the start of convergence, where the GOLD fitness score begins to plateau. b GOLD Tanimoto matrix showing the Tanimoto similarities between batches of generated compounds across the entire 1000-epoch REINVENT-DockStream experiment (the x-axis is in units of 5 epochs, e.g. positions 10 and 200 correspond to epochs 50 and 1000, respectively). The main diagonal is darker shaded, indicating notable intra-batch compound similarity. The overall matrix transitions from lighter (top left) to darker (bottom right) shaded areas. Cross-referencing with subplot a, neighbouring epochs display notably greater Tanimoto similarity, coinciding with GOLD fitness score convergence. The results suggest the agent begins exploitation once a state of productivity is achieved (as measured by fitness score convergence). The overall transition of the matrix demonstrates agent exploration and exploitation.

The main diagonal is darker (indicating higher similarity) relative to surrounding epochs and gradually becomes even darker, which indicates increased intra-batch similarity as the agent increasingly focuses on regions in chemical space. Moreover, the transition from the lighter shaded top left corner to the darker shaded bottom right corner exemplifies the balance between agent exploration and exploitation. Cross-referencing the REINVENT-DockStream training plot for GOLD docking (Fig. 6a) shows that the GOLD docking score begins to converge at around epoch 275. At around position 55 on the x-axis of Fig. 6b (corresponding to epoch 275), the Tanimoto matrix gradually becomes darker shaded, indicating increased Tanimoto similarity both within the same batch and between batches from neighbouring epochs. The results illustrate the policy update: the agent reaches a state of productivity and is then driven to begin exploitation of chemical space.
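
The matrix in panel b can be read as a table of batch-to-batch similarities. The following is a minimal sketch of how such a table could be computed, not the authors' implementation. It assumes each compound is represented by a fingerprint stored as a set of on-bit indices (the caption does not state the fingerprint type), and it takes average linkage to mean the mean Tanimoto similarity over all cross-batch compound pairs. All function names and the toy fingerprints are hypothetical.

```python
from itertools import product
from statistics import mean

# From the caption: one batch is sampled every 5 epochs.
SAMPLING_INTERVAL = 5


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


def average_linkage_similarity(batch_a, batch_b):
    """Mean Tanimoto similarity over every pair (one compound from each batch).

    Assumption: self-pairs are kept when a batch is compared with itself,
    which slightly inflates the diagonal for small batches.
    """
    return mean(tanimoto(a, b) for a, b in product(batch_a, batch_b))


def batch_similarity_matrix(batches):
    """Square matrix whose entry [i][j] compares sampled batch i with batch j."""
    return [[average_linkage_similarity(row, col) for col in batches]
            for row in batches]


def matrix_index_to_epoch(index, interval=SAMPLING_INTERVAL):
    """Map a position on the matrix axis back to a training epoch."""
    return index * interval


# Toy data (hypothetical): a diverse early batch and a more focused late batch.
early = [frozenset({1, 2, 3}), frozenset({7, 8, 9})]
late = [frozenset({1, 2, 3, 4}), frozenset({1, 2, 3, 5})]
matrix = batch_similarity_matrix([early, late])
```

With these toy batches the late batch is more self-similar than the early one, which is the pattern the caption describes as the diagonal darkening towards the bottom right. The index conversion reproduces the caption's examples: position 55 maps to epoch 275 and position 200 to epoch 1000.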
