Fig. 1 | Journal of Cheminformatics

Fig. 1

From: Memory-assisted reinforcement learning for diverse molecular de novo design

Schematic workflow of the memory unit. The memory unit (left) is integrated into the regular RL cycle (right). The generative model produces structures, which are scored by an arbitrary scoring function. Only molecules with a high score are processed by the memory unit. Every input molecule is compared to all indexed compounds based on either its molecular scaffold or its fingerprint. If the generated scaffold matches an indexed scaffold, or the fingerprint similarity exceeds a defined threshold, the input molecule is added to the corresponding index-bucket pair. If the bucket is not yet full, as shown in (a), the memory unit does not alter the score. If the bucket is full, as illustrated in (b), the score is modified, and the generative model has to explore new chemical structures. The path of structure generation is highlighted for an exemplary compound: because the bucket for its scaffold is already full, the score of this compound is modified.
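The bucket logic described in the caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `MemoryUnit`, the parameters `bucket_size`, `min_score`, and `penalized_score`, and the toy `scaffold_of` function are all hypothetical. The caption only says the score "is modified" when a bucket is full; replacing it with a fixed penalized value, and not adding further molecules to a full bucket, are assumptions made here. Only the scaffold-matching variant is shown; a fingerprint-similarity variant would replace the exact key lookup with a similarity comparison against indexed compounds.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class MemoryUnit:
    """Toy scaffold-bucket memory; all names and defaults are illustrative."""

    def __init__(self, scaffold_of: Callable[[str], str],
                 bucket_size: int = 2, min_score: float = 0.6,
                 penalized_score: float = 0.0) -> None:
        self.scaffold_of = scaffold_of          # maps a molecule to its scaffold key
        self.bucket_size = bucket_size          # assumed capacity of each bucket
        self.min_score = min_score              # assumed "high score" cutoff
        self.penalized_score = penalized_score  # assumed score modification
        self.buckets: Dict[str, List[str]] = defaultdict(list)

    def adjust(self, molecule: str, score: float) -> float:
        # Low-scoring molecules are not processed by the memory unit.
        if score < self.min_score:
            return score
        bucket = self.buckets[self.scaffold_of(molecule)]
        if len(bucket) >= self.bucket_size:
            # Case (b): the bucket is full, so the score is modified.
            return self.penalized_score
        # Case (a): the bucket has room, so store the molecule, keep the score.
        bucket.append(molecule)
        return score


# Toy usage: molecules are labeled "scaffold:id" instead of real structures.
memory = MemoryUnit(scaffold_of=lambda mol: mol.split(":")[0], bucket_size=2)
scores = [memory.adjust(mol, 0.9) for mol in ["A:1", "A:2", "A:3", "B:1"]]
low = memory.adjust("A:4", 0.3)
```

In this toy run the third molecule with scaffold `A` hits a full bucket and receives the penalized score, while the first molecule with scaffold `B` keeps its original score. In an RL loop, the adjusted score would take the place of the raw score when updating the generative model, which is what pushes it toward new scaffolds.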
