Fig. 2 | Journal of Cheminformatics

Fig. 2

From: An exploration strategy improves the diversity of de novo ligands using deep reinforcement learning: a case for the adenosine A2A receptor

The workflow of deep reinforcement learning. Each loop consists of four steps: (1) a batch of SMILES sequences is sampled by the RNN generator; (2) each generated molecule, represented in SMILES format, is encoded into a fingerprint; (3) each molecule is assigned a probability score of activity on the A2AR, calculated by a QSAR model trained in advance; (4) all generated molecules and their scores are sent back to train the generator with the policy gradient method.
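
The four steps of the loop can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions, not the authors' implementation: the RNN generator is replaced by a single context-free token distribution, the fingerprint by a token-count vector, and the pre-trained QSAR model by a fixed logistic function that arbitrarily rewards "N" tokens. The names `VOCAB`, `sample_sequence`, `fingerprint`, `score` and `train` are hypothetical, and the mean-reward baseline is an added variance-reduction choice that the caption does not mention.

```python
import math
import random

VOCAB = ["C", "N", "O", "c", "1", "$"]  # "$" is a hypothetical end-of-sequence token


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_sequence(logits, rng, max_len=10):
    """Step 1: sample one token sequence (toy stand-in for the RNN generator)."""
    probs = softmax(logits)
    seq = []
    for _ in range(max_len):
        idx = rng.choices(range(len(VOCAB)), weights=probs)[0]
        seq.append(idx)
        if VOCAB[idx] == "$":
            break
    return seq


def fingerprint(seq):
    """Step 2: encode a sequence as a token-count vector (toy fingerprint)."""
    fp = [0] * len(VOCAB)
    for idx in seq:
        fp[idx] += 1
    return fp


def score(fp):
    """Step 3: fixed logistic scorer standing in for the pre-trained QSAR model."""
    weights = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]  # assumption: rewards "N" tokens only
    z = sum(w * x for w, x in zip(weights, fp)) - 1.0
    return 1.0 / (1.0 + math.exp(-z))


def train(n_loops=200, batch_size=32, lr=0.1, seed=0):
    """Run the loop; returns the final logits of the toy generator."""
    rng = random.Random(seed)
    logits = [0.0] * len(VOCAB)
    for _ in range(n_loops):
        batch = [sample_sequence(logits, rng) for _ in range(batch_size)]
        rewards = [score(fingerprint(seq)) for seq in batch]
        baseline = sum(rewards) / len(rewards)  # assumption: mean-reward baseline
        probs = softmax(logits)
        grad = [0.0] * len(VOCAB)
        # Step 4: policy gradient (REINFORCE). For a softmax policy, the gradient
        # of log p(token) with respect to the logits is one_hot(token) - probs.
        for seq, r in zip(batch, rewards):
            for idx in seq:
                for k in range(len(VOCAB)):
                    one_hot = 1.0 if k == idx else 0.0
                    grad[k] += (r - baseline) * (one_hot - probs[k])
        logits = [l + lr * g / batch_size for l, g in zip(logits, grad)]
    return logits
```

Calling `softmax(train())` gives the token probabilities after training; in this toy setting the generator should shift probability toward the rewarded token. A realistic version would condition each token on the preceding ones, which is what the RNN in the figure provides, and would score real molecular fingerprints with a trained classifier.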
