Fig. 3

From: DeepGraphMolGen, a multi-objective, computational strategy for generating molecules with desirable properties: a graph convolution and reinforcement learning approach

The reinforcement learning pathway for systemic generation of molecules (redrawn from You et al. [34]). (a) The state is defined as the current graph \( G_{t} \) and the possible atom types \( C \). (b) The GCPN conducts message passing to encode the state as node embeddings and estimates the policy function. (c) The action to be performed (\( a_{t} \)) is sampled from the policy function. The environment performs a chemical valency check on the intermediate state and returns (d) the next state \( G_{t+1} \) and (e) the associated reward (\( r_{t} \)).
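
The loop in the caption (state, policy, sampled action, valency check, next state and reward) can be illustrated with a small toy sketch. This is not the GCPN implementation: the uniform random `sample_action` stands in for the learned policy, every action is simplified to "attach one new atom to an existing node", the reward values of +1 and -1 are placeholders, and leaving the state unchanged after a failed valency check is an assumption of this sketch. All function names are hypothetical.

```python
import random

# Standard maximum valences for a small candidate atom set C.
MAX_VALENCE = {"C": 4, "N": 3, "O": 2}
ATOM_TYPES = list(MAX_VALENCE)


def valence_ok(atoms, bonds):
    """Return True if no atom exceeds its maximum valence.

    atoms: list of element symbols; bonds: list of (i, j, bond_order).
    """
    used = [0] * len(atoms)
    for i, j, order in bonds:
        used[i] += order
        used[j] += order
    return all(used[k] <= MAX_VALENCE[atoms[k]] for k in range(len(atoms)))


def sample_action(atoms, rng):
    """Placeholder policy (assumption): choose an existing node, a new atom
    type, and a bond order uniformly at random. A GCPN would instead compute
    node embeddings by message passing and sample from the learned policy."""
    return rng.randrange(len(atoms)), rng.choice(ATOM_TYPES), rng.choice([1, 2, 3])


def step(atoms, bonds, action):
    """Environment step: apply the action, run the valency check, and
    return ((next_atoms, next_bonds), reward)."""
    node, atom_type, order = action
    new_atoms = atoms + [atom_type]
    new_bonds = bonds + [(node, len(atoms), order)]
    if valence_ok(new_atoms, new_bonds):
        return (new_atoms, new_bonds), 1.0   # placeholder reward for a valid step
    return (atoms, bonds), -1.0              # rejected: state kept, placeholder penalty


def rollout(n_steps=10, seed=0):
    """Run the state -> action -> (next state, reward) loop from a single carbon."""
    rng = random.Random(seed)
    atoms, bonds = ["C"], []
    total_reward = 0.0
    for _ in range(n_steps):
        action = sample_action(atoms, rng)                # (c) sample a_t
        (atoms, bonds), r = step(atoms, bonds, action)    # (d) G_{t+1}, (e) r_t
        total_reward += r
    return atoms, bonds, total_reward
```

Calling `rollout(20, seed=1)` returns the final atom list, bond list, and accumulated reward; because every accepted step passes `valence_ok`, the final graph always satisfies the valence limits in `MAX_VALENCE`.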