Fig. 1From: Molecular de-novo design through deep reinforcement learningLearning the data. Depiction of maximum likelihood training of an RNN. \(x^t\) are the target sequence tokens we are trying to learn by maximizing \(P(x^t)\) for each stepBack to article page