Table 1 Models for optimized LogP using reinforcement learning

From: Memory-assisted reinforcement learning for diverse molecular de novo design

| Target | Memory type | Generated optimized compounds | Unique BM scaffolds | Unique carbon skeletons |
|---|---|---|---|---|
| LogP | No memory | 938 | 727 | 396 |
| LogP | Compound similarity | 3451 | 2963 | 1472 |
| LogP | IdenticalBMScaffold | 3428 | 2865 | 1398 |
| LogP | IdenticalCarbonSkeleton | 3315 | 3002 | 1799 |
| LogP | ScaffoldSimilarity | 3591 | 3056 | 1538 |
  1. The generative models were tuned with RL for 100 iterations to generate compounds with a predicted LogP between 2.0 and 3.0. During each iteration, a model generated 150 compounds, giving a total of 15,000 compounds per model. Only compounds with a predicted LogP between 2.0 and 3.0 were retained and counted as optimized.
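
The footnote's arithmetic and the relative yields in the table can be checked with a short sketch. The counts below are copied from Table 1; the names `in_target_range`, `summarize`, and the derived ratios (`retained_fraction`, `scaffolds_per_compound`, `skeletons_per_compound`) are illustrative and do not come from the article. Treating the LogP bounds as inclusive is also an assumption, since the footnote does not say.

```python
# Sampling budget from the footnote: 100 RL iterations x 150 compounds each.
ITERATIONS = 100
COMPOUNDS_PER_ITERATION = 150
TOTAL_GENERATED = ITERATIONS * COMPOUNDS_PER_ITERATION

# Counts copied from Table 1 (target: LogP):
# (generated optimized compounds, unique BM scaffolds, unique carbon skeletons)
TABLE_1 = {
    "No memory": (938, 727, 396),
    "Compound similarity": (3451, 2963, 1472),
    "IdenticalBMScaffold": (3428, 2865, 1398),
    "IdenticalCarbonSkeleton": (3315, 3002, 1799),
    "ScaffoldSimilarity": (3591, 3056, 1538),
}


def in_target_range(predicted_logp, low=2.0, high=3.0):
    """Retention rule from the footnote; inclusive bounds are an assumption."""
    return low <= predicted_logp <= high


def summarize(table, total_generated):
    """Derive simple ratios per memory type (hypothetical metrics, not the paper's)."""
    summary = {}
    for memory_type, (optimized, scaffolds, skeletons) in table.items():
        summary[memory_type] = {
            # Share of all sampled compounds that landed in the LogP window.
            "retained_fraction": optimized / total_generated,
            # Unique scaffolds or skeletons per retained compound.
            "scaffolds_per_compound": scaffolds / optimized,
            "skeletons_per_compound": skeletons / optimized,
        }
    return summary


summary = summarize(TABLE_1, TOTAL_GENERATED)

for memory_type, row in summary.items():
    print(
        f"{memory_type:<24}"
        f"retained={row['retained_fraction']:.1%}  "
        f"scaffolds/compound={row['scaffolds_per_compound']:.2f}  "
        f"skeletons/compound={row['skeletons_per_compound']:.2f}"
    )
```

The ratios only restate the table in normalized form; they make it easier to compare how much of the 15,000-compound budget each memory type converts into retained compounds, and how many distinct scaffolds or skeletons those compounds cover.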