Augmented Hill-Climb increases reinforcement learning efficiency for language-based de novo molecule generation

Table 2 Number of steps taken before the mean exceeds certain internal and external thresholds (earliest sample exceeding threshold is shown in brackets)

		Number of steps required for optimization beyond prior at a given threshold					Number of steps required for optimization beyond external thresholds
Target	Method	120%	140%	160%	180%	200%	Inactive mean	Active mean	80% precision threshold
DRD2	REINVENT	> 500 (15)	> 500 (685)	> 500 (22,292)	> 500 (> 32,000)	> 500 (> 32,000)	1 (1)	163 (15)	> 500 (15)
DRD2	Augmented Hill-Climb + DF2	19 (2)	6 (49)	105 (1,248)	> 500 (3,009)	> 500 (23,150)	2 (2)	19 (2)	48 (2)
OPRM1	REINVENT	133 (7)	> 500 (868)	> 500 (7,663)	> 500 (> 32,000)	> 500 (> 32,000)	4 (2)	80 (4)	> 500 (7)
OPRM1	Augmented Hill-Climb + DF2	3 (16)	17 (22)	45 (29)	150 (34)	> 500 (2,759)	6 (16)	17 (22)	33 (28)
AGTR1	REINVENT	> 500 (25)	> 500 (510)	> 500 (5,596)	> 500 (> 32,000)	> 500 (> 32,000)	1 (2)	> 500 (8)	419 (6)
AGTR1	Augmented Hill-Climb + DF2	62 (27)	318 (869)	396 (3,404)	> 500 (5,207)	> 500 (27,979)	2 (1)	62 (27)	46 (2)
OX1R	REINVENT	5 (1)	52 (1)	> 500 (7)	> 500 (142)	> 500 (490)	1 (2)	9 (1)	> 500 (490)
OX1R	Augmented Hill-Climb + DF2	9 (1)	15 (2)	31 (2)	87 (31)	382 (557)	2 (1)	14 (2)	494 (557)
Average fold improvement		19.8 (2.5)	11.2 (38.7)	8.3 (71.8)	2.8 (240.6)	1.1 (3.8)	0.5 (1.0)	5.5 (2.1)	9.7 (3.2)

The final row lists the average fold improvement of Augmented Hill-Climb in combination with DF2 over REINVENT. Where a threshold was not reached within the maximum number of training steps (or samples), the entry is annotated as greater than 500 (or 32,000)
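
The table does not state how the fold improvement in the final row is computed. A minimal sketch of one plausible reading follows: take the ratio of REINVENT steps to Augmented Hill-Climb + DF2 steps for each target, then average the ratios across the four targets. Treating a censored entry such as "> 500" as exactly 500 is an assumption made here, not something the table states, and the names `parse_steps` and `average_fold_improvement` are hypothetical helpers. Under that assumption the sketch gives 19.8 for the step counts in the 120% column.

```python
# Assumption: a censored cell ("> 500") is capped at the training limit.
MAX_STEPS = 500


def parse_steps(cell: str, cap: int = MAX_STEPS) -> int:
    """Convert a table cell such as '19' or '> 500' to an integer.

    Cells starting with '>' are censored and are replaced by `cap`.
    """
    cell = cell.replace(",", "").strip()
    if cell.startswith(">"):
        return cap
    return int(cell)


def average_fold_improvement(baseline, improved, cap: int = MAX_STEPS) -> float:
    """Mean over targets of (baseline steps / improved steps)."""
    ratios = [
        parse_steps(b, cap) / parse_steps(i, cap)
        for b, i in zip(baseline, improved)
    ]
    return sum(ratios) / len(ratios)


# Step counts from the 120% column, in row order DRD2, OPRM1, AGTR1, OX1R.
reinvent_120 = ["> 500", "133", "> 500", "5"]
ahc_120 = ["19", "3", "62", "9"]

fold_120 = average_fold_improvement(reinvent_120, ahc_120)
print(round(fold_120, 1))  # 19.8
```

The same helper can be applied to the bracketed sample counts by passing `cap=32000`, again assuming censored entries are capped at the limit.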

ISSN: 1758-2946