Improving chemical reaction yield prediction using pre-trained graph neural networks

Han, Jongmin; Kwon, Youngchun; Choi, Youn-Suk; Kang, Seokho

doi:10.1186/s13321-024-00818-z

Journal of Cheminformatics

Table 2 Comparison of predictive performance in terms of mean absolute error (MAE)

Dataset	Split	Previous studies				Existing GNN pre-training methods					Proposed method
		MFF [4]	YieldBERT [6]	YieldBERT-DA [7]	YieldMPNN [8]	From-Scratch	MolCLR [13]	DGI [18]	ContextPred [21]	AttrMasking [21]	MolDescPred-MPNN	MolDescPred
Buchwald-Hartwig (Random Split)	70/30	4.694±0.116	3.990±0.153	3.090±0.118	2.920±0.056	3.038±0.096	2.896±0.060	2.909±0.060	2.888±0.060	2.905±0.049	2.921±0.054	2.899±0.061
	50/50	5.370±0.134	4.792±0.124	3.744±0.150	3.497±0.090	3.957±0.796	3.420±0.054	3.488±0.074	3.465±0.057	3.485±0.078	3.463±0.082	3.439±0.054
	30/70	6.471±0.183	6.075±0.222	4.833±0.167	4.483±0.165	4.873±0.244	4.400±0.152	4.489±0.150	4.462±0.132	4.496±0.160	4.439±0.137	4.408±0.147
	20/80	7.271±0.200	6.862±0.212	5.781±0.252	5.311±0.154	6.119±0.415	5.197±0.169	5.345±0.203	5.309±0.146	5.392±0.170	5.240±0.170	5.196±0.187
	10/90	8.962±0.308	8.607±0.387	7.705±0.236	7.196±0.274	9.077±0.809	7.158±0.269	7.304±0.268	7.286±0.209	7.269±0.359	7.266±0.250	7.061±0.262
	5/95	11.085±0.322	12.117±0.789	9.651±0.338	9.677±0.408	14.043±2.879	9.932±0.408	9.688±0.467	9.614±0.393	9.716±0.392	9.434±0.418	9.058±0.463
	2.5/97.5	13.592±0.950	15.979±0.817	12.243±0.631	11.747±1.005	16.003±2.434	11.903±0.815	11.870±0.823	12.512±1.239	11.775±0.647	12.075±0.622	11.304±0.952
	avg. rank	10.29±0.88	9.86±0.35	7.43±1.50	4.71±1.58	9.71±1.16	3.00±2.39	5.71±0.88	4.29±2.05	5.43±1.50	4.00±1.69	1.57±0.73
Suzuki-Miyaura (Random Split)	70/30	7.904±0.169	8.128±0.344	6.598±0.270	6.116±0.223	6.323±0.245	6.038±0.264	6.096±0.263	6.053±0.253	6.037±0.243	6.038±0.226	6.045±0.218
	50/50	8.522±0.118	8.922±0.235	7.539±0.153	6.725±0.089	7.053±0.133	6.676±0.088	6.729±0.138	6.661±0.119	6.702±0.141	6.629±0.112	6.667±0.101
	30/70	9.502±0.106	10.094±0.346	8.804±0.249	7.847±0.094	8.502±0.295	7.778±0.134	7.953±0.109	7.822±0.120	7.887±0.116	7.751±0.082	7.793±0.147
	20/80	10.360±0.212	11.229±0.247	10.017±0.338	8.793±0.191	10.008±0.613	8.785±0.181	9.022±0.194	8.890±0.227	8.918±0.207	8.691±0.213	8.775±0.161
	10/90	11.890±0.268	13.528±0.395	11.954±0.443	10.739±0.211	12.839±1.154	10.863±0.249	11.017±0.304	10.948±0.320	11.171±0.330	10.591±0.233	10.781±0.182
	5/95	13.545±0.281	15.695±0.618	14.294±0.507	13.451±0.353	15.307±1.530	14.691±1.191	13.381±0.301	13.543±0.248	14.120±0.513	12.934±0.364	13.236±0.299
	2.5/97.5	15.640±0.813	17.666±0.496	17.587±0.690	17.189±0.813	18.289±2.538	18.129±2.291	16.928±0.737	16.817±0.467	16.997±0.716	16.324±0.593	16.114±0.697
	avg. rank	7.86±3.14	10.71±0.70	8.71±0.45	5.00±1.69	9.00±1.20	4.86±3.04	5.86±1.36	4.29±1.03	5.43±1.92	1.43±0.73	2.71±0.70
Buchwald-Hartwig (Out-Of-Sample Split)	Test 1	6.682±0.101	7.351±0.099	7.015±0.758	8.082±0.827	10.941±1.385	6.358±0.605	7.955±0.344	8.357±1.108	6.609±0.411	7.020±0.173	5.980±0.231
	Test 2	9.459±0.112	7.266±0.724	6.588±0.328	6.300±0.647	6.359±0.524	6.412±0.637	7.649±0.893	6.421±0.607	5.997±0.499	6.398±0.785	5.469±0.396
	Test 3	10.282±0.150	9.129±0.745	11.052±0.950	8.986±0.314	11.021±1.509	11.154±0.596	10.240±0.546	9.780±1.087	10.106±0.268	10.639±0.576	8.340±0.351
	Test 4	14.874±0.050	13.671±1.067	18.422±0.620	13.190±0.754	14.414±2.982	13.231±0.266	16.719±0.598	16.084±1.174	13.910±0.320	13.616±0.597	13.870±0.393
	avg. rank	7.50±2.50	5.75±2.38	8.50±2.29	3.75±3.11	7.75±2.59	5.25±3.70	8.50±1.66	7.50±2.29	4.00±1.58	5.50±1.80	2.00±1.73

The best and second-best cases are highlighted in bold and underlined font, respectively
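
The "avg. rank" rows can be reproduced from the mean MAE values in the table. The sketch below is a reconstruction, not the authors' code: it assumes each method is ranked within a split by its mean MAE (1 = lowest), and that the reported spread is the population standard deviation of those ranks across splits. Under these assumptions it matches the Buchwald-Hartwig out-of-sample rows. The names `methods`, `mae`, `ranks`, and `average_ranks` are illustrative.

```python
from statistics import mean, pstdev

# Column order as in the table header.
methods = [
    "MFF", "YieldBERT", "YieldBERT-DA", "YieldMPNN", "From-Scratch",
    "MolCLR", "DGI", "ContextPred", "AttrMasking",
    "MolDescPred-MPNN", "MolDescPred",
]

# Mean MAE values copied from the Buchwald-Hartwig (Out-Of-Sample Split) rows.
mae = {
    "Test 1": [6.682, 7.351, 7.015, 8.082, 10.941, 6.358,
               7.955, 8.357, 6.609, 7.020, 5.980],
    "Test 2": [9.459, 7.266, 6.588, 6.300, 6.359, 6.412,
               7.649, 6.421, 5.997, 6.398, 5.469],
    "Test 3": [10.282, 9.129, 11.052, 8.986, 11.021, 11.154,
               10.240, 9.780, 10.106, 10.639, 8.340],
    "Test 4": [14.874, 13.671, 18.422, 13.190, 14.414, 13.231,
               16.719, 16.084, 13.910, 13.616, 13.870],
}


def ranks(values):
    """Return 1-based ranks, where the smallest value gets rank 1.

    Ties are not treated specially (none occur in the rows above).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out


def average_ranks(table):
    """Map each method to (mean rank, population std of rank) over splits."""
    per_split = [ranks(row) for row in table.values()]
    result = {}
    for j, name in enumerate(methods):
        column = [split_ranks[j] for split_ranks in per_split]
        result[name] = (mean(column), pstdev(column))
    return result


for name, (avg, std) in average_ranks(mae).items():
    print(f"{name}: {avg:.2f}±{std:.2f}")
# MolDescPred ranks 1, 1, 1, 5 over the four test sets, giving 2.00±1.73.
```

Using the sample standard deviation (`statistics.stdev`) instead would give a larger spread than the values reported in the table, which is why the population form is assumed here.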

ISSN: 1758-2946
