Table 4 Average mean squared error on the test set of the best-performing model for each representation and dataset

From: Extended study on atomic featurization in graph neural networks for molecular property prediction

| Representation | Rat \(\downarrow\) | Human \(\downarrow\) | QM9 \(\downarrow\) | ESOL (random) \(\downarrow\) | ESOL (scaffold) \(\downarrow\) |
|---|---|---|---|---|---|
| F | 0.182 | 0.218 | 9.193 | 0.118 | 0.166 |
| A | 0.214 | 0.246 | 26.369 | 0.159 | 0.235 |
| A + N | 0.188 | 0.225 | 46.386 | **0.113** | 0.242 |
| A + H | 0.196 | 0.248 | 41.047 | 0.131 | 0.215 |
| A + C | 0.215 | 0.246 | 52.825 | 0.174 | 0.229 |
| A + R | 0.194 | 0.235 | 89.794 | 0.115 | 0.237 |
| A + A | 0.203 | 0.241 | 27.365 | 0.187 | 0.212 |
| F-N | 0.200 | 0.220 | 39.243 | 0.190 | 0.189 |
| F-H | 0.183 | 0.220 | 60.035 | **0.113** | 0.202 |
| F-C | 0.180 | **0.213** | 9.698 | 0.123 | 0.201 |
| F-R | 0.181 | 0.223 | **8.278** | 0.119 | **0.185** |
| F-A | **0.178** | 0.216 | 23.786 | 0.120 | 0.221 |
| Tree-based baseline | 0.207 | 0.235 | 699.125 | 0.432 | 0.801 |
| XGBoost baseline | 0.216 | 0.233 | 803.153 | 0.483 | 0.452 |

  1. Two baselines based on ECFP fingerprints are included. The best results are in bold. The error variance is below 0.001 for all datasets except QM9 and is therefore not reported. Graph models perform better when trained with representations that include more features, and they usually outperform the baseline models trained on traditional fingerprints