From: Improving chemical reaction yield prediction using pre-trained graph neural networks
Column groups, left to right: previous studies, existing GNN pre-training methods, proposed method.

Dataset | Split | MFF [4] | YieldBERT [6] | YieldBERT-DA [7] | YieldMPNN [8] | From-Scratch | MolCLR [13] | DGI [18] | ContextPred [21] | AttrMasking [21] | MolDescPred-MPNN | MolDescPred |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Buchwald-Hartwig (Random Split) | 70/30 | 7.116±0.327 | 6.014±0.272 | 4.799±0.261 | 4.433±0.085 | 4.616±0.163 | 4.405±0.091 | 4.408±0.097 | 4.388±0.092 | 4.386±0.125 | 4.430±0.104 | 4.407±0.089 |
 | 50/50 | 8.051±0.322 | 7.288±0.198 | 5.877±0.348 | 5.387±0.202 | 6.088±0.982 | 5.279±0.167 | 5.364±0.222 | 5.327±0.183 | 5.328±0.216 | 5.326±0.231 | 5.263±0.181 |
 | 30/70 | 9.492±0.364 | 9.338±0.424 | 7.822±0.463 | 6.970±0.403 | 7.557±0.473 | 6.837±0.387 | 6.963±0.403 | 6.947±0.400 | 6.944±0.407 | 6.899±0.394 | 6.850±0.400 |
 | 20/80 | 10.487±0.259 | 10.306±0.303 | 9.164±0.668 | 8.204±0.372 | 9.317±0.713 | 8.040±0.399 | 8.271±0.498 | 8.175±0.333 | 8.268±0.398 | 8.093±0.365 | 8.043±0.426 |
 | 10/90 | 12.450±0.357 | 12.393±0.499 | 11.633±0.293 | 10.875±0.448 | 13.232±0.880 | 10.816±0.537 | 10.935±0.553 | 10.982±0.473 | 10.912±0.672 | 10.945±0.466 | 10.648±0.544 |
 | 5/95 | 14.994±0.593 | 16.740±0.950 | 14.073±0.687 | 14.041±0.492 | 18.188±2.789 | 13.873±0.485 | 14.068±0.728 | 13.911±0.601 | 14.250±0.537 | 13.542±0.681 | 13.117±0.792 |
 | 2.5/97.5 | 17.731±0.970 | 20.463±0.623 | 17.151±0.677 | 16.586±1.364 | 21.081±3.116 | 16.414±1.134 | 16.845±1.334 | 17.526±1.680 | 16.722±0.938 | 16.798±0.935 | 15.817±1.250 |
 | avg. rank | 10.29±0.88 | 9.86±0.35 | 8.00±0.76 | 5.29±1.67 | 9.57±1.29 | 2.00±0.76 | 5.86±0.64 | 4.86±1.88 | 4.57±1.99 | 4.00±1.51 | 1.71±1.03 |
Suzuki-Miyaura (Random Split) | 70/30 | 11.428±0.341 | 12.073±0.463 | 10.524±0.482 | 9.467±0.459 | 9.742±0.489 | 9.289±0.516 | 9.430±0.474 | 9.297±0.462 | 9.225±0.465 | 9.271±0.446 | 9.333±0.478 |
 | 50/50 | 12.208±0.169 | 13.148±0.270 | 11.797±0.250 | 10.225±0.135 | 10.691±0.171 | 10.155±0.142 | 10.222±0.191 | 10.091±0.164 | 10.156±0.183 | 10.097±0.157 | 10.133±0.164 |
 | 30/70 | 13.347±0.148 | 14.614±0.381 | 13.337±0.357 | 11.593±0.136 | 12.449±0.450 | 11.542±0.190 | 11.771±0.181 | 11.569±0.194 | 11.654±0.159 | 11.507±0.175 | 11.550±0.222 |
 | 20/80 | 14.347±0.335 | 15.966±0.381 | 14.851±0.576 | 12.734±0.347 | 14.404±0.902 | 12.736±0.322 | 13.051±0.351 | 12.837±0.363 | 12.911±0.345 | 12.650±0.324 | 12.717±0.225 |
 | 10/90 | 16.062±0.445 | 18.734±0.530 | 17.129±0.683 | 15.164±0.344 | 17.813±1.236 | 15.239±0.399 | 15.520±0.444 | 15.371±0.452 | 15.739±0.523 | 14.973±0.395 | 15.050±0.256 |
 | 5/95 | 17.927±0.484 | 21.181±0.724 | 20.016±0.661 | 18.511±0.392 | 20.665±1.823 | 18.982±1.000 | 18.332±0.421 | 18.487±0.431 | 19.430±0.760 | 17.720±0.466 | 17.891±0.351 |
 | 2.5/97.5 | 20.199±1.096 | 22.967±0.804 | 23.780±0.793 | 22.943±0.887 | 23.878±3.170 | 22.692±2.048 | 22.495±0.965 | 22.519±0.762 | 23.088±0.806 | 21.829±0.774 | 21.338±0.908 |
 | avg. rank | 7.14±3.40 | 10.57±1.05 | 9.29±0.45 | 5.43±1.68 | 9.14±1.12 | 4.29±1.58 | 5.71±1.16 | 4.14±1.36 | 6.00±2.39 | 1.57±0.73 | 2.71±1.03 |
Buchwald-Hartwig (Out-Of-Sample Split) | Test 1 | 9.369±0.151 | 11.441±0.342 | 11.761±1.398 | 13.746±1.175 | 16.956±1.913 | 9.559±0.871 | 13.484±0.636 | 13.398±1.480 | 10.219±0.646 | 11.343±0.346 | 9.320±0.376 |
 | Test 2 | 14.163±0.155 | 11.144±1.267 | 9.886±0.741 | 9.476±1.027 | 9.474±0.829 | 9.274±1.016 | 11.511±1.711 | 9.439±1.103 | 8.883±0.697 | 9.860±1.349 | 8.002±0.472 |
 | Test 3 | 16.629±0.141 | 14.276±0.820 | 18.041±1.395 | 14.939±0.622 | 17.471±1.777 | 17.681±0.757 | 17.053±0.429 | 16.404±1.127 | 16.608±0.310 | 16.659±0.616 | 13.726±0.814 |
 | Test 4 | 20.698±0.135 | 19.679±1.397 | 24.279±0.494 | 18.774±0.566 | 19.954±3.058 | 19.044±0.370 | 23.295±0.244 | 22.858±1.064 | 19.229±0.587 | 19.507±0.745 | 20.780±0.767 |
 | avg. rank | 6.50±3.20 | 5.50±2.50 | 9.25±1.79 | 5.00±3.39 | 7.75±2.38 | 4.50±3.20 | 9.25±0.83 | 6.25±2.28 | 3.50±1.12 | 5.75±1.30 | 2.75±3.03 |
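
The avg. rank rows can be recomputed from the mean values in the table. The following is a minimal sketch, not the authors' code. It assumes that lower values are better, that methods are ranked within each split on their mean value, and that the number after ± in the avg. rank row is the population standard deviation of those ranks. The names `OOS`, `ranks`, and `average_ranks` are hypothetical, and ties (none occur in these rows) are not averaged.

```python
from statistics import mean, pstdev

# Method order follows the table columns, left to right.
METHODS = ["MFF", "YieldBERT", "YieldBERT-DA", "YieldMPNN", "From-Scratch",
           "MolCLR", "DGI", "ContextPred", "AttrMasking",
           "MolDescPred-MPNN", "MolDescPred"]

# Mean values copied from the Buchwald-Hartwig out-of-sample rows of the table.
OOS = {
    "Test 1": [9.369, 11.441, 11.761, 13.746, 16.956, 9.559,
               13.484, 13.398, 10.219, 11.343, 9.320],
    "Test 2": [14.163, 11.144, 9.886, 9.476, 9.474, 9.274,
               11.511, 9.439, 8.883, 9.860, 8.002],
    "Test 3": [16.629, 14.276, 18.041, 14.939, 17.471, 17.681,
               17.053, 16.404, 16.608, 16.659, 13.726],
    "Test 4": [20.698, 19.679, 24.279, 18.774, 19.954, 19.044,
               23.295, 22.858, 19.229, 19.507, 20.780],
}


def ranks(values):
    """Return 1-based ranks, where the smallest value gets rank 1.

    Assumption: lower is better. Ties are broken by column order.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out


def average_ranks(table):
    """Map each method to (mean rank, population std of rank) across splits."""
    per_split = [ranks(row) for row in table.values()]
    per_method = zip(*per_split)  # transpose: one tuple of ranks per method
    return {m: (mean(r), pstdev(r)) for m, r in zip(METHODS, per_method)}


if __name__ == "__main__":
    for method, (avg, sd) in average_ranks(OOS).items():
        print(f"{method:>17}: {avg:.2f} ± {sd:.2f}")
```

Under these assumptions the sketch reproduces the printed out-of-sample avg. rank entries, for example 2.75 ± 3.03 for MolDescPred (ranks 1, 1, 1, 8) and 6.50 ± 3.20 for MFF (ranks 2, 11, 6, 7).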