Table 1 Results of re-ranking four one-step models on the USPTO-50K test dataset

From: Improving the performance of models for one-step retrosynthesis through re-ranking

| Model | Top-1 (%) | Top-3 (%) | Top-5 (%) | Top-10 (%) | Top-20 (%) | Top-50 (%) | Mean Reciprocal Rank |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RetroSim | 35.7 (\(\pm 0\)) | 53.3 (\(\pm 0\)) | 62.0 (\(\pm 0\)) | 73.4 (\(\pm 0\)) | 82.3 (\(\pm 0\)) | 88.5 (\(\pm 0\)) | 0.477 (\(\pm 0.000\)) |
| RetroSim + FF-EBM | 49.7 (\(\pm 0.34\)) | 72.3 (\(\pm 0.21\)) | 79.4 (\(\pm 0.15\)) | 85.5 (\(\pm 0.13\)) | 88.1 (\(\pm 0.07\)) | 88.9 (\(\pm 0.01\)) | 0.622 (\(\pm 0.002\)) |
| RetroSim + Graph-EBM | 51.8 (\(\pm 0.43\)) | 74.5 (\(\pm 0.37\)) | 81.1 (\(\pm 0.17\)) | 86.4 (\(\pm 0.13\)) | 88.5 (\(\pm 0.02\)) | 88.9 (\(\pm 0.00\)) | 0.644 (\(\pm 0.004\)) |
| NeuralSym | 45.7 (\(\pm 0.30\)) | 66.4 (\(\pm 0.40\)) | 73.5 (\(\pm 0.30\)) | 80.7 (\(\pm 0.21\)) | 85.3 (\(\pm 0.34\)) | 87.3 (\(\pm 0.32\)) | 0.578 (\(\pm 0.001\)) |
| NeuralSym + FF-EBM | 50.5 (\(\pm 0.21\)) | 71.8 (\(\pm 0.62\)) | 78.7 (\(\pm 0.18\)) | 84.5 (\(\pm 0.32\)) | 87.1 (\(\pm 0.29\)) | 87.5 (\(\pm 0.32\)) | 0.626 (\(\pm 0.003\)) |
| NeuralSym + Graph-EBM | 51.3 (\(\pm 0.52\)) | 73.6 (\(\pm 0.34\)) | 80.2 (\(\pm 0.35\)) | 85.4 (\(\pm 0.30\)) | 87.1 (\(\pm 0.27\)) | 87.5 (\(\pm 0.32\)) | 0.636 (\(\pm 0.004\)) |
| RetroXpert | 45.8 (\(\pm 0.25\)) | 59.2 (\(\pm 0.26\)) | 63.0 (\(\pm 0.57\)) | 66.9 (\(\pm 0.31\)) | 69.9 (\(\pm 0.62\)) | 73.0 (\(\pm 0.70\)) | 0.543 (\(\pm 0.004\)) |
| RetroXpert + FF-EBM | 42.7 (\(\pm 0.27\)) | 62.0 (\(\pm 0.21\)) | 67.6 (\(\pm 0.05\)) | 72.5 (\(\pm 0.08\)) | 75.6 (\(\pm 0.11\)) | 77.1 (\(\pm 0.20\)) | 0.536 (\(\pm 0.002\)) |
| RetroXpert + Graph-EBM | 36.7 (\(\pm 0.91\)) | 58.2 (\(\pm 1.06\)) | 65.8 (\(\pm 0.73\)) | 73.0 (\(\pm 0.32\)) | 75.9 (\(\pm 0.12\)) | 77.3 (\(\pm 0.21\)) | 0.491 (\(\pm 0.008\)) |
| GLN | 51.7 (\(\pm 0.33\)) | 67.8 (\(\pm 0.43\)) | 75.1 (\(\pm 0.32\)) | 83.2 (\(\pm 0.12\)) | 88.9 (\(\pm 0.11\)) | 92.4 (\(\pm 0.06\)) | 0.620 (\(\pm 0.003\)) |
| GLN + FF-EBM | 49.7 (\(\pm 0.77\)) | 72.4 (\(\pm 0.18\)) | 80.0 (\(\pm 0.28\)) | 87.0 (\(\pm 0.11\)) | 90.6 (\(\pm 0.12\)) | 93.0 (\(\pm 0.02\)) | 0.629 (\(\pm 0.005\)) |
| GLN + Graph-EBM | 52.3 (\(\pm 0.01\)) | 74.9 (\(\pm 0.27\)) | 82.0 (\(\pm 0.18\)) | 88.0 (\(\pm 0.02\)) | 91.4 (\(\pm 0.11\)) | 93.0 (\(\pm 0.08\)) | 0.652 (\(\pm 0.001\)) |
  1. Bolded values refer to the best top-N accuracy and the best MRR for that one-step model. Each value is the average of 3 experiments, in each of which both the proposer and the re-ranker are initialized with a different random seed; the standard deviation is given in parentheses. Note that RetroSim is a deterministic algorithm and is therefore reported with a standard deviation of 0.
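
To make the table's metrics concrete, the following minimal Python sketch shows one common way to compute top-N accuracy and mean reciprocal rank (MRR) from the rank of the ground-truth reactants in each ranked proposal list, and to summarize repeated runs as a mean and standard deviation. It is an illustration, not the authors' evaluation code. The function names and toy data are hypothetical, and two conventions are assumptions that the table does not specify: a ground truth missing from the proposals contributes a reciprocal rank of 0, and the standard deviation is the population form.

```python
from statistics import mean, pstdev
from typing import Optional, Sequence, Tuple


def top_n_accuracy(ranks: Sequence[Optional[int]], n: int) -> float:
    """Percentage of test reactions whose ground truth is ranked in the top n.

    Each entry of `ranks` is the 1-based rank of the ground truth, or None if
    the ground truth does not appear among the proposals at all.
    """
    hits = sum(1 for r in ranks if r is not None and r <= n)
    return 100.0 * hits / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Mean of 1/rank over all test reactions.

    Assumption: a missing ground truth (None) contributes 0.
    """
    return mean(1.0 / r if r is not None else 0.0 for r in ranks)


def summarize(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and standard deviation across repeated runs (e.g. 3 seeds).

    Assumption: population standard deviation; the sample form
    (statistics.stdev) is an equally plausible convention.
    """
    return mean(values), pstdev(values)


# Hypothetical toy data: ground-truth ranks for four test reactions.
toy_ranks = [1, 2, None, 5]

top1 = top_n_accuracy(toy_ranks, 1)
top5 = top_n_accuracy(toy_ranks, 5)
mrr = mean_reciprocal_rank(toy_ranks)
```

Under this sketch, a deterministic method evaluated several times yields identical metric values on every run, so `summarize` returns a standard deviation of 0, in line with the RetroSim row.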