Fig. 3
From: SESNet: sequence-structure feature-integrated deep learning method for data-efficient protein engineering

Key sites among the 20 sites with the largest attention scores on the wildtype sequence.

A and B: The key sites of GFP are marked as red spheres. A: Five key sites were recovered by our model. G65 and T201 are active residues that help form and stabilize the chromophore in GFP, as described in Ref [30]. P73, G230 and R71 are among the experimentally discovered top 20 sites, that is, the sites whose mutation causes the largest change in fitness. B: Three key sites were identified by the model when the structure module was removed. Y37 and L219 are among the experimentally discovered top 20 sites. Q181 is an active residue.

C and D: The key sites of RRM are marked as red spheres. C: Eleven key sites were recovered by the original model. N7, L28, S29, K31, A33, T34 and K39 are binding sites located within 5 Å of the RNA molecule. F4, L8, I12, I27, S29 and K31 are among the experimentally discovered top 20 sites, that is, the sites whose mutation causes the largest change in fitness. D: Three key sites were identified by the model when the structure module was removed. S29 is a binding site. A57 and I71 are among the experimentally discovered top 20 sites.
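The selection described in this caption can be sketched as follows: take the positions with the largest attention scores, then keep those that also appear in a reference set (binding or active residues, or the experimentally discovered top 20 sites). This is a minimal sketch under stated assumptions, not the authors' code. The function names `top_k_sites` and `recovered_key_sites` and the toy scores are hypothetical; only the RRM panel C residue numbers are taken from the caption.

```python
def top_k_sites(scores, k=20):
    """Return the 1-based positions of the k largest scores."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return {i + 1 for i in ranked[:k]}


def recovered_key_sites(top_sites, *reference_sets):
    """Keep the top-attention positions that occur in any reference set."""
    reference = set().union(*reference_sets)
    return top_sites & reference


# Toy data (hypothetical, for illustration only): 10 positions.
toy_scores = [0.1, 0.2, 0.9, 0.1, 0.3, 0.2, 0.8, 0.1, 0.7, 0.05]
toy_binding = {3, 4}
toy_experimental = {7, 10}

top3 = top_k_sites(toy_scores, k=3)
toy_key_sites = recovered_key_sites(top3, toy_binding, toy_experimental)

# Residue numbers listed in the caption for RRM, panel C.
panel_c_binding = {7, 28, 29, 31, 33, 34, 39}
panel_c_experimental = {4, 8, 12, 27, 29, 31}

# S29 and K31 appear in both lists, so the union counts them once.
panel_c_key_sites = panel_c_binding | panel_c_experimental
```

Taking the union of the two panel C lists gives 11 distinct residues, which matches the count stated in the caption, because S29 and K31 belong to both categories.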