Fig. 3 | Journal of Cheminformatics

Fig. 3

From: SESNet: sequence-structure feature-integrated deep learning method for data-efficient protein engineering

Key sites among the 20 sites with the largest attention scores on the wild-type sequence. A and B: Key sites of GFP, marked as red spheres. A: Five key sites were recovered by our model. G65 and T201 are active residues that help form and stabilize the chromophore in GFP, as described in Ref. [30]. P73, G230 and R71 are among the top 20 experimentally discovered sites, i.e. those whose mutation causes the largest change in fitness. B: Three key sites were identified by the model when the structure module was removed. Y37 and L219 are among the top 20 experimentally discovered amino acid sites. Q181 is an active residue. C and D: Key sites of RRM, marked as red spheres. C: Eleven key sites were recovered by the original model. N7, L28, S29, K31, A33, T34 and K39 are binding sites, located within 5 Å of the RNA molecules. F4, L8, I12, I27, S29 and K31 are among the top 20 experimentally discovered sites, i.e. those whose mutation causes the largest change in fitness. D: Three key sites were identified by the model when the structure module was removed. S29 is a binding site. A57 and I71 are among the top 20 experimentally discovered sites.
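
The selection described in the caption (take the 20 sites with the largest attention scores, then check which of them are known key sites) can be illustrated with a minimal sketch. This is not the authors' code: the function names `top_k_sites` and `recovered_key_sites` are hypothetical, and the scores and site labels below are toy values invented for illustration, not data from the paper.

```python
from typing import Dict, List, Set


def top_k_sites(scores: Dict[str, float], k: int = 20) -> List[str]:
    """Return the k site labels with the largest attention scores."""
    # Sort by descending score; ties are broken by label so the result is deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [site for site, _ in ranked[:k]]


def recovered_key_sites(
    scores: Dict[str, float], known_sites: Set[str], k: int = 20
) -> List[str]:
    """Return the top-k sites that also appear in a reference set of key sites.

    `known_sites` stands in for any reference set, e.g. active residues,
    binding sites, or experimentally determined high-impact sites.
    """
    return [site for site in top_k_sites(scores, k) if site in known_sites]


# Toy example (made-up labels and scores, NOT values from the paper).
toy_scores = {"A1": 0.10, "B2": 0.90, "C3": 0.40, "D4": 0.75, "E5": 0.05}
toy_known = {"B2", "C3", "E5"}

recovered = recovered_key_sites(toy_scores, toy_known, k=3)
print(recovered)
```

With per-site attention scores for a real wild-type sequence, `k=20` would mirror the cutoff used in the caption; how the attention scores themselves are aggregated per site is left open here.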
