
From: SESNet: sequence-structure feature-integrated deep learning method for data-efficient protein engineering

Fig. 1

Model architecture and schematic of the data-augmentation strategy. A: Architecture of SESNet. The local encoder accounts for inter-residue dependencies within a protein, learned from an MSA of homologous sequences using a Markov random field [27]. The global encoder captures sequence features across the global protein sequence universe using a protein language model [6]. The structure module accounts for the microscopic environmental features of each residue, learned from the 3D geometric structure of the protein [23, 28]. B: Schematic of the data-augmentation strategy. We first build a mutant library containing all single-site mutants and numerous double-site mutants. All of these mutated sequences are then scored by the unsupervised model. These scored mutants are used to pre-train the initial model (SESNet), which is further fine-tuned on a small number of low-order experimental mutational data.
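The data-augmentation workflow in panel B can be read as a simple pipeline: enumerate low-order mutants, pseudo-label them with the unsupervised model, pre-train on the pseudo-labels, then fine-tune on experimental data. The sketch below is a minimal illustration under assumed, hypothetical names (`build_mutant_library`, `score_unsupervised`, `model.pretrain`, `model.finetune`); it is not the authors' implementation.

```python
# Minimal sketch of the data-augmentation strategy (panel B).
# Function and method names here are hypothetical placeholders.
from itertools import combinations, product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mutate(seq: str, substitutions) -> str:
    """Apply (position, new_residue) substitutions to a sequence."""
    s = list(seq)
    for pos, aa in substitutions:
        s[pos] = aa
    return "".join(s)

def build_mutant_library(wt: str, max_doubles: int = 1000):
    """Enumerate all single-site mutants plus a capped set of double-site mutants."""
    singles = [
        mutate(wt, [(i, aa)])
        for i in range(len(wt))
        for aa in AMINO_ACIDS
        if aa != wt[i]
    ]
    doubles = []
    for i, j in combinations(range(len(wt)), 2):
        for aa_i, aa_j in product(AMINO_ACIDS, repeat=2):
            if aa_i != wt[i] and aa_j != wt[j]:
                doubles.append(mutate(wt, [(i, aa_i), (j, aa_j)]))
                if len(doubles) >= max_doubles:
                    return singles + doubles
    return singles + doubles

def score_unsupervised(seq: str) -> float:
    """Placeholder for the unsupervised model's fitness score
    (e.g. a sequence-model log-likelihood)."""
    return 0.0

def augment_and_train(wt: str, experimental_data, model):
    # 1. Build the mutant library (all singles plus many doubles).
    library = build_mutant_library(wt)
    # 2. Score every mutant with the unsupervised model.
    pseudo_labels = [(seq, score_unsupervised(seq)) for seq in library]
    # 3. Pre-train the supervised model (SESNet) on the pseudo-labelled mutants.
    model.pretrain(pseudo_labels)
    # 4. Fine-tune on the small set of low-order experimental measurements.
    model.finetune(experimental_data)
    return model
```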
