Table 2 List of BioProt computed features for protein sequences

From: BioTriangle: a web-accessible platform for generating various molecular representations for chemicals, proteins, DNAs/RNAs and their interactions

| Feature group | Features | Number of descriptors |
| --- | --- | --- |
| Amino acid composition | Amino acid composition | 20 |
| | Dipeptide composition | 400 |
| | Tripeptide composition | 8000 |
| Autocorrelation | Normalized Moreau–Broto autocorrelation | 240 (a) |
| | Moran autocorrelation | 240 (a) |
| | Geary autocorrelation | 240 (a) |
| CTD | Composition | 21 |
| | Transition | 21 |
| | Distribution | 105 |
| Conjoint triad | Conjoint triad features | 343 |
| Quasi-sequence order | Sequence order coupling number | 60 |
| | Quasi-sequence order descriptors | 100 |
| Pseudo amino acid composition | Pseudo amino acid composition | 50 (b) |
| | Amphiphilic pseudo amino acid composition | 50 (c) |
(a) The number depends on how many amino acid properties are chosen and on the maximum lag. The default is eight properties and lag = 30.
(b) The number depends on how many sets of amino acid properties are chosen and on the value of λ. The default is the three properties proposed by Chou et al. and λ = 30.
(c) The number depends on the value of λ. The default is λ = 15.
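
The descriptor counts in the table can be checked with a little arithmetic. The following Python sketch is illustrative only and is not BioTriangle's code: the function names are hypothetical, and the counting formulas (properties × lag for autocorrelation, 20 + λ for pseudo amino acid composition, 20 + 2λ for the amphiphilic variant with two properties) are assumptions chosen because they reproduce the default values given in footnotes a to c. It also shows one simple way to compute amino acid, dipeptide and tripeptide composition as k-mer frequencies over the 20 standard residues.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def kmer_composition(sequence, k):
    """Return the fraction of each possible k-mer in `sequence`.

    k=1 gives amino acid composition (20 values), k=2 dipeptide
    composition (400 values), k=3 tripeptide composition (8000 values).
    Hypothetical helper, not part of BioTriangle.
    """
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(max(len(sequence) - k + 1, 0)):
        kmer = sequence[i:i + k]
        if kmer in counts:  # skip windows with non-standard residues
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]


def autocorrelation_count(n_properties=8, max_lag=30):
    # Assumed formula: one descriptor per (property, lag) pair.
    return n_properties * max_lag


def pseaac_count(lam=30):
    # Assumed formula: 20 composition terms plus lambda sequence-order terms.
    return 20 + lam


def apaac_count(lam=15, n_properties=2):
    # Assumed formula: 20 composition terms plus lambda terms per property
    # (two properties assumed: hydrophobicity and hydrophilicity).
    return 20 + n_properties * lam


if __name__ == "__main__":
    seq = "ACDEFGHIKLMNPQRSTVWY"  # toy sequence, not real data
    for k in (1, 2, 3):
        print(k, len(kmer_composition(seq, k)))
    print(autocorrelation_count(), pseaac_count(), apaac_count())
```

With the default arguments, the three counting functions return 240, 50 and 50, matching the table. Changing `n_properties`, `max_lag` or `lam` shows how the footnoted counts would vary under these assumed formulas.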