  • Poster presentation
  • Open access

Predicting the protein localization sites using artificial neural networks

Chemoinformatics, the brainchild of Frank Brown [1], has evolved into a branch of science closely related to computer science, bioinformatics, and chemistry. Its major functions include, but are not limited to, chemical structure/property prediction, molecular similarity/diversity analysis, virtual screening, qualitative/quantitative structure-activity/property relationships, design of combinatorial libraries, statistical models, descriptors, drug discovery, representation of chemical compounds/reactions, classification/search/storage methods, management of compound databases, high-throughput docking, and data analysis methods. This paper deals with the prediction of protein localization sites using a neural network.

Neural networks [2] provide learning capability and are one of the important components of soft computing. A neural network consists of one input layer, one or more hidden layers, and an output layer. The number of neurons in the input layer equals the number of features passed to the network, and, for classification, the number of neurons in the output layer equals the number of classes. The number of hidden neurons is usually fixed by experts depending on the problem. Various types of neural network are available, such as feedforward networks, feedback networks, recurrent networks, self-organizing maps, and ANFIS.
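The layer sizing described above can be illustrated with a minimal sketch of a forward pass through a network with one hidden layer. This is not the authors' implementation: the tanh hidden activation, the softmax output, the random initial weights, the hidden-layer size of 10, and all function names are assumptions chosen for illustration. Only the input size (7 features) and output size (8 classes) follow the dataset described in this abstract.

```python
import math
import random

def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(x, layer):
    """Compute the weighted sum plus bias for every neuron in the layer."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def softmax(z):
    """Turn raw output scores into class probabilities that sum to 1."""
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, hidden_layer, output_layer):
    """Input -> tanh hidden layer -> softmax output layer."""
    h = [math.tanh(v) for v in dense(x, hidden_layer)]
    return softmax(dense(h, output_layer))

# Input neurons = number of features, output neurons = number of classes.
# N_HIDDEN is an illustrative assumption, not the value used in the paper.
N_FEATURES, N_HIDDEN, N_CLASSES = 7, 10, 8

rng = random.Random(0)
hidden_layer = init_layer(N_FEATURES, N_HIDDEN, rng)
output_layer = init_layer(N_HIDDEN, N_CLASSES, rng)

# A made-up feature vector; real inputs would be the 7 dataset attributes.
probs = forward([0.5] * N_FEATURES, hidden_layer, output_layer)
predicted_class = probs.index(max(probs))
```

The network here is untrained, so the predicted class is arbitrary; the sketch only shows how the feature count and class count fix the shapes of the first and last layers.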

In this paper, the E. coli protein dataset [3] is used for prediction. The dataset has 336 instances, each with 7 attributes, and 8 classes (localization sites). It can be obtained from the UCI Machine Learning Repository. A neural network with 500 hidden neurons, trained with the scaled conjugate gradient algorithm, is used in this work. The classification result for our method shown in Table 1 is the average over 4 cross-validation runs, and the results are promising.
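The averaging step can be sketched as follows, assuming that "4 cross validation" means 4-fold cross-validation, which the abstract does not state explicitly. The classifier is a deliberately trivial stand-in (it always predicts the most frequent training class) rather than the neural network used in this work, and the labels are synthetic placeholders, not the E. coli data. All function names are hypothetical.

```python
import random
from collections import Counter

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def majority_class_fit(labels):
    """Stand-in 'model': remember the most frequent training label."""
    return Counter(labels).most_common(1)[0][0]

def cross_validated_accuracy(labels, k=4, seed=0):
    """Hold out each fold once, score it, and average the k accuracies."""
    folds = k_fold_indices(len(labels), k, seed)
    accuracies = []
    for held_out in folds:
        held_out_set = set(held_out)
        train_labels = [y for i, y in enumerate(labels) if i not in held_out_set]
        prediction = majority_class_fit(train_labels)
        correct = sum(1 for i in held_out if labels[i] == prediction)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k, accuracies

# Synthetic labels: 336 instances over 8 classes, matching only the
# dataset's size and class count, not its actual contents.
rng = random.Random(1)
labels = [rng.randrange(8) for _ in range(336)]

mean_acc, fold_accs = cross_validated_accuracy(labels, k=4)
```

With 336 instances and 4 folds, each held-out fold contains 84 instances. Replacing `majority_class_fit` with a trained network would leave the splitting and averaging logic unchanged.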

Table 1 Classification rates.


  1. Brown FK: Chemoinformatics - What is it and How does it Impact Drug Discovery. Ann Rep Med Chem. 1998, 33: 375-384.

  2. Novic M, Vracko M: Nature-inspired methods in chemometrics: genetic algorithms and ANN. Edited by: Leardi R. Data Handling in Science and Technology. 2003, Elsevier, 23.

  3. Horton P, Nakai K: A Probabilistic Classification System for Predicting the Cellular Localization Sites of Proteins. Intell Syst Mol Biol. 1996, 109-115.

  4. Horton P, Nakai K: Better Prediction of protein cellular localization sites with the k nearest neighbours classifier. 1997, Proceedings of ISMB, 147-152.

Author information

Corresponding author

Correspondence to Rajesh Reghunadhan.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Arulmozhi, V., Reghunadhan, R. Predicting the protein localization sites using artificial neural networks. J Cheminform 5 (Suppl 1), P46 (2013).
