Predicting the protein localization sites using artificial neural networks
Journal of Cheminformatics volume 5, Article number: P46 (2013)
Chemoinformatics, the brainchild of Frank Brown, has evolved into a new branch of science closely related to computer science, bioinformatics, and chemistry. Its major functions include, but are not limited to, chemical structure and property prediction, molecular similarity and diversity analysis, virtual screening, qualitative and quantitative structure-activity/property relationships, design of combinatorial libraries, statistical models, descriptors, drug discovery, representation of chemical compounds and reactions, classification, search and storage methods, management of compound databases, high-throughput docking, and data analysis methods. This paper deals with predicting the localization sites of proteins using a neural network.
Neural networks provide learning capability and are one of the important components of soft computing. A neural network consists of one input layer, one or more hidden layers, and an output layer. The number of neurons in the input layer equals the number of features passed to the network and, for classification, the number of neurons in the output layer equals the number of classes. The number of hidden neurons is usually chosen by experts depending on the problem. Various types of neural network are available, such as feedforward networks, feedback networks, recurrent networks, self-organizing maps, and ANFIS.
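The layer sizing described above can be sketched as a single forward pass through a feedforward network. This is a minimal illustration, not the authors' implementation: the 7 inputs and 8 outputs match the E. coli dataset used in this paper, while the 10 hidden neurons, the tanh activation, the softmax output, and all function names are assumptions chosen for the example. The weights are random and untrained.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer):
    """Compute the weighted sum for each neuron in one layer."""
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Turn raw output scores into class probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, hidden, output):
    """Input layer -> one hidden layer (tanh, assumed) -> output layer (softmax, assumed)."""
    h = [math.tanh(v) for v in dense(x, hidden)]
    return softmax(dense(h, output))


# One input neuron per feature, one output neuron per class.
# N_HIDDEN = 10 is an illustrative choice, not a value from the paper.
N_FEATURES, N_HIDDEN, N_CLASSES = 7, 10, 8

rng = random.Random(0)
hidden = init_layer(N_FEATURES, N_HIDDEN, rng)
output = init_layer(N_HIDDEN, N_CLASSES, rng)

probs = forward([0.5] * N_FEATURES, hidden, output)  # placeholder input vector
predicted_class = probs.index(max(probs))            # index of the most probable class
```

The predicted class is simply the output neuron with the highest probability; training would adjust the weights so that this index matches the true localization site.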
In this paper, the E. coli protein dataset is used for prediction. The dataset has 336 instances, 7 attributes, and 8 classes (localization sites), and can be obtained from the UCI Machine Learning Repository. A neural network with 500 hidden neurons, trained with the scaled conjugate gradient algorithm, is used in this work. The classification result for our method, shown in Table 1, is the average over 4-fold cross-validation, and the results are promising.
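The averaging step can be sketched as follows, assuming it refers to standard 4-fold cross-validation: the 336 instances are split into 4 folds of 84, each fold serves once as the test set, and the 4 accuracies are averaged. The helper names are hypothetical, the data below is a synthetic stand-in with the dataset's shape (not the real E. coli data), and a majority-class placeholder stands in for the network; scaled conjugate gradient training is not implemented here.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(features, labels, k, train_fn, predict_fn):
    """Train on k-1 folds, test on the held-out fold, and return the mean accuracy."""
    folds = k_fold_indices(len(labels), k)
    accuracies = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = train_fn([features[j] for j in train_idx],
                         [labels[j] for j in train_idx])
        correct = sum(predict_fn(model, features[j]) == labels[j] for j in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k


# Placeholder classifier (assumption): always predicts the most common training label.
def train_majority(train_x, train_y):
    return Counter(train_y).most_common(1)[0][0]


def predict_majority(model, x):
    return model


# Synthetic stand-in data: 336 instances, 7 attributes, 8 class labels.
rng = random.Random(1)
X = [[rng.random() for _ in range(7)] for _ in range(336)]
y = [rng.randrange(8) for _ in range(336)]

mean_accuracy = cross_validate(X, y, 4, train_majority, predict_majority)
```

To evaluate a real classifier, replace `train_majority` and `predict_majority` with functions that fit and query the network, and load the actual dataset in place of `X` and `y`.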
Brown FK: Chemoinformatics - What is it and How does it Impact Drug Discovery. Ann Rep Med Chem. 1998, 33: 375-384.
Novic M, Vracko M: Nature-inspired methods in chemometrics: genetic algorithms and ANN. Edited by: Leardi R. Data Handling in Science and Technology, 23. 2003, Elsevier.
Horton P, Nakai K: A Probabilistic Classification System for Predicting the Cellular Localization Sites of Proteins. Intell Syst Mol Biol. 1996, 109-115.
Horton P, Nakai K: Better Prediction of protein cellular localization sites with the k nearest neighbours classifier. Proceedings of ISMB. 1997, 147-152.
Cite this article
Arulmozhi, V., Reghunadhan, R. Predicting the protein localization sites using artificial neural networks. J Cheminform 5 (Suppl 1), P46 (2013). https://doi.org/10.1186/1758-2946-5-S1-P46
- Neural Network
- Hidden Layer
- Output Layer
- Input Layer
- Localization Site