Abstract
Protein classification is a challenging problem in computational biology and bioinformatics. Our aim is to classify proteins into families using the similarity of their surface roughness as the criterion. Because Protein Data Bank (PDB) (http://www.rcsb.org/pdb/ [1]) coordinates give no indication of a protein’s orientation, we designed an invariant coordinate system (ICS) whose origin is the protein’s center of gravity (CG). We extracted the surface residue coordinates from the PDB and divided them into eight octants according to the signs of their x, y and z coordinates. For the residues in each octant, we computed the standard deviation of the coordinates and derived from it a parameter called the surface-invariant coordinate (SIC), so that every protein is described by eight SIC values. We also made use of the Structural Classification of Proteins (SCOP) (http://scop.mrc-lmb.cam.ac.uk/scop/ [2]) database, which classifies proteins on the basis of their surface structure. Because this is a classification problem, we used the naïve Bayes classifier to assign proteins to families, with the aim of achieving better results.
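The feature-extraction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: the center of gravity is taken as the unweighted mean of the surface coordinates, the three per-axis standard deviations in an octant are combined into one SIC value by their Euclidean norm, and an empty octant is given the value 0.0. The names `octant_index` and `surface_invariant_coordinates` are hypothetical. The sketch handles only the shift of the origin to the CG; any further alignment the ICS may involve is not shown.

```python
import math
from statistics import pstdev


def octant_index(x, y, z):
    """Map the signs of (x, y, z) to an octant number in 0..7."""
    return (4 if x >= 0 else 0) + (2 if y >= 0 else 0) + (1 if z >= 0 else 0)


def surface_invariant_coordinates(coords):
    """Return eight SIC values for a list of (x, y, z) surface coordinates.

    Assumptions (not stated in the abstract):
      - the center of gravity is the unweighted mean of the coordinates;
      - each octant's SIC is the Euclidean norm of its per-axis
        population standard deviations;
      - an octant with no residues gets 0.0.
    """
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n

    # Shift the origin to the center of gravity.
    centred = [(x - cx, y - cy, z - cz) for x, y, z in coords]

    # Group the centred coordinates by octant.
    buckets = [[] for _ in range(8)]
    for point in centred:
        buckets[octant_index(*point)].append(point)

    sic = []
    for points in buckets:
        if not points:
            sic.append(0.0)
            continue
        sx = pstdev([p[0] for p in points])
        sy = pstdev([p[1] for p in points])
        sz = pstdev([p[2] for p in points])
        sic.append(math.sqrt(sx * sx + sy * sy + sz * sz))
    return sic


# Toy input: four points placed symmetrically about the origin.
toy = [(1, 1, 1), (3, 3, 3), (-1, -1, -1), (-3, -3, -3)]
features = surface_invariant_coordinates(toy)
```

The resulting eight-value vector per protein is the kind of feature vector that could then be passed, together with SCOP family labels, to a naïve Bayes classifier.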
References
Connolly, M.L. 1986. Measurement of protein surface shape by solid angles. Journal of Molecular Graphics 4: 3–6.
Bandyopadhyay, S. 2005. An efficient technique for superfamily classification of amino acid sequences: Feature extraction, fuzzy clustering and prototype selection. Elsevier Journal of Fuzzy Sets and Systems 152: 5–16.
Vipsita, S., B.K. Shee, and S.K. Rath. 2010. An efficient technique for protein classification using feature extraction by artificial neural networks. In IEEE India conference: Green energy, computing and communication, INDICON.
Wang, D., and G.B. Huang. 2005. Protein sequence classification using extreme learning machine. In Proceedings of international joint conference on neural networks (IJCNN, 2005), Montreal, Canada.
Brink, Henrik, Joseph W. Richards, and Mark Fetherolf. Real-World Machine Learning. ISBN 9781617291920.
Datta, A., V. Talukdar, A. Konar, and L.C. Jain. 2009. A neural network based approach for protein structural class prediction. Journal of Intelligent and Fuzzy Systems 20: 61–71.
© 2019 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Satpute, B.S., Yadav, R. (2019). Classification of Proteins Using Naïve Bayes Classifier and Surface-Invariant Coordinates. In: Krishna, A., Srikantaiah, K., Naveena, C. (eds) Integrated Intelligent Computing, Communication and Security. Studies in Computational Intelligence, vol 771. Springer, Singapore. https://doi.org/10.1007/978-981-10-8797-4_12
DOI: https://doi.org/10.1007/978-981-10-8797-4_12
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8796-7
Online ISBN: 978-981-10-8797-4
eBook Packages: Intelligent Technologies and Robotics (R0)