Abstract
In protein secondary structure prediction, the numbers of helix (H), sheet (E), and coil (C) residues in a dataset differ considerably. A multi-class classifier based on an improved clustering support vector machine (C-SVM) algorithm is proposed to predict protein secondary structure from unbalanced training datasets. First, different weights are assigned to different sample classes to improve the classification accuracy of the traditional C-SVM on unbalanced samples. Second, the one-versus-one (OVO) multi-class strategy is used to build three binary classifiers: H/E, H/C, and E/C. The majority-voting rule is used to combine the results of the three binary classifiers. Finally, sevenfold cross-validation with a grid search is used to optimize the classifier parameters. Simulation results show that, compared with other prediction methods, the proposed classification method achieves better classification accuracy on unbalanced datasets.
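The class-weighting and voting steps described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the inverse-frequency weighting formula, the function names, the toy labels, and the tie-breaking rule (falling back to coil) are all assumptions made for illustration, since the abstract gives neither the weights actually used nor how a three-way tie is resolved.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting: n_samples / (n_classes * class_count).

    Rarer classes receive larger weights, so their errors cost more
    during training.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def majority_vote(votes, tie_break="C"):
    """Combine the votes of the H/E, H/C, and E/C classifiers for one residue.

    The tie_break label is a hypothetical choice; the source does not say
    how a cyclic tie (one vote per class) is resolved.
    """
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return tie_break
    return ranked[0][0]


# Toy, made-up label sequence: helix is over-represented.
labels = list("HHHHHHEEECCC")
weights = inverse_frequency_weights(labels)

# H/E says H, H/C says H, E/C says C: helix wins two votes to one.
prediction = majority_vote(["H", "H", "C"])

# H/E says H, H/C says C, E/C says E: every class gets one vote.
tie = majority_vote(["H", "C", "E"])
```

In this toy example the over-represented helix class receives a weight below 1 and the two rarer classes receive weights above 1, which is the general effect the abstract describes for unbalanced samples.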
Copyright information
© 2014 Springer India
About this paper
Cite this paper
Pei, A. (2014). Protein Secondary Structure Prediction Based on Improved C-SVM for Unbalanced Datasets. In: Patnaik, S., Li, X. (eds) Proceedings of International Conference on Soft Computing Techniques and Engineering Application. Advances in Intelligent Systems and Computing, vol 250. Springer, New Delhi. https://doi.org/10.1007/978-81-322-1695-7_58
DOI: https://doi.org/10.1007/978-81-322-1695-7_58
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-1694-0
Online ISBN: 978-81-322-1695-7
eBook Packages: Engineering, Engineering (R0)