Abstract
Human gesture recognition is attracting considerable attention as a natural user interface (NUI) that lets a person operate a device through gestures. This study designs an adaptive hidden Markov model (HMM)-based gesture recognition method with user adaptation (UA) to serve as the NUI of a humanoid robot, using the Kinect camera to simplify large-scale video processing. The Kinect camera acquires the gesture signals made by the active user; each gesture is then recognized and used as a control command that drives the humanoid robot to imitate the user's action. The Kinect camera reduces the large-scale video data because its output includes depth measurements, so the developed system analyzes, categorizes, and manages only the 3-axis coordinates of the joints in the human skeleton. With the presented scheme, the humanoid robot imitates the user's active gesture according to the content of the received gesture command. The well-known HMM pattern recognition method, supported by the Kinect device, is used to classify the user's active gestures. A user adaptation scheme, MAP+GoSSRT, which enhances MAP by incorporating group of states shifted by referenced transfer (GoSSRT), is proposed for adjusting the HMM parameters and further increasing the recognition accuracy. Gesture recognition experiments for controlling the humanoid robot were performed on 14 classes of human active gestures. The experimental results demonstrated the superiority of the NUI based on the presented HMM gesture recognition with user adaptation for humanoid robot imitation applications.
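The recognition step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one HMM per gesture class with diagonal-Gaussian emissions over a single joint's (x, y, z) coordinates, scores a sequence with the standard forward algorithm, and picks the class with the highest log-likelihood. The function `map_adapt_mean` shows only the textbook maximum a posteriori (MAP) mean update; the GoSSRT extension is not described in the abstract and is not reproduced here. All names, the two toy gestures, and the parameter values (including `tau`) are hypothetical.

```python
import math

NEG_INF = -math.inf

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    top = max(values)
    if top == NEG_INF:
        return NEG_INF
    return top + math.log(sum(math.exp(v - top) for v in values))

def forward_loglik(frames, hmm):
    """Log-likelihood of a frame sequence under one HMM (forward algorithm)."""
    n = len(hmm["means"])
    emit = lambda x, s: log_gauss(x, hmm["means"][s], hmm["vars"][s])
    alpha = [hmm["log_start"][s] + emit(frames[0], s) for s in range(n)]
    for x in frames[1:]:
        alpha = [logsumexp([alpha[p] + hmm["log_trans"][p][s] for p in range(n)])
                 + emit(x, s) for s in range(n)]
    return logsumexp(alpha)

def classify(frames, models):
    """Return the gesture label whose HMM gives the highest log-likelihood."""
    return max(models, key=lambda label: forward_loglik(frames, models[label]))

def map_adapt_mean(prior_mean, frames, tau=10.0):
    """Textbook MAP mean update: blend the prior mean with the user's frames.

    tau is an assumed prior weight; larger values trust the prior more.
    """
    n = len(frames)
    return [(tau * prior_mean[d] + sum(f[d] for f in frames)) / (tau + n)
            for d in range(len(prior_mean))]

def make_hmm(mean_first, mean_second):
    """Hypothetical 2-state left-to-right HMM over one joint's (x, y, z)."""
    half = math.log(0.5)
    return {
        "log_start": [0.0, NEG_INF],              # always start in state 0
        "log_trans": [[half, half], [NEG_INF, 0.0]],
        "means": [mean_first, mean_second],
        "vars": [[0.05, 0.05, 0.05], [0.05, 0.05, 0.05]],
    }

# Two toy gesture classes (the study uses 14); y is the hand height.
models = {
    "raise_hand": make_hmm([0.0, 0.0, 2.0], [0.0, 0.5, 2.0]),
    "lower_hand": make_hmm([0.0, 0.5, 2.0], [0.0, 0.0, 2.0]),
}

# Hand joint moving upward over four frames.
observed = [[0.0, 0.02, 2.0], [0.0, 0.1, 2.0], [0.0, 0.45, 2.0], [0.0, 0.5, 2.0]]
label = classify(observed, models)   # "raise_hand" for this toy sequence

# Shift a state mean toward two adaptation frames from a new user.
adapted = map_adapt_mean([0.0, 0.0, 2.0],
                         [[0.0, 0.2, 2.0], [0.0, 0.4, 2.0]], tau=2.0)
```

In this sketch the adapted mean is a weighted average of the prior mean and the new user's frames, so a few adaptation samples move the model only slightly; that is the general motivation for MAP-style user adaptation, independent of how the paper extends it.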
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2505-9/MediaObjects/11042_2015_2505_Fig1_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2505-9/MediaObjects/11042_2015_2505_Fig2_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2505-9/MediaObjects/11042_2015_2505_Fig3_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2505-9/MediaObjects/11042_2015_2505_Fig4_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2505-9/MediaObjects/11042_2015_2505_Fig5_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2505-9/MediaObjects/11042_2015_2505_Fig6_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2505-9/MediaObjects/11042_2015_2505_Fig7_HTML.gif)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs11042-015-2505-9/MediaObjects/11042_2015_2505_Fig8_HTML.gif)
Acknowledgments
This research is partially supported by the Ministry of Science and Technology (MOST) in Taiwan under Grant MOST 103-2218-E-150-004.
Cite this article
Ding, IJ., Chang, CW. An adaptive hidden Markov model-based gesture recognition approach using Kinect to simplify large-scale video data processing for humanoid robot imitation. Multimed Tools Appl 75, 15537–15551 (2016). https://doi.org/10.1007/s11042-015-2505-9
DOI: https://doi.org/10.1007/s11042-015-2505-9