An adaptive hidden Markov model-based gesture recognition approach using Kinect to simplify large-scale video data processing for humanoid robot imitation

Multimedia Tools and Applications

Abstract

Human gesture recognition, a new type of natural user interface (NUI) in which a person's gestures are used to operate a device, is attracting much attention. This study designs an adaptive hidden Markov model (HMM)-based gesture recognition method with user adaptation (UA) to serve as the NUI of a humanoid robot; the method uses the Kinect camera to simplify large-scale video processing. The Kinect camera acquires the gesture signals made by the active user, and each recognized gesture serves as a control command that drives the humanoid robot to imitate the user's actions. Because the Kinect data that represent gesture signals include depth measurements, the large-scale video data can be reduced: the developed system analyzes, categorizes, and manages only the simple 3-axis coordinates of the joints in the human skeleton. Under the presented scheme, the humanoid robot imitates the user's active gesture according to the received gesture command. The well-known HMM pattern recognition method, supported by the Kinect device, is used to classify the user's active gestures. A user adaptation scheme, MAP+GoSSRT, which enhances MAP by incorporating group of states shifted by referenced transfer (GoSSRT), is proposed for adjusting the HMM parameters to further increase recognition accuracy. Gesture recognition experiments for controlling the humanoid robot were performed on 14 classes of human active gestures. Experimental results demonstrated the superiority of the NUI based on the presented HMM gesture recognition with user adaptation for humanoid robot imitation applications.
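
The recognition step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one Gaussian-emission HMM per gesture class, scores a sequence of joint-coordinate features with the forward algorithm, and picks the best-scoring class. The model parameters, the gesture names, and the one-dimensional "hand height" feature are hypothetical. The `map_adapt_mean` function shows only the standard MAP mean update; the GoSSRT extension is not reproduced here because the abstract does not specify it.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    top = max(values)
    if top == float("-inf"):
        return top
    return top + math.log(sum(math.exp(v - top) for v in values))

def log_likelihood(seq, hmm):
    """Forward algorithm in log space: returns log P(seq | hmm)."""
    n = len(hmm["pi"])
    alpha = [math.log(hmm["pi"][j]) + log_gauss(seq[0], hmm["means"][j], hmm["vars"][j])
             for j in range(n)]
    for x in seq[1:]:
        alpha = [logsumexp([alpha[i] + math.log(hmm["A"][i][j]) for i in range(n)])
                 + log_gauss(x, hmm["means"][j], hmm["vars"][j])
                 for j in range(n)]
    return logsumexp(alpha)

def classify(seq, models):
    """Return the gesture label whose HMM gives the sequence the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(seq, models[name]))

def map_adapt_mean(prior_mean, frames, weights, tau=10.0):
    """Standard MAP mean update: (tau * prior + sum_t w_t x_t) / (tau + sum_t w_t).

    weights are the state occupancies of the adaptation frames (assumed given);
    tau controls how strongly the prior (speaker-independent) mean is trusted.
    """
    total = sum(weights)
    return [(tau * m + sum(w * x[d] for w, x in zip(weights, frames))) / (tau + total)
            for d, m in enumerate(prior_mean)]

# Hypothetical two-state models over a 1-D feature (normalized hand height).
MODELS = {
    "raise_hand": {"pi": [0.9, 0.1], "A": [[0.7, 0.3], [0.1, 0.9]],
                   "means": [[0.0], [1.0]], "vars": [[0.1], [0.1]]},
    "lower_hand": {"pi": [0.9, 0.1], "A": [[0.7, 0.3], [0.1, 0.9]],
                   "means": [[1.0], [0.0]], "vars": [[0.1], [0.1]]},
}

# A hand that starts low and ends high.
SAMPLE = [[0.1], [0.0], [0.9], [1.0]]
PREDICTED = classify(SAMPLE, MODELS)

# Adapt a state mean toward two frames from a new user.
ADAPTED = map_adapt_mean([1.0], [[2.0], [2.0]], [1.0, 1.0], tau=2.0)
```

In a real system the feature vector would hold the 3-axis coordinates of several skeleton joints per frame, and the occupancy weights passed to `map_adapt_mean` would come from a forward-backward pass over the user's adaptation data.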

Acknowledgments

This research is partially supported by the Ministry of Science and Technology (MOST) in Taiwan under Grant MOST 103-2218-E-150-004.

Author information

Corresponding author

Correspondence to Ing-Jr Ding.

About this article

Cite this article

Ding, IJ., Chang, CW. An adaptive hidden Markov model-based gesture recognition approach using Kinect to simplify large-scale video data processing for humanoid robot imitation. Multimed Tools Appl 75, 15537–15551 (2016). https://doi.org/10.1007/s11042-015-2505-9

  • DOI: https://doi.org/10.1007/s11042-015-2505-9
