The Challenges of Reinforcement Learning in Robotics and Optimal Control

  • Conference paper
  • In: Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2016 (AISI 2016)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 533)

Abstract

Reinforcement Learning (RL) is an emerging technology for designing control systems that find an optimal policy, through simulated or actual experience, according to a performance measure specified by the designer. This paper discusses a widely used RL algorithm, Q-learning, and how to apply it to robotics and optimal control systems, where several key challenges must be addressed for it to be useful. We discuss how the Q-learning algorithm can be adapted to work in continuous state and action spaces, methods for computing rewards that yield an adaptive optimal controller and accelerate the learning process, and finally approaches to safe exploration.
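The core algorithm the abstract refers to can be illustrated as tabular Q-learning with epsilon-greedy exploration. The sketch below is a minimal, generic version, not the paper's implementation; the environment interface (`reset`, `step`, `actions`) and the toy corridor task are illustrative assumptions.

```python
import random

def q_learning(env, episodes=300, alpha=0.1, gamma=0.95, epsilon=0.1):
    """Tabular Q-learning with epsilon-greedy exploration.

    The env interface (reset/step/actions) is an illustrative
    assumption, not an API defined in the paper.
    """
    Q = {}  # (state, action) -> value estimate; missing entries read as 0

    def greedy(s):
        return max(env.actions, key=lambda a: Q.get((s, a), 0.0))

    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # Epsilon-greedy: with small probability, explore a random action.
            a = random.choice(env.actions) if random.random() < epsilon else greedy(s)
            s2, r, done = env.step(a)
            # Q-learning target: reward plus discounted best next-state value.
            target = r if done else r + gamma * Q.get((s2, greedy(s2)), 0.0)
            Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (target - Q.get((s, a), 0.0))
            s = s2
    return Q

class Chain:
    """Toy 4-state corridor: moving right from state 0 reaches goal state 3."""
    actions = (0, 1)  # 0 = step left (bounded at 0), 1 = step right

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, a):
        self.pos = max(0, self.pos + (1 if a == 1 else -1))
        done = self.pos == 3
        return self.pos, (1.0 if done else 0.0), done

random.seed(0)
Q = q_learning(Chain())
# Greedy policy per non-goal state: move right everywhere once learned.
policy = [max((0, 1), key=lambda a: Q.get((s, a), 0.0)) for s in range(3)]
```

On a discrete task like this the table suffices; the continuous state and action spaces discussed in the paper require replacing the table with a function approximator.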

Author information

Corresponding author

Correspondence to Mohammed E. El-Telbany.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

El-Telbany, M.E. (2017). The Challenges of Reinforcement Learning in Robotics and Optimal Control. In: Hassanien, A., Shaalan, K., Gaber, T., Azar, A., Tolba, M. (eds) Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2016. AISI 2016. Advances in Intelligent Systems and Computing, vol 533. Springer, Cham. https://doi.org/10.1007/978-3-319-48308-5_84

  • Print ISBN: 978-3-319-48307-8

  • Online ISBN: 978-3-319-48308-5
