Abstract
In this paper, an improved RNH-QL method, based on an RBF network and heuristic Q-learning, is proposed for route searching in large state spaces. First, it addresses the inefficiency of reinforcement learning when the state space grows and no prior information about the environment is available. Second, with the RBF network serving as the weight-updating rule, reward shaping can give additional feedback to the agent in intermediate states, which helps guide the agent toward the goal state in a more controlled fashion. Meanwhile, through the Q-learning process the method accesses the underlying dynamic knowledge directly, rather than requiring background knowledge for an upper-level RBF network. Third, it improves learning efficiency by incorporating a greedy exploitation strategy to train the neural network, as demonstrated by the experimental results.
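The reward-shaping idea described above can be sketched in isolation. The following is a minimal illustration only, not the authors' RNH-QL algorithm: it uses tabular Q-learning on a toy gridworld with a potential-based shaping term (the potential function, grid size, and all helper names here are hypothetical choices for the example), where the shaping reward `gamma * phi(s') - phi(s)` gives the agent intermediate feedback on progress toward the goal without changing the optimal policy.

```python
import random

def potential(state, goal):
    """Potential function phi: negative Manhattan distance to the goal."""
    return -(abs(state[0] - goal[0]) + abs(state[1] - goal[1]))

def shaped_q_learning(size=5, goal=(4, 4), episodes=500,
                      alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a gridworld with potential-based reward shaping."""
    rng = random.Random(seed)
    moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    Q = {}

    def q(s, a):
        return Q.get((s, a), 0.0)

    for _ in range(episodes):
        s = (0, 0)
        for _ in range(100):
            if s == goal:
                break
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                a = rng.randrange(len(moves))
            else:
                a = max(range(len(moves)), key=lambda i: q(s, i))
            dx, dy = moves[a]
            s2 = (min(max(s[0] + dx, 0), size - 1),
                  min(max(s[1] + dy, 0), size - 1))
            # environment reward: small step cost, bonus at the goal
            r = 1.0 if s2 == goal else -0.01
            # shaping term F(s, s') = gamma * phi(s') - phi(s); potential-based
            # shaping of this form leaves the optimal policy unchanged
            r += gamma * potential(s2, goal) - potential(s, goal)
            best_next = max(q(s2, i) for i in range(len(moves)))
            Q[(s, a)] = q(s, a) + alpha * (r + gamma * best_next - q(s, a))
            s = s2
    return Q, moves

def greedy_path(Q, moves, start=(0, 0), goal=(4, 4), size=5, limit=20):
    """Follow the learned greedy policy and return the visited states."""
    path, s = [start], start
    for _ in range(limit):
        if s == goal:
            break
        a = max(range(len(moves)), key=lambda i: Q.get((s, i), 0.0))
        dx, dy = moves[a]
        s = (min(max(s[0] + dx, 0), size - 1),
             min(max(s[1] + dy, 0), size - 1))
        path.append(s)
    return path
```

In the full method the tabular Q function would be replaced by an RBF-network approximation, which is what lets the approach scale to the larger state spaces the paper targets.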
Acknowledgements
The work was supported by the National Natural Science Foundation of China (Grant Nos. 61372139, 61571372, 61672436), the Program for New Century Excellent Talents in University (Grant No. [2013]47), the Fundamental Research Funds for the Central Universities (Grant Nos. XDJK2016D008, XDJK2016A001, XDJK2014A009), and the Program for Excellent Talents in Scientific and Technological Activities for Overseas Scholars, Ministry of Personnel of China (Grant No. 2012-186).
Zhang, F., Duan, S. & Wang, L. Route searching based on neural networks and heuristic reinforcement learning. Cogn Neurodyn 11, 245–258 (2017). https://doi.org/10.1007/s11571-017-9423-7