Abstract
In this article, we investigate the performance of different learning approaches for a decentralised non-cooperative multi-agent system tasked with defending a high-value target from multiple aerial threats in an air defence application. We focus mainly on reinforcement learning (RL) techniques for protection against known, fully observable threats with high mobility. We implement two well-known algorithms representing two different approaches, regret matching (online learning) and Q-learning with artificial neural networks (offline learning), and compare them to assess their efficiency. Numerical experiments illustrate the performance of the different learning algorithms under various threat approach directions, as well as under collision avoidance with both static and moving obstacles. Finally, we discuss further improvements to these RL techniques.
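The offline approach compared in the paper is Q-learning with a neural-network function approximator. Its core is the temporal-difference update rule; the tabular sketch below illustrates that rule only (the function names, state encoding, and default step sizes are illustrative assumptions, not the authors' implementation):

```python
def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """One Q-learning temporal-difference step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).

    Q is a dict mapping state -> {action: value}.
    """
    best_next = max(Q[s_next].values())          # greedy value of successor state
    td_error = r + gamma * best_next - Q[s][a]   # temporal-difference error
    Q[s][a] += alpha * td_error
    return Q

# Hypothetical two-state example: reward 1.0 for moving from s0 to s1.
Q = {"s0": {"stay": 0.0, "move": 0.0},
     "s1": {"stay": 1.0, "move": 0.5}}
q_update(Q, "s0", "move", 1.0, "s1")
```

In the deep variant the paper refers to, the table `Q` is replaced by a neural network trained on the same target `r + gamma * max_a' Q(s', a')`.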
This work was supported by the Collaborative Project Agreement between DST Group and The University of Adelaide. The authors would like to thank Dr. Jijoong Kim for his valuable comments and suggestions to improve the quality of the paper.
Notes
- 1. The notion of unconditional regret involves an agent reasoning about replacing each action chosen by a fixed strategy (see [10] for more details and discussion).
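The regret-matching procedure of [10] plays each action with probability proportional to its positive cumulative unconditional regret. A minimal sketch follows; the function names and the payoff-vector interface are illustrative assumptions, not the authors' code:

```python
def regret_matching_policy(cum_regret):
    """Mixed strategy proportional to positive cumulative regret.

    If no action has positive regret, fall back to the uniform strategy.
    """
    pos = [max(r, 0.0) for r in cum_regret]
    total = sum(pos)
    n = len(cum_regret)
    if total <= 0:
        return [1.0 / n] * n
    return [p / total for p in pos]

def update_regrets(cum_regret, played, utilities):
    """Accumulate unconditional regret after one round.

    utilities[a] is the payoff the agent would have received had it
    played action a against the other agents' realised actions.
    """
    for a in range(len(cum_regret)):
        cum_regret[a] += utilities[a] - utilities[played]
    return cum_regret
```

For example, with cumulative regrets `[2.0, -1.0, 2.0]` the policy assigns probability 0.5 to each of the two positively regretted actions and 0 to the third.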
References
Tesauro, G., Jong, N.K., Das, R., Bennani, M.N.: A hybrid reinforcement learning approach to autonomic resource allocation. In: 2006 IEEE International Conference on Autonomic Computing, pp. 65–73 (2006)
Maskery, M., Krishnamurthy, V., O'Regan, C.: Decentralized algorithms for netcentric force protection against antiship missiles. IEEE Trans. Aerosp. Electron. Syst. 43(4), 1351–1372 (2007)
Shames, I., Dostovalova, A., Kim, J., Hmam, H.: Task allocation and motion control for threat-seduction decoys. In: 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp. 4509–4514 (2017)
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., Riedmiller, M.A.: Playing Atari with Deep Reinforcement Learning. CoRR (2013)
Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., Van Den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M.: Mastering the game of go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)
Duan, Y., Chen, X., Houthooft, R., Schulman, J., Abbeel, P.: Benchmarking deep reinforcement learning for continuous control. In: International Conference on Machine Learning, pp. 1329–1338 (2016)
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G.: Human-level control through deep reinforcement learning. Nature 518(7540), 529 (2015)
Shah, D., Xie, Q.: Q-learning with nearest neighbors. arXiv preprint arXiv:1802.03900 (2018)
Yanushevsky, R.: Modern Missile Guidance. CRC Press, Boca Raton (2007)
Hart, S., Mas-Colell, A.: A simple adaptive procedure leading to correlated equilibrium. Econometrica 68(5), 1127–1150 (2000)
Cybenko, G.: Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 2(4), 303–314 (1989)
de Souza e Silva, E., Ochoa, P.M.: State space exploration in Markov models. SIGMETRICS Perform. Eval. Rev. 20(1), 152–166 (1992)
Liu, S., Maljovec, D., Wang, B., Bremer, P.-T., Pascucci, V.: Visualizing high-dimensional data: advances in the past decade. IEEE Trans. Visual Comput. Graphics 23(3), 1249–1268 (2017)
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this paper
Nguyen, D.D., Rajagopalan, A., Lim, C.C. (2018). Online Versus Offline Reinforcement Learning for False Target Control Against Known Threat. In: Chen, Z., Mendes, A., Yan, Y., Chen, S. (eds) Intelligent Robotics and Applications. ICIRA 2018. Lecture Notes in Computer Science, vol 10985. Springer, Cham. https://doi.org/10.1007/978-3-319-97589-4_34
Print ISBN: 978-3-319-97588-7
Online ISBN: 978-3-319-97589-4