Abstract
In this article, we investigate the performance of different learning approaches for a decentralised non-cooperative multi-agent system tasked with defending a high-value target from multiple aerial threats in an air defence application. We focus mainly on reinforcement learning (RL) techniques for protection against known, fully observable threats with high mobility. We implement two well-known algorithms representing two different approaches, regret matching (online learning) and Q-learning with artificial neural networks (offline learning), and compare them to assess their efficiency. Numerical experiments illustrate the performance of the different learning algorithms under various threat approach directions, as well as under collision avoidance with both static and moving obstacles. Finally, we discuss further improvements to these RL techniques.
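The offline approach compared in the paper is Q-learning with a neural-network function approximator. Its core is the temporal-difference update rule; the tabular sketch below illustrates that rule only (the function names, state encoding, and default step sizes are illustrative assumptions, not the authors' implementation):

```python
def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """One Q-learning temporal-difference step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).

    Q is a dict mapping state -> {action: value}.
    """
    best_next = max(Q[s_next].values())          # greedy value of successor state
    td_error = r + gamma * best_next - Q[s][a]   # temporal-difference error
    Q[s][a] += alpha * td_error
    return Q

# Hypothetical two-state example: reward 1.0 for moving from s0 to s1.
Q = {"s0": {"stay": 0.0, "move": 0.0},
     "s1": {"stay": 1.0, "move": 0.5}}
q_update(Q, "s0", "move", 1.0, "s1")
```

In the deep variant the paper refers to, the table `Q` is replaced by a neural network trained on the same target `r + gamma * max_a' Q(s', a')`.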
This work was supported by the Collaborative Project Agreement between DST Group and The University of Adelaide. The authors would like to thank Dr. Jijoong Kim for his valuable comments and suggestions to improve the quality of the paper.
Notes
- 1. The notion of unconditional regret involves an agent reasoning about replacing each action chosen by a fixed strategy (see [10] for more details and discussion).
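The regret-matching procedure of [10] plays each action with probability proportional to its positive cumulative unconditional regret. A minimal sketch follows; the function names and the payoff-vector interface are illustrative assumptions, not the authors' code:

```python
def regret_matching_policy(cum_regret):
    """Mixed strategy proportional to positive cumulative regret.

    If no action has positive regret, fall back to the uniform strategy.
    """
    pos = [max(r, 0.0) for r in cum_regret]
    total = sum(pos)
    n = len(cum_regret)
    if total <= 0:
        return [1.0 / n] * n
    return [p / total for p in pos]

def update_regrets(cum_regret, played, utilities):
    """Accumulate unconditional regret after one round.

    utilities[a] is the payoff the agent would have received had it
    played action a against the other agents' realised actions.
    """
    for a in range(len(cum_regret)):
        cum_regret[a] += utilities[a] - utilities[played]
    return cum_regret
```

For example, with cumulative regrets `[2.0, -1.0, 2.0]` the policy assigns probability 0.5 to each of the two positively regretted actions and 0 to the third.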
References
Tesauro, G., Jong, N.K., Das, R., Bennani, M.N.: A hybrid reinforcement learning approach to autonomic resource allocation. In: 2006 IEEE International Conference on Autonomic Computing, pp. 65–73 (2006)
Maskery, M., Krishnamurthy, V., O'Regan, C.: Decentralized algorithms for netcentric force protection against antiship missiles. IEEE Trans. Aerosp. Electron. Syst. 43(4), 1351–1372 (2007)
Shames, I., Dostovalova, A., Kim, J., Hmam, H.: Task allocation and motion control for threat-seduction decoys. In: 2017 IEEE 56th Annual Conference on Decision and Control (CDC), pp. 4509–4514 (2017)
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., Riedmiller, M.A.: Playing Atari with Deep Reinforcement Learning. CoRR (2013)
Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., Van Den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M.: Mastering the game of go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)
Duan, Y., Chen, X., Houthooft, R., Schulman, J., Abbeel, P.: Benchmarking deep reinforcement learning for continuous control. In: International Conference on Machine Learning, pp. 1329–1338 (2016)
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G.: Human-level control through deep reinforcement learning. Nature 518(7540), 529 (2015)
Shah, D., Xie, Q.: Q-learning with nearest neighbors. arXiv preprint arXiv:1802.03900 (2018)
Yanushevsky, R.: Modern Missile Guidance. CRC Press, Boca Raton (2007)
Hart, S., Mas-Colell, A.: A simple adaptive procedure leading to correlated equilibrium. Econometrica 68(5), 1127–1150 (2000)
Cybenko, G.: Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 2(4), 303–314 (1989)
de Souza e Silva, E., Ochoa, P.M.: State space exploration in Markov models. SIGMETRICS Perform. Eval. Rev. 20(1), 152–166 (1992)
Liu, S., Maljovec, D., Wang, B., Bremer, P.-T., Pascucci, V.: Visualizing high-dimensional data: advances in the past decade. IEEE Trans. Visual Comput. Graphics 23(3), 1249–1268 (2017)
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this paper
Nguyen, D.D., Rajagopalan, A., Lim, C.C. (2018). Online Versus Offline Reinforcement Learning for False Target Control Against Known Threat. In: Chen, Z., Mendes, A., Yan, Y., Chen, S. (eds) Intelligent Robotics and Applications. ICIRA 2018. Lecture Notes in Computer Science, vol 10985. Springer, Cham. https://doi.org/10.1007/978-3-319-97589-4_34
Print ISBN: 978-3-319-97588-7
Online ISBN: 978-3-319-97589-4