A Multi-agent Deep Reinforcement Learning Method for UAVs Cooperative Pursuit Problem

Yang, Feng; Shao, Changshun; Shen, Baoyin; Li, Zhi

doi:10.1007/978-981-19-6613-2_699


Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 845)

Included in the following conference series:

International Conference on Guidance, Navigation and Control

Abstract

UAV swarms are emerging as an important form of intelligent warfare. This paper designs a solution for UAV cooperative pursuit scenarios based on the multi-agent deep deterministic policy gradient (MADDPG) algorithm. A clipped double Q-network and a delayed policy update mechanism are introduced to address the overestimation of the value function and the propagation of estimation errors in MADDPG. Because MADDPG follows the centralized-training, decentralized-execution paradigm and builds a separate evaluation function (critic) for each agent, the proposed method scales well and can be applied effectively to cooperative pursuit tasks.
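
The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It follows the general form of clipped double Q-learning and delayed policy updates as they are commonly described for actor-critic methods; the chapter's exact formulation is not available in this preview, so the function names, the discount factor `gamma`, and the `policy_delay` value are all illustrative assumptions, not the authors' implementation.

```python
def clipped_double_q_target(reward, done, q1_next, q2_next, gamma=0.95):
    """TD target for one agent, bootstrapped from the smaller of two
    target-critic estimates (hypothetical helper, assumed form)."""
    # Using the minimum of the two estimates counteracts the upward
    # bias that a single critic tends to accumulate.
    min_q_next = min(q1_next, q2_next)
    # Terminal transitions do not bootstrap from the next state.
    return reward + (0.0 if done else gamma * min_q_next)


def should_update_policy(critic_step, policy_delay=2):
    """Return True when the actor (and target networks) should be
    refreshed, i.e. once every `policy_delay` critic updates
    (hypothetical helper, assumed schedule)."""
    return critic_step % policy_delay == 0


def per_agent_targets(rewards, dones, q1_next, q2_next, gamma=0.95):
    """Apply the clipped target independently to each agent, mirroring
    the one-critic-per-agent structure described in the abstract."""
    return [
        clipped_double_q_target(r, d, a, b, gamma)
        for r, d, a, b in zip(rewards, dones, q1_next, q2_next)
    ]
```

In a training loop, each agent's critics would be regressed toward the value from `clipped_double_q_target` at every step, while the actor would be updated only on steps where `should_update_policy` returns `True`, so that the policy is trained against critic estimates that have had time to settle.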

Author information

Authors and Affiliations

School of Automation, Northwestern Polytechnical University, Xi’an, 710129, China
Feng Yang, Changshun Shao, Baoyin Shen & Zhi Li
Key Laboratory of Information Fusion Technology, Ministry of Education, Xi’an, 710129, China
Feng Yang, Changshun Shao & Zhi Li
The Fourteenth Research Institute of China Electronics Technology Group Corporation, Nanjing, 210013, China
Baoyin Shen

Corresponding author

Correspondence to Feng Yang.

Editor information

Editors and Affiliations

Ningbo Institute of Technology, Beihang University, Ningbo, China
Liang Yan
School of Automation Science and Electrical Engineering, Beihang University, Beijing, China
Haibin Duan
School of Automation Science and Electrical Engineering, Beihang University, Beijing, China
Yimin Deng

About this paper

Cite this paper

Yang, F., Shao, C., Shen, B., Li, Z. (2023). A Multi-agent Deep Reinforcement Learning Method for UAVs Cooperative Pursuit Problem. In: Yan, L., Duan, H., Deng, Y. (eds) Advances in Guidance, Navigation and Control. ICGNC 2022. Lecture Notes in Electrical Engineering, vol 845. Springer, Singapore. https://doi.org/10.1007/978-981-19-6613-2_699

DOI: https://doi.org/10.1007/978-981-19-6613-2_699
Published: 31 January 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-6612-5
Online ISBN: 978-981-19-6613-2
eBook Packages: Intelligent Technologies and Robotics (R0)