Optimal Strategy for Aircraft Pursuit-evasion Games via Self-play Iteration

Wang, Xin; Wei, Qing-Lai; Li, Tao; Zhang, Jie

doi:10.1007/s11633-022-1413-5

Research Article
Published: 12 January 2024

Volume 21, pages 585–596, (2024)

Machine Intelligence Research

Abstract

In this paper, the pursuit-evasion game with state and control constraints is solved with an iterative self-play technique to obtain the Nash equilibrium of both the pursuer and the evader. Under the condition that the Hamiltonian formed by means of Pontryagin’s maximum principle has a unique solution, it can be proven that the iterative control law converges to the Nash equilibrium solution. However, the strong nonlinearity of the ordinary differential equations formulated by Pontryagin’s maximum principle makes the control policy difficult to compute. Moreover, the system dynamics employed in this manuscript contain a high-dimensional state vector with constraints. In practical applications, such as the control of aircraft, the available overload is limited. Therefore, in this paper, we consider the optimal strategy of pursuit-evasion games with a constant constraint on the control, while some state vectors are restricted by a function of the input. To address these challenges, the optimal control problems are transformed into nonlinear programming problems through the direct collocation method. Finally, two numerical cases of the aircraft pursuit-evasion scenario are given to demonstrate the effectiveness of the presented method in obtaining the optimal control of both the pursuer and the evader.
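
The idea of a self-play iteration converging to a Nash equilibrium under bounded controls can be illustrated on a much smaller problem than the one treated in the paper. The following is a minimal sketch, not the authors' algorithm: it assumes a static scalar zero-sum game with a made-up quadratic payoff (the coefficients `A`, `B`, `C` and all function names are hypothetical), closed-form best responses, and a symmetric box constraint on each control. The paper instead solves a constrained optimal control problem for each player through direct collocation.

```python
# Toy zero-sum game (coefficients are illustrative assumptions, not from the paper):
#   J(u, v) = u^2 + A*u - v^2 + B*v + C*u*v
# The pursuer chooses u to minimise J; the evader chooses v to maximise J.
A, B, C = 1.0, 0.5, 0.8


def clip(value, low, high):
    """Project a scalar onto the interval [low, high]."""
    return max(low, min(high, value))


def payoff(u, v):
    return u * u + A * u - v * v + B * v + C * u * v


def pursuer_best_response(v, bound):
    # J is convex in u: solve dJ/du = 2u + A + C*v = 0, then project onto |u| <= bound.
    return clip(-(A + C * v) / 2.0, -bound, bound)


def evader_best_response(u, bound):
    # J is concave in v: solve dJ/dv = -2v + B + C*u = 0, then project onto |v| <= bound.
    return clip((B + C * u) / 2.0, -bound, bound)


def self_play(bound=1.0, u0=0.0, v0=0.0, tol=1e-12, max_iter=1000):
    """Alternate best responses until neither control changes by more than tol."""
    u, v = u0, v0
    for iteration in range(1, max_iter + 1):
        u_next = pursuer_best_response(v, bound)
        v_next = evader_best_response(u_next, bound)
        change = max(abs(u_next - u), abs(v_next - v))
        u, v = u_next, v_next
        if change < tol:
            break
    return u, v, iteration


if __name__ == "__main__":
    for bound in (1.0, 0.4):
        u, v, iterations = self_play(bound=bound)
        print(f"bound={bound}: u={u:.6f}, v={v:.6f}, iterations={iterations}")
```

With these assumed coefficients the composed best-response map is a contraction (factor C²/4 = 0.16), so the iteration settles at the unique saddle point u = -15/29, v = 5/116 when the bound is 1.0. With the tighter bound 0.4 the pursuer's control saturates at -0.4 and the evader's reply is 0.09, which shows how a control constraint shifts the equilibrium.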

Author information

Authors and Affiliations

State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
Xin Wang, Qing-Lai Wei, Tao Li & Jie Zhang
School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049, China
Xin Wang, Qing-Lai Wei, Tao Li & Jie Zhang
Institute of Systems Engineering, Macau University of Science and Technology, Macau, 999078, China
Qing-Lai Wei

Corresponding author

Correspondence to Jie Zhang.

Ethics declarations

The authors declare that they have no conflicts of interest related to this work.

Additional information

Colored figures are available in the online version at https://springer.longhoe.net/journal/11633

Xin Wang received the B.Sc. degree in electronic information engineering from Zhengzhou University, China in 2012, and the M.Sc. degree in control engineering from University of Science and Technology Beijing, China in 2015. He is currently a Ph.D. degree candidate in control theory and control engineering at State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, China, and University of Chinese Academy of Sciences, China.

His research interests include reinforcement learning, adaptive dynamic programming, optimal control and multi-agent systems.

Qing-Lai Wei received the B.Sc. degree in automation and the Ph.D. degree in control theory and control engineering from Northeastern University, China in 2002 and 2009, respectively. From 2009 to 2011, he was a postdoctoral fellow with State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, China. He is currently a professor of the institute and the associate director of the laboratory. He has authored four books and published over 80 international journal papers. He has been the Secretary of IEEE Computational Intelligence Society (CIS) Beijing Chapter since 2015. He has served as a guest editor for several international journals. He was a recipient of IEEE/CAA Journal of Automatica Sinica Best Paper Award, IEEE Systems, Man, and Cybernetics Society Andrew P. Sage Best Transactions Paper Award, IEEE Transactions on Neural Networks and Learning Systems Outstanding Paper Award, the Outstanding Paper Award of Acta Automatica Sinica, IEEE 6th Data Driven Control and Learning Systems Conference (DDCLS2017) Best Paper Award, and Zhang Siying Outstanding Paper Award of Chinese Control and Decision Conference (CCDC). He was also a recipient of Shuang-Chuang Talents in Jiangsu Province, China, Young Researcher Award of Asia Pacific Neural Network Society (APNNS), and Young Scientist Award and Yang Jiachi Tech Award of Chinese Association of Automation (CAA). He is a Board of Governors (BOG) member of the International Neural Network Society (INNS) and a council member of CAA.

His research interests include adaptive dynamic programming, neural-networks-based control, optimal control, nonlinear systems and their industrial applications.

Tao Li received the B.Sc. degree in automation from Northeastern University, China in 2019. He is currently a Ph.D. degree candidate in control theory and control engineering at State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, China.

His research interests include adaptive dynamic programming, reinforcement learning and approximate dynamic programming.

Jie Zhang received the B.Sc. degree in information and computing science from Tsinghua University, China in 2005, and the Ph.D. degree in technology of computer application from University of Chinese Academy of Sciences, China in 2015. He has been an associate professor with State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, China, since 2016.

His research interests include parallel control, mechanism design, optimal control and multiagent reinforcement learning.

About this article

Cite this article

Wang, X., Wei, QL., Li, T. et al. Optimal Strategy for Aircraft Pursuit-evasion Games via Self-play Iteration. Mach. Intell. Res. 21, 585–596 (2024). https://doi.org/10.1007/s11633-022-1413-5

Received: 07 August 2022
Accepted: 28 December 2022
Published: 12 January 2024
Issue Date: June 2024
DOI: https://doi.org/10.1007/s11633-022-1413-5
