Abstract
In response to the Multi-Domain Operation (MDO) concept proposed by the US military, our army has begun to focus on developing its joint operation capabilities, particularly the capability for multi-domain cooperative operation scheduling. Optimal scheduling reduces operation cost and time; however, finding such schedules in a large-scale and complex environment is difficult, and the search is prone to non-optimal and non-convergent behavior. Recent advances in Deep Reinforcement Learning (DRL) for learning complex behaviors open up new approach options. This paper presents an efficient DRL environment for multi-domain cooperative operation scheduling, in which the multi-domain operation scheduling problem is modeled as a DRL problem. In addition, we present an end-to-end DRL-based method that automatically learns to solve the multi-domain operation scheduling problem. Furthermore, we devise an action selection method that satisfies the task priority constraints. Finally, we provide a multi-domain operation instance, and the experiment demonstrates the effectiveness and applicability of the proposed environment and method in dealing with the multi-domain cooperative operation scheduling challenge.
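The abstract mentions an action selection method that enforces task priority (precedence) constraints. The paper's actual environment and network are not reproduced here, but the core idea can be sketched as action masking in a toy scheduling environment: at each step, only tasks whose predecessors have finished are offered as valid actions. All class and function names below (`Task`, `SchedulingEnv`, `action_mask`) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of priority-constrained action selection for a
# DRL scheduling environment; not the paper's actual implementation.
from dataclasses import dataclass, field
from typing import List, Set, Tuple

@dataclass
class Task:
    task_id: int
    duration: int
    predecessors: Set[int] = field(default_factory=set)

class SchedulingEnv:
    """Toy sequential scheduling environment with precedence constraints."""

    def __init__(self, tasks: List[Task]):
        self.tasks = {t.task_id: t for t in tasks}
        self.reset()

    def reset(self) -> Tuple:
        self.completed: Set[int] = set()
        self.time = 0
        return self._obs()

    def _obs(self) -> Tuple:
        # A minimal observation: which tasks are done, and the current time.
        return tuple(sorted(self.completed)), self.time

    def action_mask(self) -> List[int]:
        """Tasks that are schedulable now: not yet completed and with all
        predecessors finished (the task priority constraint)."""
        return [tid for tid, t in self.tasks.items()
                if tid not in self.completed and t.predecessors <= self.completed]

    def step(self, task_id: int):
        # A masked policy would only ever sample from action_mask().
        assert task_id in self.action_mask(), "priority constraint violated"
        self.time += self.tasks[task_id].duration
        self.completed.add(task_id)
        done = len(self.completed) == len(self.tasks)
        reward = -self.tasks[task_id].duration  # proxy for minimizing total time
        return self._obs(), reward, done

# Small instance: task 3 depends on 1 and 2, which both depend on 0.
tasks = [Task(0, 3), Task(1, 2, {0}), Task(2, 4, {0}), Task(3, 1, {1, 2})]
env = SchedulingEnv(tasks)
print(env.action_mask())  # only task 0 is schedulable at the start
```

In a full DRL setup (e.g. PPO, which the references point to), this mask would be applied to the policy's output logits so that invalid actions receive zero probability, guaranteeing every sampled schedule respects the precedence structure.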
Acknowledgments
Research for this paper was supported by the Equipment advance research project (50912020401), and the Aviation Science Foundation of China (201908052002). The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.
Copyright information
© 2022 Chinese Institute of Command and Control
Cite this paper
He, Z., Liu, H., Huang, K., Cheng, G. (2022). An Intelligent Scheduling Method for Multi-domain Cooperative Operation Based on Deep Reinforcement Learning. In: Proceedings of 2022 10th China Conference on Command and Control. C2 2022. Lecture Notes in Electrical Engineering, vol 949. Springer, Singapore. https://doi.org/10.1007/978-981-19-6052-9_47
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-6051-2
Online ISBN: 978-981-19-6052-9
eBook Packages: Intelligent Technologies and Robotics (R0)