Abstract
In recent years, task failures have become increasingly prevalent in cloud computing owing to factors such as the growing complexity of cloud environments, heterogeneity of resources, resource limitations and inadequate allocation. Task failure caused by inadequate allocation poses a significant challenge: when tasks are not allocated effectively, they may not complete within their deadlines, which ultimately leads to failure. Hence, effective allocation strategies combined with appropriate fault-tolerance measures are vital for mitigating the risk of task failures. This paper proposes a fault-tolerant task allocation algorithm (FTTA) for independent, deadline-constrained tasks that uses preemptive migration in heterogeneous cloud environments to reduce task failure. The proposed algorithm involves three phases: the first phase decides the priority of tasks in the ready list so as to minimize execution time and meet task deadlines; the second phase selects a suitable virtual machine with minimum execution time; and the last phase assigns the task to an available virtual machine, or to a currently unavailable one that may become available in the future, to find the best execution time within the deadline limit. During task allocation, the algorithm adopts a fault-tolerant strategy that includes preemptive migration where necessary, which allows tasks to be migrated to the most suitable virtual machine. An analysis of the proposed algorithm shows that its overall time complexity is \(O(n\log n + n m^2)\), where \(n\) is the number of tasks and \(m\) is the number of virtual machines. The performance of the algorithm is evaluated on task sets of different sizes (small to large) while varying the number of virtual machines. The experimental results demonstrate that FTTA outperforms the First Come First Served (FCFS), priority-based, Shortest Job First (SJF), Dynamic Maximum Minimum (Dy max min) and RADL algorithms in terms of number of rejected tasks, makespan, speedup and efficiency.
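The three phases described above can be illustrated with a minimal sketch. It is not the FTTA algorithm itself: the abstract does not give the priority rule, the VM model, or the migration mechanism, so the sketch assumes earliest-deadline-first ordering, a single hypothetical `speed` per virtual machine, and omits preemptive migration entirely. All names (`Task`, `VM`, `allocate`, `makespan`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    length: float    # amount of work (hypothetical unit)
    deadline: float  # absolute deadline in time units


@dataclass
class VM:
    name: str
    speed: float             # work units per time unit (assumed model)
    ready_time: float = 0.0  # time at which this VM next becomes free


def allocate(tasks, vms):
    """Greedy deadline-aware allocation; returns (schedule, rejected)."""
    schedule, rejected = {}, []
    # Phase 1: order the ready list. Earliest deadline first is an assumption.
    for task in sorted(tasks, key=lambda t: t.deadline):
        # Phase 2: rank VMs by the task's execution time on each of them.
        ranked = sorted(vms, key=lambda vm: task.length / vm.speed)
        # Phase 3: take the first VM, idle or busy, whose projected
        # finish time still meets the deadline.
        for vm in ranked:
            finish = vm.ready_time + task.length / vm.speed
            if finish <= task.deadline:
                vm.ready_time = finish
                schedule[task.name] = (vm.name, finish)
                break
        else:
            # No VM can meet the deadline; the task counts as rejected.
            rejected.append(task.name)
    return schedule, rejected


def makespan(schedule):
    """Latest finish time over all scheduled tasks (0.0 if none)."""
    return max((finish for _, finish in schedule.values()), default=0.0)
```

In this simplified form a task is rejected as soon as no virtual machine can finish it in time; the migration step described in the abstract is what would give such a task a further chance.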
About this article
Cite this article
Kirti, M., Maurya, A.K. & Yadav, R.S. Fault-tolerant allocation of deadline-constrained tasks through preemptive migration in heterogeneous cloud environments. Cluster Comput (2024). https://doi.org/10.1007/s10586-024-04538-9
DOI: https://doi.org/10.1007/s10586-024-04538-9