Improving fault tolerance in LinuX container-based distributed systems using blockchain

Farahmandian, Masoum; Foumani, Mehdi Farrokhbakht; Bayat, Peyman

doi:10.1007/s10586-024-04279-9

Published: 19 January 2024

Cluster Computing (2024)

Masoum Farahmandian¹,
Mehdi Farrokhbakht Foumani² &
Peyman Bayat¹

Abstract

With the growth of science and technology and the rising volume of important data and transactions, maintaining that data has become a major challenge. On the one hand, maintenance cost is a significant concern for organizations and companies; on the other hand, security and safety are critical, because software faults (especially Byzantine faults), hardware faults, and cyber-attacks threaten data, transactions, and system safety. Researchers are therefore seeking solutions that deliver the best service at the lowest cost under the pay-as-you-go model while preserving the security and integrity of data when a fault occurs. Replication is one of the most important techniques for increasing fault tolerance in distributed systems, but it is costly and has many other problems. This article uses blockchain technology to increase reliability and availability, reduce resource usage and cost, and improve fault tolerance, particularly against Byzantine faults; the authors report very good results compared to other methods.

Author information

Authors and Affiliations

Department of Computer Engineering, Rasht Branch, Islamic Azad University, Rasht, Iran
Masoum Farahmandian & Peyman Bayat
Department of Computer Engineering, Fouman and Shaft Branch, Islamic Azad University, Fouman, Iran
Mehdi Farrokhbakht Foumani

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Peyman Bayat, Mehdi Farrokhbakht Foumani and Masoum Farahmandian. The first draft of the manuscript was written by Masoum Farahmandian and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript. Idea of the article: Peyman Bayat. Literature search and data analysis: Masoum Farahmandian. Critical review of the work: Peyman Bayat, Mehdi Farrokhbakht Foumani. Draft of the work: Masoum Farahmandian.

Corresponding author

Correspondence to Mehdi Farrokhbakht Foumani.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Farahmandian, M., Foumani, M.F. & Bayat, P. Improving fault tolerance in LinuX container-based distributed systems using blockchain. Cluster Comput (2024). https://doi.org/10.1007/s10586-024-04279-9

Received: 26 October 2023
Revised: 10 December 2023
Accepted: 04 January 2024
Published: 19 January 2024
DOI: https://doi.org/10.1007/s10586-024-04279-9
