Search Page | SpringerLink

Resource allocation and aging priority-based scheduling of linear workflow applications with transient failures and selective imprecise computations

A wide range of applications in distributed environments have a linear structure, varying priorities, and may experience transient software failures....

Helen D. Karatza, Georgios L. Stavrinides in Cluster Computing

Article 31 January 2024

A Model of Actors and Grey Failures

Existing models for the analysis of concurrent processes tend to focus on fail-stop failures, where processes are either working or permanently...

Laura Bocchi, Julien Lange, ... A. Laura Voinea in Coordination Models and Languages

Conference paper 2022

Characterizing Memory Failures Using Benford’s Law

Fault tolerance is a key challenge as high performance computing systems continue to increase component counts, individual component reliability...

Kurt B. Ferreira, Scott Levy in Euro-Par 2021: Parallel Processing Workshops

Conference paper 2022

Interpreting the vulnerability of power systems in cascading failures using multi-graph convolutional networks

Analyzing the vulnerability of power systems in cascading failures is generally regarded as a challenging problem. Although existing studies can...

Supaporn Lonapalawong, Changsheng Chen, ... Wei Chen in Frontiers of Information Technology & Electronic Engineering

Article 20 June 2022

Disconnected Agreement in Networks Prone to Link Failures

We consider deterministic distributed algorithms for reaching agreement in synchronous networks of arbitrary topologies. Links are bi-directional and...

Bogdan S. Chlebus, Dariusz R. Kowalski, ... Jędrzej Olkowski in Stabilization, Safety, and Security of Distributed Systems

Conference paper 2023

Exploring the Impact of Node Failures on the Resource Allocation for Parallel Jobs

Increasing the size and complexity of modern HPC systems also increases the probability of various types of failures. Failures may disrupt...

Ioannis Vardas, Manolis Ploumidis, Manolis Marazakis in Euro-Par 2021: Parallel Processing Workshops

Conference paper 2022

Bayesian network model to distinguish between intentional attacks and accidental technical failures: a case study of floodgates

Water management infrastructures such as floodgates are critical and increasingly operated by Industrial Control Systems (ICS). These systems are...

Sabarathinam Chockalingam, Wolter Pieters, ... Pieter van Gelder in Cybersecurity

Article Open access 01 September 2021

Modeling and adaptive control for a spatial flexible spacecraft with unknown actuator failures

In this paper, we address simultaneous control of a flexible spacecraft’s attitude and vibrations in a three-dimensional space under input...

Zhijie Liu, Zhiji Han, ... Wei He in Science China Information Sciences

Article 08 April 2021

Power System Transient Stability Prediction in the Face of Cyber Attacks: Employing LSTM-AE to Combat Falsified PMU Data

Phasor measurement units (PMUs) are essential instruments in delivering real-time data crucial for monitoring the dynamics of power systems. They are...

Benyamin Jafari, Mehmet Akif Yazici in Dependable Computing – EDCC 2024 Workshops

Conference paper 2024

The Pathology of Failures in IoT Systems

The presence of faults is inevitable in the Internet of Things (IoT) systems. Dependability in these systems is challenging due to the increasing...

Mário Melo, Gibeon Aquino in Computational Science and Its Applications – ICCSA 2021

Conference paper 2021

Integrating request replication into FaaS platforms: an experimental evaluation

Function-as-a-Service (FaaS) is a popular programming model for building serverless applications, supported by all major cloud providers and many...

Yasmina Bouizem, Djawida Dib, ... Christine Morin in Journal of Cloud Computing

Article Open access 22 June 2023

Consensus in anonymous asynchronous systems with crash-recovery and omission failures

In anonymous distributed systems, processes are indistinguishable because they have no identity and execute the same algorithm. Currently, anonymous...

Ernesto Jiménez, José Luis López-Presa, Marta Patiño-Martínez in Computing

Article Open access 08 October 2021

Asynchronous Consensus in Synchronous Systems Using send_to_all Primitive

Consensus is a fundamental agreement problem that arises when a set of distributed processes has to decide on a common value among their respective...

Sathyanarayanan Srinivasan, Kandukoori Ramesh in SN Computer Science

Article 12 October 2023

Failures Forecast in Monitoring Datacenter Infrastructure Through Machine Learning Techniques: A Systematic Review

With the trend of accelerating digital transformation processes, datacenters (DC) are gaining prominence as increasingly critical components for...

Walter Lopes Neto, Itamir de Morais Barroca Filho in Computational Science and Its Applications – ICCSA 2021

Conference paper 2021

Transient Analysis of Hierarchical Semi-Markov Process Models with Tool Support in Stateflow

Semi-Markov process (SMP) models cannot always accurately model real-world systems. To help the situation, the paper proposes a hierarchical...

Stefan Kaalen, Mattias Nyberg, Olle Mattsson in Quantitative Evaluation of Systems

Conference paper 2021

Modelling of Software Failures

Software is crucial in the provision of communication services. Most functions related to control, management and operation are realized in software....

Bjarne E. Helvik, Petra Vizarreta, ... Carmen Mas-Machuca in Guide to Disaster-Resilient Communication Networks

Chapter 2020

μChaos: Moving Chaos Engineering to IoT Devices

The concept of the Internet of Things (IoT) has been widely used in many applications. IoT devices can be exposed to various external factors, such...

Wojciech Kalka, Tomasz Szydlo in Computational Science – ICCS 2024

Conference paper 2024

Simulation Experiments of a Distributed Fault Containment Algorithm Using Randomized Scheduler

Fault containment is a critical component of stabilizing distributed systems. A distributed system is termed stabilizing (or self-stabilizing) if it...

Anurag Dasgupta, David Tan, Koushik Majumder in Advanced Communication and Intelligent Systems

Conference paper 2023