-
Article
The evolution of distributed computing systems: from fundamental to new frontiers
Distributed systems have been an active field of research for over 60 years and have played a crucial role in computer science, enabling the invention of the Internet that underpins all facets of modern life. ...
-
Article
Tails in the cloud: a survey and taxonomy of straggler management within large-scale cloud data centres
Cloud computing systems are splitting compute- and data-intensive jobs into smaller tasks to execute them in a parallel manner using clusters to improve execution time. However, such systems at increasing scal...
-
Chapter and Conference Paper
Horus: An Interference-Aware Resource Manager for Deep Learning Systems
Deep Learning (DL) models are deployed as jobs within machines containing GPUs. These DL systems - ranging from a singular GPU device to machine clusters - require state-of-the-art resource management to incre...