  1. No Access

    Article

    The evolution of distributed computing systems: from fundamental to new frontiers

    Distributed systems have been an active field of research for over 60 years, and have played a crucial role in computer science, enabling the invention of the Internet that underpins all facets of modern life. ...

    Dominic Lindsay, Sukhpal Singh Gill, Daria Smirnova, Peter Garraghan in Computing (2021)

  2. No Access

    Article

    Tails in the cloud: a survey and taxonomy of straggler management within large-scale cloud data centres

    Cloud computing systems are splitting compute- and data-intensive jobs into smaller tasks to execute them in a parallel manner using clusters to improve execution time. However, such systems at increasing scal...

    Sukhpal Singh Gill, Xue Ouyang, Peter Garraghan in The Journal of Supercomputing (2020)

  3. No Access

    Chapter and Conference Paper

    Horus: An Interference-Aware Resource Manager for Deep Learning Systems

    Deep Learning (DL) models are deployed as jobs within machines containing GPUs. These DL systems - ranging from a singular GPU device to machine clusters - require state-of-the-art resource management to incre...

    Gingfung Yeung, Damian Borowiec, Renyu Yang in Algorithms and Architectures for Parallel … (2020)