  1. Chapter and Conference Paper

    Correction to: Compiler-Assisted Instrumentation Selection for Large-Scale C++ Codes

    Sebastian Kreutzer, Christian Iwainsky in High Performance Computing. ISC High Perfo… (2022)

  2. Chapter and Conference Paper

    Compiler-Assisted Instrumentation Selection for Large-Scale C++ Codes

    Code instrumentation is the primary method for collecting fine-grained performance data. As instrumentation introduces an inherent runtime overhead, it is essential to measure only those regions of the code wh...

    Sebastian Kreutzer, Christian Iwainsky in High Performance Computing. ISC High Perfo… (2022)

  3. Chapter and Conference Paper

    Automatic Low-Overhead Load-Imbalance Detection in MPI Applications

    Load imbalances are a major reason for efficiency loss in highly parallel applications. Hence, their identification is of high relevance in performance analysis and tuning. We present a low-overhead approach t...

    Peter Arzt, Yannic Fischler, Jan-Patrick Lehr in Euro-Par 2021: Parallel Processing (2021)

  4. Chapter and Conference Paper

    Compiler-Assisted Type-Safe Checkpointing

    TyCart is a tool for type-safe checkpoint/restart and extends the memory allocation sanitizer tool TypeART with type asserts. Type asserts let the developer specify type requirements on memory regions, and, in...

    Jan-Patrick Lehr, Alexander Hück, Moritz Fischer in High Performance Computing (2020)

  5. Chapter and Conference Paper

    Automatic Detection of MPI Assertions

    The 2019 MPI standard draft specification includes the addition of defined communicator info hints. These hints are assertions that an application makes to an MPI implementation, so that a more optimized imple...

    Tim Jammer, Christian Iwainsky, Christian Bischof in High Performance Computing (2020)

  6. Chapter and Conference Paper

    A Comparison of the Scalability of OpenMP Implementations

    OpenMP implementations must exploit current and upcoming hardware for performance. Overhead must be controlled and kept to a minimum to avoid low performance at scale. Previous work has shown that overheads do...

    Tim Jammer, Christian Iwainsky, Christian Bischof in Euro-Par 2020: Parallel Processing (2020)

  7. Article

    The influence of two modern compiler infrastructures on the energy consumption of the HPCG benchmark

    As energy consumption plays a more and more critical role in high-performance computing installations, investigating the influence of the different system components and their share w.r.t. energy consumption i...

    Armin Jäger, Jan-Patrick Lehr in SICS Software-Intensive Cyber-Physical Sys… (2019)

  8. Chapter and Conference Paper

    A Vectorized, Cache Efficient LLL Implementation

    This paper proposes a vectorized, cache efficient implementation of a floating-point version of the Lenstra-Lenstra-Lovász (LLL) algorithm, which is a key algorithm in many fields of computer science. We propo...

    Artur Mariano, Fábio Correia in High Performance Computing for Computation… (2017)

  9. Chapter and Conference Paper

    How Many Threads will be too Many? On the Scalability of OpenMP Implementations

    Exascale systems will exhibit much higher degrees of parallelism both in terms of the number of nodes and the number of cores per node. OpenMP is a widely used standard for exploiting parallelism on the level ...

    Christian Iwainsky, Sergei Shudler in Euro-Par 2015: Parallel Processing (2015)

  10. Chapter and Conference Paper

    A Comprehensive Empirical Comparison of Parallel ListSieve and GaussSieve

    The security of lattice-based cryptosystems is determined by the performance of practical implementations of, among others, algorithms for the Shortest Vector Problem (SVP).

    Artur Mariano, Özgür Dagdelen in Euro-Par 2014: Parallel Processing Worksho… (2014)

  11. Chapter and Conference Paper

    Catwalk: A Quick Development Path for Performance Models

    Many parallel applications suffer from latent performance limitations that may prevent them from scaling to larger machine sizes. Often, such scalability bugs manifest themselves only when an attempt to scale ...

    Felix Wolf, Christian Bischof in Euro-Par 2014: Parallel Processing Worksho… (2014)

  12. Article

    Brainware for green HPC

    The reduction of the infrastructural costs of HPC, in particular power consumption, currently is mainly driven by architectural advances in hardware. Recently, in the quest for the EFlop/s, hardware-software c...

    Christian Bischof, Dieter an Mey in Computer Science - Research and Development (2012)

  13. Article

    Simulation of bevel gear cutting with GPGPUs—performance and productivity

    The desire for general purpose computation on graphics processing units caused the advance of new programming paradigms, e.g. OpenCL C/C++, CUDA C or the PGI Accelerator Model. In this paper, we apply these pr...

    Sandra Wienke, Dmytro Plotnikov in Computer Science - Research and Development (2011)

  14. Chapter and Conference Paper

    Towards a Flexible and Distributed Simulation Platform

    This work is focused on bringing forth integrated simulation as a means to understand complex processes more thoroughly. Production processes that arise in the field of engineering sciences usually require exp...

    Philippe Cerfontaine, Thomas Beer in Computational Science and Its Applications… (2008)

  15. Chapter and Conference Paper

    Hybrid Parallelization of CFD Applications with Dynamic Thread Balancing

    SMP Clusters with fat nodes offer an interesting capability for large applications that employ a hybrid parallelization model: to improve load balance, the number of threads can be increased in order to speed-...

    Alexander Spiegel, Dieter an Mey in Applied Parallel Computing. State of the A… (2006)

  16. Chapter and Conference Paper

    Result-Verifying Solution of Nonlinear Systems in the Analysis of Chemical Processes

    A framework for the verified solution of nonlinear systems arising in the analysis and design of chemical processes is described. The framework combines a symbolic preprocessing step with an interval–based bra...

    Thomas Beelitz, Christian Bischof in Numerical Software with Result Verification (2004)