-
Chapter and Conference Paper
Correction to: Compiler-Assisted Instrumentation Selection for Large-Scale C++ Codes
-
Chapter and Conference Paper
Compiler-Assisted Instrumentation Selection for Large-Scale C++ Codes
Code instrumentation is the primary method for collecting fine-grained performance data. As instrumentation introduces an inherent runtime overhead, it is essential to measure only those regions of the code wh...
-
Chapter and Conference Paper
Automatic Low-Overhead Load-Imbalance Detection in MPI Applications
Load imbalances are a major reason for efficiency loss in highly parallel applications. Hence, their identification is of high relevance in performance analysis and tuning. We present a low-overhead approach t...
-
Chapter and Conference Paper
Compiler-Assisted Type-Safe Checkpointing
TyCart is a tool for type-safe checkpoint/restart and extends the memory allocation sanitizer tool TypeART with type asserts. Type asserts let the developer specify type requirements on memory regions, and, in...
-
Chapter and Conference Paper
Automatic Detection of MPI Assertions
The 2019 MPI standard draft specification includes the addition of defined communicator info hints. These hints are assertions that an application makes to an MPI implementation, so that a more optimized imple...
-
Chapter and Conference Paper
A Comparison of the Scalability of OpenMP Implementations
OpenMP implementations must exploit current and upcoming hardware for performance. Overhead must be controlled and kept to a minimum to avoid low performance at scale. Previous work has shown that overheads do...
-
Article
The influence of two modern compiler infrastructures on the energy consumption of the HPCG benchmark
As energy consumption plays a more and more critical role in high-performance computing installations, investigating the influence of the different system components and their share w.r.t. energy consumption i...
-
Chapter and Conference Paper
A Vectorized, Cache Efficient LLL Implementation
This paper proposes a vectorized, cache efficient implementation of a floating-point version of the Lenstra-Lenstra-Lovász (LLL) algorithm, which is a key algorithm in many fields of computer science. We propo...
-
Chapter and Conference Paper
How Many Threads will be too Many? On the Scalability of OpenMP Implementations
Exascale systems will exhibit much higher degrees of parallelism both in terms of the number of nodes and the number of cores per node. OpenMP is a widely used standard for exploiting parallelism on the level ...
-
Chapter and Conference Paper
A Comprehensive Empirical Comparison of Parallel ListSieve and GaussSieve
The security of lattice-based cryptosystems is determined by the performance of practical implementations of, among others, algorithms for the Shortest Vector Problem (SVP).
-
Chapter and Conference Paper
Catwalk: A Quick Development Path for Performance Models
Many parallel applications suffer from latent performance limitations that may prevent them from scaling to larger machine sizes. Often, such scalability bugs manifest themselves only when an attempt to scale ...
-
Article
Brainware for green HPC
The reduction of the infrastructural costs of HPC, in particular power consumption, currently is mainly driven by architectural advances in hardware. Recently, in the quest for the EFlop/s, hardware-software c...
-
Article
Simulation of bevel gear cutting with GPGPUs—performance and productivity
The desire for general purpose computation on graphics processing units caused the advance of new programming paradigms, e.g. OpenCL C/C++, CUDA C or the PGI Accelerator Model. In this paper, we apply these pr...
-
Chapter and Conference Paper
Towards a Flexible and Distributed Simulation Platform
This work is focused on bringing forth integrated simulation as a means to understand complex processes more thoroughly. Production processes that arise in the field of engineering sciences usually require exp...
-
Chapter and Conference Paper
Hybrid Parallelization of CFD Applications with Dynamic Thread Balancing
SMP Clusters with fat nodes offer an interesting capability for large applications that employ a hybrid parallelization model: to improve load balance, the number of threads can be increased in order to speed-...
-
Chapter and Conference Paper
Result-Verifying Solution of Nonlinear Systems in the Analysis of Chemical Processes
A framework for the verified solution of nonlinear systems arising in the analysis and design of chemical processes is described. The framework combines a symbolic preprocessing step with an interval-based bra...