  1. Chapter and Conference Paper

    Observed Memory Bandwidth and Power Usage on FPGA Platforms with OneAPI and Vitis HLS: A Comparison with GPUs

    The two largest barriers to adoption of FPGA platforms for HPC applications are the difficulty of programming FPGAs and the performance gap when compared to GPUs. To address the first barrier, new ecosystems l...

    Christopher M. Siefert, Stephen L. Olivier in High Performance Computing (2023)

  2. Chapter and Conference Paper

    The Kokkos OpenMPTarget Backend: Implementation and Lessons Learned

    As the supercomputing landscape diversifies, solutions such as Kokkos to write vendor agnostic applications and libraries have risen in popularity. Kokkos provides a programming model designed for performance ...

    Rahulkumar Gayatri, Stephen L. Olivier in OpenMP: Advanced Task-Based, Device and Co… (2023)

  3. Chapter and Conference Paper

    Characterizing the Performance of Task Reductions in OpenMP 5.X Implementations

    OpenMP 5.0 added support for reductions over explicit tasks. This expands the previous reduction support that was limited primarily to worksharing and parallel constructs. While the scope of a reduction operat...

    Jan Ciesko, Stephen L. Olivier in OpenMP in a Modern World: From Multi-devic… (2022)

  4. Chapter and Conference Paper

    ALAMO: Autonomous Lightweight Allocation, Management, and Optimization

    Several recent workshops conducted by the DOE Advanced Scientific Computing Research program have established the fact that the complexity of developing applications and executing them on high-performance comp...

    Ron Brightwell, Kurt B. Ferreira in Driving Scientific and Engineering Discove… (2020)

  5. Chapter and Conference Paper

    Evaluating the Efficiency of OpenMP Tasking for Unbalanced Computation on Diverse CPU Architectures

    In the decade since support for task parallelism was incorporated into OpenMP, its use has remained limited in part due to concerns about its performance and scalability. This paper revisits a study from the e...

    Stephen L. Olivier in OpenMP: Portable Multi-Level Parallelism on Modern Systems (2020)

  6. Chapter and Conference Paper

    Making OpenMP Ready for C++ Executors

    For at least the last 20 years, many have tried to create a general resource management system to support interoperability across various concurrent libraries. The previous strategies all suffered from additio...

    Thomas R. W. Scogland, Dan Sunderland in OpenMP: Conquering the Full Hardware Spect… (2019)

  7. Chapter and Conference Paper

    Cactus Environment Machine

    Existing machines for lazy evaluation use a flat representation of environments, storing the terms associated with free variables in an array. Combined with a heap, this structure supports the shared intermediate...

    George Stelle, Darko Stefanovic, Stephen L. Olivier in Trends in Functional Programming (2019)

  8. Chapter and Conference Paper

    Assessing Task-to-Data Affinity in the LLVM OpenMP Runtime

    In modern shared-memory NUMA systems which typically consist of two or more multi-core processor packages with local memory, affinity of data to computation is crucial for achieving high performance with an Op...

    Jannis Klinkenberg, Philipp Samfass in Evolving OpenMP for Evolving Architectures (2018)

  9. Book and Conference Proceedings

    Scaling OpenMP for Exascale Performance and Portability

    13th International Workshop on OpenMP, IWOMP 2017, Stony Brook, NY, USA, September 20–22, 2017, Proceedings

    Bronis R. de Supinski in Lecture Notes in Computer Science (2017)

  10. Chapter and Conference Paper

    Double Buffering for MCDRAM on Second Generation Intel® Xeon Phi™ Processors with OpenMP

    Emerging novel architectures for shared memory parallel computing are incorporating increasingly creative innovations to deliver higher memory performance. A notable exemplar of this phenomenon is the Multi-Ch...

    Stephen L. Olivier, Simon D. Hammond in Scaling OpenMP for Exascale Performance an… (2017)

  11. Chapter and Conference Paper

    Approaches for Task Affinity in OpenMP

    OpenMP tasking supports parallelization of irregular algorithms. Recent OpenMP specifications extended tasking to increase functionality and to support optimizations, for instance with the taskloop construct. How...

    Christian Terboven, Jonas Hahnfeld, Xavier Teruel in OpenMP: Memory, Devices, and Tasks (2016)

  12. Book and Conference Proceedings

    Using and Improving OpenMP for Devices, Tasks, and More

    10th International Workshop on OpenMP, IWOMP 2014, Salvador, Brazil, September 28–30, 2014, Proceedings

    Luiz DeRose, Bronis R. de Supinski in Lecture Notes in Computer Science (2014)

  13. Chapter and Conference Paper

    A Proposal for Task-Generating Loops in OpenMP*

    With the addition of the OpenMP* tasking model, programmers are able to improve and extend the parallelization opportunities of their codes. Programmers can also distribute the creation of tasks using a worksh...

    Xavier Teruel, Michael Klemm, Kelvin Li in OpenMP in the Era of Low Power Devices and… (2013)

  14. Article

    Comparison of OpenMP 3.0 and Other Task Parallel Frameworks on Unbalanced Task Graphs

    The UTS benchmark is used to evaluate the expression and performance of task parallelism in OpenMP 3.0 as implemented in a number of recently released compilers and run-time systems. UTS performs parallel sear...

    Stephen L. Olivier, Jan F. Prins in International Journal of Parallel Programming (2010)

  15. Chapter and Conference Paper

    Evaluating OpenMP 3.0 Run Time Systems on Unbalanced Task Graphs

    The UTS benchmark is used to evaluate task parallelism in OpenMP 3.0 as implemented in a number of recently released compilers and run-time systems. UTS performs parallel search of an irregular and unpredictab...

    Stephen L. Olivier, Jan F. Prins in Evolving OpenMP in an Age of Extreme Parallelism (2009)