  1. Article

    Open Access

    Design and performance evaluation of UCX for the Tofu Interconnect D on Fugaku towards efficient multithreaded communication

    The increasing trend of manycore processors makes multithreaded communication more important to avoid costly global synchronization among cores. One of the representative approaches that require multithreaded ...

    Yutaka Watanabe, Miwako Tsuji, Hitoshi Murai, Taisuke Boku in The Journal of Supercomputing (2024)

  2. No Access

    Chapter and Conference Paper

    Enhancing the Parallel UC2B Framework: Approach Validation and Scalability Study

    Anomaly detection is a critical aspect of uncovering unusual patterns in data analysis. This involves distinguishing between normal patterns and abnormal ones, which inherently involves uncertainty. This paper...

    Zineb Ziani, Nahid Emad, Miwako Tsuji, Mitsuhisa Sato in Computational Science – ICCS 2024 (2024)

  3. No Access

    Chapter and Conference Paper

    OpenACC Unified Programming Environment for Multi-hybrid Acceleration with GPU and FPGA

    Accelerated computing, such as with GPUs, plays a central role in HPC nowadays. However, some complicated applications with partially different performance behavior are hard to solve with a single type ...

    Taisuke Boku, Ryuta Tsunashima, Ryohei Kobayashi in High Performance Computing (2023)

  4. No Access

    Chapter and Conference Paper

    Scaling the PageRank Algorithm for Very Large Graphs on the Fugaku Supercomputer

    The PageRank algorithm is a widely used linear algebra method with many applications. As graphs with billions or more of nodes become increasingly common, being able to scale this algorithm on modern HPC arch...

    Maxence Vandromme, Jérôme Gurhem, Miwako Tsuji in Computational Science – ICCS 2022 (2022)

  5. Article

    Open Access

    A new sustained system performance metric for scientific performance evaluation

    Because of the increasing complexities of systems and applications, the performance of many traditional HPC benchmarks, such as HPL or HPCG, no longer correlates strongly with the actual performance of real ap...

    Miwako Tsuji, William T. C. Kramer, Jean-Christophe Weill in The Journal of Supercomputing (2021)

  6. No Access

    Article

    Performance and power consumption analysis of Arm Scalable Vector Extension

    Modern CPUs not only have multiple cores but also support wide single instruction multiple data (SIMD). This trend is expected to grow in the future. In this paper, we examine the effect of the vector length ...

    Tetsuya Odajima, Yuetsu Kodama, Mitsuhisa Sato in The Journal of Supercomputing (2021)

  7. Chapter and Conference Paper

    Correction to: Performance of the Supercomputer Fugaku for Breadth-First Search in Graph500 Benchmark

    Masahiro Nakao, Koji Ueno, Katsuki Fujisawa, Yuetsu Kodama in High Performance Computing (2021)

  8. Chapter

    Multi-SPMD Programming Model with YML and XcalableMP

    This chapter describes a multi-SPMD (mSPMD) programming model and a set of software and libraries to support the mSPMD programming model. The mSPMD programming model has been proposed to realize scalable appli...

    Miwako Tsuji, Hitoshi Murai, Taisuke Boku in XcalableMP PGAS Programming Language (2021)

  9. Chapter

    XcalableMP 2.0 and Future Directions

    This chapter presents the XcalableMP on the Fugaku supercomputer, the Japanese flagship supercomputer developed by FLAGSHIP2020 project in RIKEN R-CCS. The porting and the performance evaluation were done as a...

    Mitsuhisa Sato, Hitoshi Murai, Masahiro Nakao in XcalableMP PGAS Programming Language (2021)

  10. Chapter

    XcalableMP Programming Model and Language

    XcalableMP (XMP) is a directive-based language extension of Fortran and C for distributed-memory parallel computers, and can be classified as a partitioned global address space (PGAS) language. One of the rema...

    Hitoshi Murai, Masahiro Nakao, Mitsuhisa Sato in XcalableMP PGAS Programming Language (2021)

  11. Chapter

    Hybrid-View Programming of Nuclear Fusion Simulation Code in XcalableMP

    XcalableMP(XMP) supports a global-view model that allows programmers to define global data and to map them to a set of processors, which execute the distributed global data as a single thread. In XMP, the conc...

    Keisuke Tsugane, Taisuke Boku, Hitoshi Murai in XcalableMP PGAS Programming Language (2021)

  12. No Access

    Chapter and Conference Paper

    Performance of the Supercomputer Fugaku for Breadth-First Search in Graph500 Benchmark

    In this paper, we present the performance of the supercomputer Fugaku for breadth-first search (BFS) problem in the Graph500 benchmark, which is known as a ranking benchmark used to evaluate large-scale graph ...

    Masahiro Nakao, Koji Ueno, Katsuki Fujisawa, Yuetsu Kodama in High Performance Computing (2021)

  13. No Access

    Article

    InKS: a programming model to decouple algorithm from optimization in HPC codes

    Existing programming models tend to tightly interleave algorithm and optimization in HPC simulation codes. This requires scientists to become experts in both the simulated domain and the optimization process a...

    Ksander Ejjaaouani, Olivier Aumage, Julien Bigot in The Journal of Supercomputing (2020)

  14. No Access

    Book

    Advanced Software Technologies for Post-Peta Scale Computing

    The Japanese Post-Peta CREST Research Project

    (2019)

  15. No Access

    Chapter and Conference Paper

    OpenMP Task Generation for Batched Kernel APIs

    The demand for calculating many small computation kernels is getting significantly important in the HPC area not only for the traditional numerical applications but also recent machine learning applications. W...

    Jinpil Lee, Yutaka Watanabe in OpenMP: Conquering the Full Hardware Spectrum (2019)

  16. No Access

    Chapter

    JST CREST Post-petascale Software Project Bridging to Exascale Computing

    JST CREST post-petascale software project aimed to establish software technologies to explore extreme performance computing beyond petascale computing, on the road to exascale computing. Several research and d...

    Mitsuhisa Sato in Advanced Software Technologies for Post-Peta Scale Computing (2019)

  17. No Access

    Chapter

    SCore

    SCore is a   package for high-   performance clusters. It includes a low-   communication layer named PM(v2), a user-level, global operating system called SCore-D, an MPI   , an  compiler that enables  prog...

    Atsushi Hori, Hiroshi Tezuka in Operating Systems for Supercomputers and H… (2019)

  18. Chapter and Conference Paper

    InKS, a Programming Model to Decouple Performance from Algorithm in HPC Codes

    Existing programming models tend to tightly interleave algorithm and optimization in HPC simulation codes. This requires scientists to become experts in both the simulated domain and the optimization process a...

    Ksander Ejjaaouani, Olivier Aumage in Euro-Par 2018: Parallel Processing Worksho… (2019)

  19. No Access

    Chapter and Conference Paper

    Trade-Off of Offloading to FPGA in OpenMP Task-Based Programming

    In High-Performance Computing (HPC), Field Programmable Gate Array (FPGA) is attracting increased attention as an accelerator because its performance has been dramatically improved in recent years. On the othe...

    Yutaka Watanabe, Jinpil Lee, Taisuke Boku in Evolving OpenMP for Evolving Architectures (2018)

  20. No Access

    Chapter and Conference Paper

    The Impact of Taskyield on the Design of Tasks Communicating Through MPI

    The OpenMP tasking directives promise to help expose a higher degree of concurrency to the runtime than traditional worksharing constructs, which is especially useful for irregular applications. In combination...

    Joseph Schuchart, Keisuke Tsugane in Evolving OpenMP for Evolving Architectures (2018)
