  1. No Access

    Chapter and Conference Paper

    File I/O Cache Performance of Supercomputer Fugaku Using an Out-of-Core Direct Numerical Simulation Code of Turbulence

    Turbulent flows play important roles in many flow-related phenomena that appear in various fields. However, despite numerous studies on turbulence, the nature of turbulence has not yet been fully clarified. Di...

    Yuto Hatanaka, Yuki Yamane, Kenta Yamaguchi in Computational Science – ICCS 2024 (2024)

  2. No Access

    Chapter and Conference Paper

    Analysis of Precision Vectors for Ising-Based Linear Regression

    Quantum computing has attracted much attention as one of the new computational principles. In particular, annealing machines that use the Ising model of statistical mechanics are emerging and feasible next-generati...

    Kaho Aoyama, Kazuhiko Komatsu in Parallel and Distributed Computing, Applic… (2023)

  3. No Access

    Chapter and Conference Paper

    A Partitioned Memory Architecture with Prefetching for Efficient Video Encoders

    A hardware video encoder based on recent video coding standards such as HEVC and VVC needs to efficiently handle a massive number of memory accesses to search motion vectors. To this end, first, this paper pre...

    Masayuki Sato, Yuya Omori, Ryusuke Egawa in Parallel and Distributed Computing, Applic… (2023)

  4. No Access

    Article

    VGL: a high-performance graph processing framework for the NEC SX-Aurora TSUBASA vector architecture

    Developing efficient graph algorithms implementations is an extremely important problem of modern computer science, since graphs are frequently used in various real-world applications. Graph algorithms typical...

    Ilya V. Afanasyev, Vladimir V. Voevodin, Kazuhiko Komatsu in The Journal of Supercomputing (2021)

  5. No Access

    Chapter and Conference Paper

    A Deep Reinforcement Learning Based Feature Selector

    In the field of data mining and machine learning, it is a challenge for researchers and engineers to analyze and classify the high-dimensional data. In order to minimize the classification error, it is critica...

    Yiran Cheng, Kazuhiko Komatsu, Masayuki Sato in Parallel Architectures, Algorithms and Pro… (2021)

  6. No Access

    Chapter and Conference Paper

    A Dynamic Parameter Tuning Method for High Performance SpMM

    Sparse matrix-matrix multiplication (SpMM) is a basic kernel that is used by many algorithms. Several researches focus on various optimizations for SpMM parallel execution. However, a division of a task for pa...

    Bin Qi, Kazuhiko Komatsu, Masayuki Sato in Parallel and Distributed Computing, Applic… (2021)

  7. No Access

    Chapter and Conference Paper

    Optimization of the Himeno Benchmark for SX-Aurora TSUBASA

    This paper focuses on optimizing the Himeno benchmark for the vector computing system SX-Aurora TSUBASA and analyzes its performance in detail. The Vector Engine (VE) of SX-Aurora TSUBASA achieves a high memor...

    Akito Onodera, Kazuhiko Komatsu, Soya Fujimoto in Benchmarking, Measuring, and Optimizing (2021)

  8. No Access

    Chapter and Conference Paper

    Developing an Efficient Vector-Friendly Implementation of the Breadth-First Search Algorithm for NEC SX-Aurora TSUBASA

    Breadth-First Search (BFS) is an important computational kernel used as a building-block for many other graph algorithms. Different algorithms and implementation approaches aimed to solve the BFS problem have ...

    Ilya V. Afanasyev, Vladimir V. Voevodin in Parallel Computational Technologies (2020)

  9. Chapter and Conference Paper

    Performance Evaluation of Tsunami Inundation Simulation on SX-Aurora TSUBASA

    As tsunamis may cause damage in wide area, it is difficult to imme...

    Akihiro Musa, Takashi Abe, Takumi Kishitani in Computational Science – ICCS 2019 (2019)

  10. No Access

    Chapter and Conference Paper

    Analysis of Relationship Between SIMD-Processing Features Used in NVIDIA GPUs and NEC SX-Aurora TSUBASA Vector Processors

    This paper presents comprehensive analysis of main SIMD-processing features and computational characteristics of three high performance architectures: two NVIDIA GPU architectures (of Pascal and Volta generati...

    Ilya V. Afanasyev, Vadim V. Voevodin in Parallel Computing Technologies (2019)

  11. Article

    Open Access

    Potential of a modern vector supercomputer for practical applications: performance evaluation of SX-ACE

    Achieving a high sustained simulation performance is the most important concern in the HPC community. To this end, many kinds of HPC system architectures have been proposed, and the diversity of the HPC system...

    Ryusuke Egawa, Kazuhiko Komatsu, Shintaro Momose in The Journal of Supercomputing (2017)

  12. No Access

    Chapter and Conference Paper

    A Compiler-Assisted OpenMP Migration Method Based on Automatic Parallelizing Information

    Performance of a serial code often relies on compilers’ capabilities for automatic parallelization. In such a case, the performance is not portable to a new system because a new compiler on the new system may ...

    Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi in Supercomputing (2014)