Search Results

Showing 1-20 of 9,033 results
  1. Dual-IS: Instruction Set Modality for Efficient Instruction Level Parallelism

    Exploiting instruction level parallelism (ILP) is a widely used method for increasing performance of processors. While traditional very long...
    Kari Hepola, Joonas Multanen, Pekka Jääskeläinen in Architecture of Computing Systems
    Conference paper 2022
  2. POAS: a framework for exploiting accelerator level parallelism in heterogeneous environments

    In the era of heterogeneous computing, a new paradigm called accelerator level parallelism (ALP) has emerged. In ALP, accelerators are used...

    Pablo Antonio Martínez, Gregorio Bernabé, José Manuel García in The Journal of Supercomputing
    Article Open access 25 March 2024
  3. Expressing Parallelism

    Chapter 4 marks the transition from simple teaching examples toward real-world parallel code and expands upon details of the code samples we have...
    James Reinders, Ben Ashbaugh, ... Xinmin Tian in Data Parallel C++
    Chapter Open access 2023
  4. Generalizing Hierarchical Parallelism

    Since the days of OpenMP 1.0 computer hardware has become more complex, typically by specializing compute units for coarse- and fine-grained...
    Conference paper 2023
  5. A neural network-based approach for the performance evaluation of branch prediction in instruction-level parallelism processors

    Branch prediction is essential for improving the performance of pipeline processors. As the number of pipeline stages in modern processors increases,...

    Sweety Nain, Prachi Chaudhary in The Journal of Supercomputing
    Article 14 September 2021
  6. An efficient branch predictor for improved accuracy of instruction level parallelism

    The need for modern processors is based on fast and precise branch predictors to improve the execution of instructions in the pipeline. In a parallel...

    Sweety, Prachi Chaudhary in The Journal of Supercomputing
    Article 06 April 2021
  7. Evaluating RISC-V Vector Instruction Set Architecture Extension with Computer Vision Workloads

    Computer vision (CV) algorithms have been extensively used for a myriad of applications nowadays. As the multimedia data are generally well-formatted...

    Ruo-Shi Li, ** Peng, ... Ran Zheng in Journal of Computer Science and Technology
    Article 31 July 2023
  8. Research on Instruction Pipeline Optimization Oriented to RISC-V Vector Instruction Set

    Traditional general-purpose processors are scalar processors, and only one data result is obtained when an instruction is executed. But nowadays,...
    Conference paper 2022
  9. The C++ Standard Library for Parallelism and Concurrency (HPX)

    We describe the C++ standard library for concurrency and parallelism (HPX). In contrast to bulk sequential programs written in the past, HPX relies...
    Patrick Diehl, Steven R. Brandt, Hartmut Kaiser in Parallel C++
    Chapter 2024
  10. An Adaptive Instruction Set Encoding Automatic Generation Method for VLIW

    The tight integration of hardware and software enables very long instruction word (VLIW) architectures to vastly outperform superscalar architectures...
    Conference paper 2024
  11. A Multi-level Parallel Integer/Floating-Point Arithmetic Architecture for Deep Learning Instructions

    The extensive instruction-set for deep learning (DL) significantly enhances the performance of general-purpose architectures by exploiting data-level...
    Hongbing Tan, Jing Zhang, ... Liquan Xiao in Euro-Par 2023: Parallel Processing
    Conference paper 2023
  12. Parallel fractal image compression using quadtree partition with task and dynamic parallelism

    Fractal image compression is a lossy compression technique based on the iterative function system, which can be used to reduce the storage space and...

    Francisco J. Hernandez-Lopez, Omar Muñiz-Pérez in Journal of Real-Time Image Processing
    Article 08 January 2022
  13. Multilevel parallelism optimization of stencil computations on SIMDlized NUMA architectures

    Stencil computations within a single core or multicores of an SMP node have been over-investigated. However, the demands on HPC’s higher performance...

    Kaifang Zhang, Huayou Su, Yong Dou in The Journal of Supercomputing
    Article 28 April 2021
  14. Using FPGA-based content-addressable memory for mnemonics instruction searching in assembler design

    Memories play an essential role in computer systems as they store and retrieve data that may include instructions required for system operation. In...

    Halit Öztekin, Abdelkader Lazzem, İhsan Pehlivan in The Journal of Supercomputing
    Article 07 May 2023
  15. High-Level Synthesis

    High-level synthesis (HLS) is the process of compiling a software program into a digital circuit. This chapter provides a view into the HLS design...
    Kaihui Tu, **fan Tang, ... Zhufei Chu in FPGA EDA
    Chapter 2024
  16. High-Level Decision Diagrams

    In this chapter, we generalize the decision-based concept of logic-level SSBDDs to apply it for modelling digital systems at higher abstraction...
    Raimund Ubar, Jaan Raik, ... Artur Jutman in Structural Decision Diagrams in Digital Test
    Chapter 2024
  17. Efficient High-Level Programming in Plain Java

    This paper introduces the support for developing efficient parallel programs in plain Java in the Gaspar framework. The framework supports a complete...

    Rui S. Silva, João L. Sobral in International Journal of Parallel Programming
    Article 05 December 2022
  18. Concurrency and Parallelism

    A concurrent program handles more than one task at a time. A familiar example is a web server that handles multiple client requests at the same time....
    Martin Kalin in Modern C Up and Running
    Chapter 2022
  19. Rapid Prototyping of Complex Micro-architectures Through High-Level Synthesis

    Register-Transfer Level (RTL) design has been a traditional approach in hardware design for several decades. However, with the growing complexity of...
    Sara Sadat Hoseininasab, Caroline Collange, Steven Derrien in Applied Reconfigurable Computing. Architectures, Tools, and Applications
    Conference paper 2023
  20. IDaTPA: importance degree based thread partitioning approach in thread level speculation

    As an auto-parallelization technique with the level of thread on multi-core, Thread-Level Speculation (TLS) which is also called Speculative...

    Li Yuxiang, Zhang Zhiyong, ... Su Yaning in Discover Computing
    Article Open access 19 June 2024