Search Page | SpringerLink

Dual-IS: Instruction Set Modality for Efficient Instruction Level Parallelism

Exploiting instruction level parallelism (ILP) is a widely used method for increasing performance of processors. While traditional very long...

Kari Hepola, Joonas Multanen, Pekka Jääskeläinen in Architecture of Computing Systems

Conference paper 2022

POAS: a framework for exploiting accelerator level parallelism in heterogeneous environments

In the era of heterogeneous computing, a new paradigm called accelerator level parallelism (ALP) has emerged. In ALP, accelerators are used...

Pablo Antonio Martínez, Gregorio Bernabé, José Manuel García in The Journal of Supercomputing

Article Open access 25 March 2024

Expressing Parallelism

Chapter 4 marks the transition from simple teaching examples toward real-world parallel code and expands upon details of the code samples we have...

James Reinders, Ben Ashbaugh, ... Xinmin Tian in Data Parallel C++

Chapter Open access 2023

Generalizing Hierarchical Parallelism

Since the days of OpenMP 1.0 computer hardware has become more complex, typically by specializing compute units for coarse- and fine-grained...

Michael Kruse in OpenMP: Advanced Task-Based, Device and Compiler Programming

Conference paper 2023

A neural network-based approach for the performance evaluation of branch prediction in instruction-level parallelism processors

Branch prediction is essential for improving the performance of pipeline processors. As the number of pipeline stages in modern processors increases,...

Sweety Nain, Prachi Chaudhary in The Journal of Supercomputing

Article 14 September 2021

An efficient branch predictor for improved accuracy of instruction level parallelism

The need for modern processors is based on fast and precise branch predictors to improve the execution of instructions in the pipeline. In a parallel...

Sweety, Prachi Chaudhary in The Journal of Supercomputing

Article 06 April 2021

Evaluating RISC-V Vector Instruction Set Architecture Extension with Computer Vision Workloads

Computer vision (CV) algorithms have been extensively used for a myriad of applications nowadays. As the multimedia data are generally well-formatted...

Ruo-Shi Li, ** Peng, ... Ran Zheng in Journal of Computer Science and Technology

Article 31 July 2023

Research on Instruction Pipeline Optimization Oriented to RISC-V Vector Instruction Set

Traditional general-purpose processors are scalar processors, and only one data result is obtained when an instruction is executed. But nowadays,...

Zhen Zhang, Xin Yu in Advances in Artificial Intelligence and Security

Conference paper 2022

The C++ Standard Library for Parallelism and Concurrency (HPX)

We describe the C++ standard library for concurrency and parallelism (HPX). In contrast to bulk sequential programs written in the past, HPX relies...

Patrick Diehl, Steven R. Brandt, Hartmut Kaiser in Parallel C++

Chapter 2024

An Adaptive Instruction Set Encoding Automatic Generation Method for VLIW

The tight integration of hardware and software enables very long instruction word (VLIW) architectures to vastly outperform superscalar architectures...

Xin Xiao, Zhong Liu in Algorithms and Architectures for Parallel Processing

Conference paper 2024

A Multi-level Parallel Integer/Floating-Point Arithmetic Architecture for Deep Learning Instructions

The extensive instruction-set for deep learning (DL) significantly enhances the performance of general-purpose architectures by exploiting data-level...

Hongbing Tan, **g Zhang, ... Liquan Xiao in Euro-Par 2023: Parallel Processing

Conference paper 2023

Parallel fractal image compression using quadtree partition with task and dynamic parallelism

Fractal image compression is a lossy compression technique based on the iterative function system, which can be used to reduce the storage space and...

Francisco J. Hernandez-Lopez, Omar Muñiz-Pérez in Journal of Real-Time Image Processing

Article 08 January 2022

Multilevel parallelism optimization of stencil computations on SIMDlized NUMA architectures

Stencil computations within a single core or multicores of an SMP node have been over-investigated. However, the demands on HPC’s higher performance...

Kaifang Zhang, Huayou Su, Yong Dou in The Journal of Supercomputing

Article 28 April 2021

Using FPGA-based content-addressable memory for mnemonics instruction searching in assembler design

Memories play an essential role in computer systems as they store and retrieve data that may include instructions required for system operation. In...

Halit Öztekin, Abdelkader Lazzem, İhsan Pehlivan in The Journal of Supercomputing

Article 07 May 2023

High-Level Synthesis

High-level synthesis (HLS) is the process of compiling a software program into a digital circuit. This chapter provides a view into the HLS design...

Kaihui Tu, **fan Tang, ... Zhufei Chu in FPGA EDA

Chapter 2024

High-Level Decision Diagrams

In this chapter, we generalize the decision-based concept of logic-level SSBDDs to apply it for modelling digital systems at higher abstraction...

Raimund Ubar, Jaan Raik, ... Artur Jutman in Structural Decision Diagrams in Digital Test

Chapter 2024

Efficient High-Level Programming in Plain Java

This paper introduces the support for developing efficient parallel programs in plain Java in the Gaspar framework. The framework supports a complete...

Rui S. Silva, João L. Sobral in International Journal of Parallel Programming

Article 05 December 2022

Concurrency and Parallelism

A concurrent program handles more than one task at a time. A familiar example is a web server that handles multiple client requests at the same time....

Martin Kalin in Modern C Up and Running

Chapter 2022

Rapid Prototyping of Complex Micro-architectures Through High-Level Synthesis

Register-Transfer Level (RTL) design has been a traditional approach in hardware design for several decades. However, with the growing complexity of...

Sara Sadat Hoseininasab, Caroline Collange, Steven Derrien in Applied Reconfigurable Computing. Architectures, Tools, and Applications

Conference paper 2023

IDaTPA: importance degree based thread partitioning approach in thread level speculation

As an auto-parallelization technique with the level of thread on multi-core, Thread-Level Speculation (TLS) which is also called Speculative...

Li Yuxiang, Zhang Zhiyong, ... Su Yaning in Discover Computing

Article Open access 19 June 2024
