Search Results - Springer

Chapter and Conference Paper

SWIRL++: Evaluating Performance Models to Guide Code Transformation in Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are ubiquitous in applications ranging from self-driving cars to various branches of health care. CPUs with large core counts and wide SIMD support are used in HPC clusters...

Tharindu R. Patabandi, Anand Venkat… in Languages and Compilers for Parallel Compu… (2021)

Chapter and Conference Paper

Mozart: Efficient Composition of Library Functions for Heterogeneous Execution

The current processor trend is to couple a commodity processor with a GPU, a co-processor, or an accelerator. Unleashing the full computational power of such heterogeneous systems is a daunting task: programmers o...

Rajkishore Barik, Tatiana Shpeisman… in Languages and Compilers for Parallel Compu… (2019)

Article

Accelerating Data Analytics on Integrated GPU Platforms via Runtime Specialization

Integrated GPU systems are a cost-effective and energy-efficient option for accelerating data-intensive applications. While these platforms have reduced overhead of offloading computation to the GPU and potent...

Naila Farooqui, Indrajit Roy, Yuan Chen… in International Journal of Parallel Programm… (2018)

Chapter and Conference Paper

Using Dynamic Compilation to Achieve Ninja Performance for CNN Training on Many-Core Processors

Convolutional Neural Networks (CNNs) represent a class of Deep Neural Networks that is growing in importance due to their state-of-the-art performance in pattern recognition tasks in various domains, including...

Ankush Mandal, Rajkishore Barik, Vivek Sarkar in Euro-Par 2018: Parallel Processing (2018)

Chapter and Conference Paper

Compiler-Driven Data Layout Transformation for Heterogeneous Platforms

Modern heterogeneous systems comprise CPU cores, GPU cores, and in some cases, accelerator cores. Each of these computational cores has a very different memory hierarchy, making it challenging to efficient...

Deepak Majeti, Rajkishore Barik… in Euro-Par 2013: Parallel Processing Worksho… (2014)

Chapter and Conference Paper

Inter-iteration Scalar Replacement Using Array SSA Form

In this paper, we introduce novel simple and efficient analysis algorithms for scalar replacement and dead store elimination that are built on Array SSA form, a uniform representation for capturing control and...

Rishi Surendran, Rajkishore Barik, Jisheng Zhao, Vivek Sarkar in Compiler Construction (2014)

Chapter and Conference Paper

Static Detection of Place Locality and Elimination of Runtime Checks

Harnessing parallelism particularly for high performance computing is a demanding topic of research. Limitations and complexities of automatic parallelization have led to programming language notations wherein...

Shivali Agarwal, RajKishore Barik… in Programming Languages and Systems (2008)

Chapter and Conference Paper

Extended Linear Scan: An Alternate Foundation for Global Register Allocation

In this paper, we extend past work on Linear Scan register allocation, and propose two Extended Linear Scan (ELS) algorithms that retain the compile-time efficiency of past Linear Scan algorithms while delivering...

Vivek Sarkar, Rajkishore Barik in Compiler Construction (2007)

Chapter and Conference Paper

Optimal Bitwise Register Allocation Using Integer Linear Programming

This paper addresses the problem of optimal global register allocation. The register allocation problem is expressed as an integer linear programming problem and solved optimally. The model is more flexible th...

Rajkishore Barik, Christian Grothoff… in Languages and Compilers for Parallel Compu… (2007)

Chapter and Conference Paper

Efficient Computation of May-Happen-in-Parallel Information for Concurrent Java Programs

Modeling of runtime threads in static analysis of concurrent programs plays an important role in both reducing the complexity and improving the precision of the analysis. Modeling based on type based technique...

Rajkishore Barik in Languages and Compilers for Parallel Computing (2006)

Chapter and Conference Paper

Enhanced Bitwidth-Aware Register Allocation

Embedded processors depend on register files for performance, just like general-purpose processors in desktop and server systems. However, unlike general-purpose processors, the power consumption of register f...

Rajkishore Barik, Vivek Sarkar in Compiler Construction (2006)

Chapter and Conference Paper

An Efficient Algorithm to Compute Delay Set in SPMD Programs

We present compiler analysis for single program multiple data (SPMD) programs that communicate through shared address space. The choice of memory consistency model is sequential consistency as defined by Lampo...

Manish P. Kurhekar, Rajkishore Barik, Umesh Kumar in High Performance Computing - HiPC 2003 (2003)

12 Result(s)
