-
Chapter and Conference Paper
SWIRL++: Evaluating Performance Models to Guide Code Transformation in Convolutional Neural Networks
Convolutional Neural Networks (CNNs) are ubiquitous in applications ranging from self-driving cars to various branches of health care. CPUs with large core counts and wide SIMD support are used in HPC clusters...
-
Chapter and Conference Paper
Mozart: Efficient Composition of Library Functions for Heterogeneous Execution
The current trend in processor design is to couple a commodity processor with a GPU, a co-processor, or an accelerator. To unleash the full computational power of such heterogeneous systems is a daunting task: programmers o...
-
Article
Accelerating Data Analytics on Integrated GPU Platforms via Runtime Specialization
Integrated GPU systems are a cost-effective and energy-efficient option for accelerating data-intensive applications. While these platforms have reduced overhead of offloading computation to the GPU and potent...
-
Chapter and Conference Paper
An Efficient Algorithm to Compute Delay Set in SPMD Programs
We present a compiler analysis for single program multiple data (SPMD) programs that communicate through a shared address space. The choice of memory consistency model is sequential consistency as defined by Lampo...