![Loading...](https://link.springer.com/static/c4a417b97a76cc2980e3c25e2271af3129e08bbe/images/pdf-preview/spacer.gif)
-
Chapter and Conference Paper
LC-MEMENTO: A Memory Model for Accelerated Architectures
With the advent of heterogeneous architectures, in particular, with the ubiquity of multi-GPU systems, it is becoming increasingly important to manage device memory efficiently in order to reap the benefits of...
-
Chapter and Conference Paper
Jagged Tiling for Intra-tile Parallelism and Fine-Grain Multithreading
In this paper, we have developed a novel methodology that takes into consideration multithreaded many-core designs to better utilize memory/processing resources and improve memory residence on tileable applica...
-
Chapter and Conference Paper
TL-DAE: Thread-Level Decoupled Access/Execution for OpenMP on the Cyclops-64 Many-Core Processor
Cyclops-64 is a many-core processor with software managed memory hierarchy. For OpenMP programs running on this processor, a frequently used computing paradigm is: (i) copy data into on-chip memory; (ii) perfo...
-
Chapter and Conference Paper
Tile Reduction: The First Step towards Tile Aware Parallelization in OpenMP
Tiling is widely used by compilers and programmer to optimize scientific and engineering code for better performance. Many parallel programming languages support tile/tiling directly through first-class langua...
-
Chapter and Conference Paper
Tile Percolation: An OpenMP Tile Aware Parallelization Technique for the Cyclops-64 Multicore Processor
Programming a multicore processor is difficult. It is even more difficult if the processor has software-managed memory hierarchy, e.g. the IBM Cyclops-64 (C64). A widely accepted parallel programming solution ...