  1. Chapter and Conference Paper

    LC-MEMENTO: A Memory Model for Accelerated Architectures

    With the advent of heterogeneous architectures, in particular, with the ubiquity of multi-GPU systems, it is becoming increasingly important to manage device memory efficiently in order to reap the benefits of...

    Kiran Ranganath, Jesun Firoz in Languages and Compilers for Parallel Compu… (2022)

  2. Chapter and Conference Paper

    Jagged Tiling for Intra-tile Parallelism and Fine-Grain Multithreading

    In this paper, we have developed a novel methodology that takes into consideration multithreaded many-core designs to better utilize memory/processing resources and improve memory residence on tileable applica...

    Sunil Shrestha, Joseph Manzano in Languages and Compilers for Parallel Compu… (2015)

  3. Chapter and Conference Paper

    TL-DAE: Thread-Level Decoupled Access/Execution for OpenMP on the Cyclops-64 Many-Core Processor

    Cyclops-64 is a many-core processor with software managed memory hierarchy. For OpenMP programs running on this processor, a frequently used computing paradigm is: (i) copy data into on-chip memory; (ii) perfo...

    Ge Gan, Joseph Manzano in Languages and Compilers for Parallel Computing (2010)

  4. Chapter and Conference Paper

    Tile Reduction: The First Step towards Tile Aware Parallelization in OpenMP

    Tiling is widely used by compilers and programmers to optimize scientific and engineering code for better performance. Many parallel programming languages support tile/tiling directly through first-class langua...

    Ge Gan, Xu Wang, Joseph Manzano in Evolving OpenMP in an Age of Extreme Paral… (2009)

  5. Chapter and Conference Paper

    Tile Percolation: An OpenMP Tile Aware Parallelization Technique for the Cyclops-64 Multicore Processor

    Programming a multicore processor is difficult. It is even more difficult if the processor has software-managed memory hierarchy, e.g. the IBM Cyclops-64 (C64). A widely accepted parallel programming solution ...

    Ge Gan, Xu Wang, Joseph Manzano, Guang R. Gao in Euro-Par 2009 Parallel Processing (2009)