  1. Chapter and Conference Paper

    SimdFSM: An Adaptive Vectorization of Finite State Machines for Speculative Execution

    Parallel execution of a Finite State Machine (FSM) is challenging due to strong data dependency. Previous work proposed speculative execution to distribute the workload to multiple threads. While without depen...

    Le Li, Kenjiro Taura in Parallel and Distributed Computing, Applications and Technologies (2023)

  2. Chapter and Conference Paper

    PerfMemPlus: A Tool for Automatic Discovery of Memory Performance Problems

    In high-performance computing many performance problems are caused by the memory system. Because such performance bugs are hard to identify, analysis tools play an important role in performance optimization. T...

    Christian Helm, Kenjiro Taura in High Performance Computing (2019)

  3. Chapter

    Highly Productive, High-Performance Application Frameworks for Post-Petascale Computing

    We present an overview of our project that aimed to achieve both high performance and high productivity. In order to achieve our aim, we designed and developed high-level domain-specific frameworks that can au...

    Naoya Maruyama, Takayuki Aoki, Kenjiro Taura in Advanced Software Technologies for Post-Pe… (2019)

  4. Chapter and Conference Paper

    SDAC: Porting Scientific Data to Spark RDDs

    Scientific data processing has exposed a range of technical problems in industrial exploration and specific-domain applications due to its huge input volume and data format diversity. While Big Data analytic f...

    Tian Yang, Kenjiro Taura, Liu Chao in Network and Parallel Computing (2017)

  5. Chapter and Conference Paper

    Fragmented BWT: An Extended BWT for Full-Text Indexing

    This paper proposes Fragmented Burrows Wheeler Transform (FBWT), an extension to the well-known BWT structure for full-text indexing and searching. An FBWT consists of a number of BWT fragments each covering on...

    Masaru Ito, Hiroshi Inoue, Kenjiro Taura in String Processing and Information Retrieval (2016)

  6. Chapter and Conference Paper

    Scaling FMM with Data-Driven OpenMP Tasks on Multicore Architectures

    Poor scalability on parallel architectures can be attributed to several factors, among which idle times, data movement, and runtime overhead are predominant. Conventional parallel loops and nested parallelism hav...

    Abdelhalim Amer, Satoshi Matsuoka, Miquel Pericàs in OpenMP: Memory, Devices, and Tasks (2016)

  7. Book

    Concurrent Objects and Beyond

    Papers dedicated to Akinori Yonezawa on the Occasion of His 65th Birthday

    Gul Agha, Atsushi Igarashi, Naoki Kobayashi in Lecture Notes in Computer Science (2014)

  8. Chapter

    MassiveThreads: A Thread Library for High Productivity Languages

    An efficient implementation of task parallelism is important for high productivity languages. Specifically, it requires a tasking layer that fulfills the following requirements: (i) its performance scales to high ...

    Jun Nakashima, Kenjiro Taura in Concurrent Objects and Beyond (2014)

  9. Chapter and Conference Paper

    Analysis of Data Reuse in Task-Parallel Runtimes

    This paper proposes a methodology to study the data reuse quality of task-parallel runtimes. We introduce a coarse-grain version of the reuse distance method called Kernel Reuse Distance (KRD). The metric is a l...

    Miquel Pericàs, Abdelhalim Amer in High Performance Computing Systems. Perfor… (2014)

  10. Chapter and Conference Paper

    Fork-Join and Data-Driven Execution Models on Multi-core Architectures: Case Study of the FMM

    Extracting maximum performance of multi-core architectures is a difficult task primarily due to bandwidth limitations of the memory subsystem and its complex hierarchy. In this work, we study the implications ...

    Abdelhalim Amer, Naoya Maruyama, Miquel Pericàs, Kenjiro Taura in Supercomputing (2013)

  11. Chapter and Conference Paper

    gluepy: A Simple Distributed Python Programming Framework for Complex Grid Environments

    Problem-solving frameworks in large-scale and wide-area environments must handle connectivity issues (NATs and firewalls), maintain scalability with respect to connection management, accommodate dynamic proces...

    Ken Hironaka, Hideo Saito, Kei Takahashi in Languages and Compilers for Parallel Compu… (2008)

  12. Chapter and Conference Paper

    AnZenMail: A Secure and Certified E-mail System

    We are developing a secure and certified e-mail system AnZenMail that provides an experimental testbed for our cutting-edge security enhancement technologies. In addition to a provably secure message transfer ...

    Etsuya Shibayama, Shigeki Hagihara in Software Security — Theories and Systems (2003)

  13. Chapter and Conference Paper

    Fusion of Concurrent Invocations of Exclusive Methods

    This paper describes a mechanism for “fusing” concurrent invocations of exclusive methods. The target of our work is object-oriented languages with concurrent extensions. In the languages, concurrent invocatio...

    Yoshihiro Oyama, Kenjiro Taura, Akinori Yonezawa in Parallel Computing Technologies (2001)

  14. Chapter and Conference Paper

    Performance Evaluation of OpenMP Applications with Nested Parallelism

    Many existing OpenMP systems do not sufficiently implement nested parallelism. This is supposedly because nested parallelism is believed to require a significant implementation effort, incur a large overhead...

    Yoshizumi Tanaka, Kenjiro Taura in Languages, Compilers, and Run-Time Systems… (2000)

  15. Chapter and Conference Paper

    Online Computation of Critical Paths for Multithreaded Languages

    We have developed an instrumentation scheme that enables programs written in multithreaded languages to compute a critical path at run time. Our scheme gives not only the length (execution time) of the critica...

    Yoshihiro Oyama, Kenjiro Taura, Akinori Yonezawa in Parallel and Distributed Processing (2000)

  16. Chapter and Conference Paper

    Comparing Reference Counting and Global Mark-and-Sweep on Parallel Computers

    We compare two dynamic memory management schemes for distributed-memory parallel computers, one based on reference counting and the other based on global mark-and-sweep. We present a simple model in which one ...

    Hirotaka Yamamoto, Kenjiro Taura in Languages, Compilers, and Run-Time Systems… (1998)

  17. Chapter and Conference Paper

    An efficient compilation framework for languages based on a concurrent process calculus

    We propose a framework for compiling programming languages based on concurrent process calculi, in which computation is expressed by a combination of processes and communication channels. Our framework realizes a...

    Yoshihiro Oyama, Kenjiro Taura, Akinori Yonezawa in Euro-Par'97 Parallel Processing (1997)

  18. Chapter and Conference Paper

    Schematic: A concurrent object-oriented extension to Scheme

    A concurrent object-oriented extension to the programming language Scheme, called Schematic, is described. Schematic supports familiar constructs often used in typical parallel programs (future and higher-leve...

    Kenjiro Taura, Akinori Yonezawa in Object-Based Parallel and Distributed Computation (1996)

  19. Chapter

    Compiling and Managing Concurrent Objects for Efficient Execution on High-Performance MPPs

    High-performance parallel computing on massively parallel processors (MPPs) is one of the most important topics in computer science today. For application-level programming, although most numerical application...

    Satoshi Matsuoka, Masahiro Yasugi in Parallel Language and Compiler Research in… (1995)

  20. Chapter and Conference Paper

    StackThreads: An abstract machine for scheduling fine-grain threads on stock CPUs

    We present a software scheduling scheme for fine-grain threads, the typical granularity of which is a single procedure invocation. Such fine-grain threads appear in many language implementations such as Multilisp and...

    Kenjiro Taura, Satoshi Matsuoka in Theory and Practice of Parallel Programming (1995)
