Search Results

Showing 121-140 of 4,997 results
  1. SNN vs. CNN Implementations on FPGAs: An Empirical Evaluation

    Convolutional Neural Networks (CNNs) are widely employed to solve various problems, e.g., image classification. Due to their compute- and...
    Patrick Plagwitz, Frank Hannig, ... Oliver Keszocze in Applied Reconfigurable Computing. Architectures, Tools, and Applications
    Conference paper 2024
  2. A Survey of Algorithmic and Hardware Optimization Techniques for Vision Convolutional Neural Networks on FPGAs

    In today’s world, the applications of convolutional neural networks (CNN) are limitless and are employed in numerous fields. The CNNs get wider and...

    Arish Sateesan, Sharad Sinha, ... A. P. Vinod in Neural Processing Letters
    Article 05 April 2021
  3. FASS-pruner: customizing a fine-grained CNN accelerator-aware pruning framework via intra-filter splitting and inter-filter shuffling

    Nowadays, with the increasing depth of CNNs, the number of computation and storage requirements with weights expands significantly, preventing their...

    Xiaohui Wei, Xinyang Zheng, ... Hengshan Yue in CCF Transactions on High Performance Computing
    Article 26 May 2023
  4. SNCL: a supernode OpenCL implementation for hybrid computing arrays

    Heterogeneous computing has been developing continuously in the field of high-performance computing because of its high performance and energy...

    Tao Tang, Kai Lu, ... Yifei Guo in The Journal of Supercomputing
    Article 08 December 2023
  5. DyPipe: A Holistic Approach to Accelerating Dynamic Neural Networks with Dynamic Pipelining

    Dynamic neural network (NN) techniques are increasingly important because they facilitate deep learning techniques with more complex network...

    Yi-Min Zhuang, Xing Hu, ... Tian Zhi in Journal of Computer Science and Technology
    Article 31 July 2023
  6. Evaluation of HPC Workloads Running on Open-Source RISC-V Hardware

    The emerging RISC-V ecosystem has the potential to improve the speed, fidelity, and quality of hardware/software co-design R&D activities. However,...
    Luc Berger-Vergiat, Suma G. Cardwell, ... Kevin Pedretti in High Performance Computing
    Conference paper 2023
  7. A Flexible Mixed-Mesh FPGA Cluster Architecture for High Speed Computing

    This paper focuses on integrating multiple FPGAs for High-Performance Computing (HPC) applications with a priority on computational capability and...
    Sergio Pertuz, Cornelia Wulf, ... Diana Göhringer in Applied Reconfigurable Computing. Architectures, Tools, and Applications
    Conference paper 2024
  8. STANN – Synthesis Templates for Artificial Neural Network Inference and Training

    While Deep Learning accelerators have been a research area of high interest, the focus was usually on monolithic accelerators for the inference of...
    Marc Rothmann, Mario Porrmann in Advances in Computational Intelligence
    Conference paper 2023
  9. QPU integration in OpenCL for heterogeneous programming

    The integration of quantum processing units (QPUs) in a heterogeneous high-performance computing environment requires solutions that facilitate...

    Jorge Vázquez-Pérez, César Piñeiro, ... Andrés Gómez in The Journal of Supercomputing
    Article Open access 31 January 2024
  10. FPGA-Based Hardware/Software Codesign for Video Encoder on IoT Edge Platforms

    Recently, image/video-based applications have been widely used for many domains, such as traffic, medical, or robotics. In this context, IoT-based...
    Conference paper 2023
  11. Survey on storage-accelerator data movement

    The processor and the main memory in the traditional computing system cannot satisfy the requirements of the emerging large-scale applications in...

    Zixuan Zhou, Shushu Yi, Jie Zhang in CCF Transactions on High Performance Computing
    Article 21 July 2022
  12. An Optimization Technique for PMF Estimation in Approximate Circuits

    As an emerging computing technology, approximate computing enables computing systems to utilize hardware resources efficiently. Recently, approximate...

    Yu-Qin Dou, Cheng-Hua Wang in Journal of Computer Science and Technology
    Article 30 March 2023
  13. DOE: database offloading engine for accelerating SQL processing

    The CPU-Accelerator heterogeneous systems have demonstrated performance and efficiency benefits on DBMSs. However, the CPU-Cache-DRAM architecture...

    Hao Kong, Wenyan Lu, ... Xiaowei Li in Distributed and Parallel Databases
    Article 13 May 2023
  14. Compiler-Assisted Operator Template Library for DNN Accelerators

    Despite many dedicated accelerators are gaining popularity for their performance and energy efficiency in the deep neural network (DNN) domain,...
    Jiansong Li, Wei Cao, ... Xiaobing Feng in Network and Parallel Computing
    Conference paper 2021
  15. HFPQ: deep neural network compression by hardware-friendly pruning-quantization

    This paper presents a hardware-friendly compression method for deep neural networks. This method effectively combines layered channel pruning with...

    YingBo Fan, Wei Pang, ShengLi Lu in Applied Intelligence
    Article 23 February 2021
  16. SWG: an architecture for sparse weight gradient computation

    On-device training for deep neural networks (DNN) has become a trend due to various user preferences and scenarios. The DNN training process consists...

    Weiwei Wu, Fengbin Tu, ... Shouyi Yin in Science China Information Sciences
    Article 23 January 2024
  17. In-Depth Analysis of OLAP Query Performance on Heterogeneous Hardware

    Classical database systems are now facing the challenge of processing high-volume data feeds at unprecedented rates as efficiently as possible while...

    David Broneske, Anna Drewes, ... Gunter Saake in Datenbank-Spektrum
    Article Open access 26 July 2021
  18. Accelerating OCaml Programs on FPGA

    This paper aims to exploit the massive parallelism of Field-Programmable Gate Arrays (FPGAs) by programming them in OCaml, a multiparadigm and...

    Loïc Sylvestre, Emmanuel Chailloux, Jocelyn Sérot in International Journal of Parallel Programming
    Article 24 January 2023
  19. Hetero-Vis: A Framework for Latency Optimized Heterogeneous Deployment of Convolutional Neural Networks

    Convolutional Neural Network (CNN) models often comprise multiple layers varying in compute requirements. For deployment, a number of hardware...
    Nupur Sumeet, Karan Rawat, ... Rekha Singhal in Euro-Par 2022: Parallel Processing Workshops
    Conference paper 2023
  20. Distributed Calculations with Algorithmic Skeletons for Heterogeneous Computing Environments

    Contemporary HPC hardware typically provides several levels of parallelism, e.g. multiple nodes, each having multiple cores (possibly with...

    Nina Herrmann, Herbert Kuchen in International Journal of Parallel Programming
    Article Open access 07 January 2023