Search Results
-
High-performance simulations of turbulent boundary layer flow using Intel Xeon Phi many-core processors
Direct numerical simulations (DNS) of turbulent flows have increasing importance because they not only provide fundamental understanding of turbulent...
-
Performance benchmarking of deep learning framework on Intel Xeon Phi
With the success of deep learning (DL) methods in diverse application domains, several deep learning software frameworks have been proposed to...
-
Comparison of HPC Architectures for Computing All-Pairs Shortest Paths. Intel Xeon Phi KNL vs NVIDIA Pascal
Today, one of the main challenges for high-performance computing systems is to improve their performance by keeping energy consumption at acceptable...
-
A server-side accelerator framework for multi-core CPUs and Intel Xeon Phi co-processor systems
Processing-intensive web server requests can lead to low Quality of Service (QoS), such as longer mean response time and lower throughput, which...
-
Revisiting the performance optimization of QR factorization on Intel KNL and SKL multiprocessors
This study focused on the optimization of double-precision general matrix–matrix multiplication (DGEMM) routine to improve the QR factorization...
-
Improving blocked matrix-matrix multiplication routine by utilizing AVX-512 instructions on Intel Knights Landing and Xeon Scalable processors
In high-performance computing, the general matrix-matrix multiplication (xGEMM) routine is the core of the Level 3 BLAS kernel for effective...
-
Performance Evaluation of Pseudospectral Ultrasound Simulations on a Cluster of Xeon Phi Accelerators
The rapid development of novel procedures in medical ultrasonics, including treatment planning in therapeutic ultrasound and image reconstruction in...
-
Implementation of Parallel 3-D Real FFT with 2-D Decomposition on Intel Xeon Phi Clusters
In this paper, we propose an implementation of a parallel 3-D real fast Fourier transform (FFT) with 2-D decomposition on Intel Xeon Phi clusters....
-
Accelerating time series motif discovery in the Intel Xeon Phi KNL processor
Time series analysis is an important research topic of great interest in many fields. Recently, the Matrix Profile method, and particularly one of...
-
Enhanced OpenMP Algorithm to Compute All-Pairs Shortest Path on X86 Architectures
Graphs have become a key tool when modeling and solving problems in different areas. The Floyd-Warshall (FW) algorithm computes the shortest path...
-
Performance Analysis of a Parallel Denoising Algorithm on Intel Xeon Computer System
This paper presents an experimental performance study of a parallel implementation of the Poissonian image restoration algorithm. Hybrid...
-
Black-Scholes Option Pricing on Intel CPUs and GPUs: Implementation on SYCL and Optimization Techniques
The Black-Scholes option pricing problem is one of the widely used financial benchmarks. We explore the possibility of developing a high-performance...
-
Optimization of heterogeneous systems with AI planning heuristics and machine learning: a performance and energy aware approach
Heterogeneous computing systems provide high performance and energy efficiency. However, to optimally utilize such systems, solutions that distribute...
-
Performance Analysis of Deep Learning Inference in Convolutional Neural Networks on Intel Cascade Lake CPUs
The paper aims to compare the performance of deep convolutional network inference. Experiments are carried out on a high-end server with two Intel...
-
A Novel Algorithm for Bi-objective Performance-Energy Optimization of Applications with Continuous Performance and Linear Energy Profiles on Heterogeneous HPC Platforms
Performance and energy are the two most important objectives for optimization on heterogeneous HPC platforms. This work studies a mathematical...
-
Numerical Modeling of Hydrodynamic Turbulence with Self-gravity on Intel Xeon Phi KNL
In this paper, we present the results of numerical simulations of hydrodynamic turbulence with self-gravity, employing the latest Intel Xeon Phi...
-
A fully-customized dataflow engine for 3D earthquake simulation with a complex topography
With HPC (high performance computing) evolving into the exascale era, improvements in computing performance and power efficiency have become...
-
Fast solution of electromagnetic scattering problems using Xeon Phi coprocessors
Electromagnetic scattering problems can be solved by discretizing and transforming integral equations into matrix equations using the method of...
-
A System-Wide Communication to Couple Multiple MPI Programs for Heterogeneous Computing
This paper proposes a system-wide communication library to couple multiple MPI programs for heterogeneous coupling computing called...
-
Performance and Scalability Analysis of AI-Accelerated CFD Simulations Across Various Computing Platforms
In this paper, we perform an extensive benchmarking and analysis of the performance and scalability of our software tool called CFD suite, which...