1,125,147 Result(s)

within Computer Science

Book Series

Synthesis Lectures on Computer Vision

Chapter

Transformer-Driven Models for Language, Vision, and Multimodality

In this chapter, we will learn about the modeling and learning techniques that drive multimodal applications. We will focus specifically on the recent advances in transformer-based modeling for natural languag...

Man Luo, Tejas Gokhale, Neeraj Varshney… in Advances in Multimodal Information Retriev… (2025)

Chapter

Multimodal Content Generation

In this chapter, we will review the advances that are being made in this new field of multimodal content generation and also discuss several challenges associated with this emerging technology. First, we will ...

Man Luo, Tejas Gokhale, Neeraj Varshney… in Advances in Multimodal Information Retriev… (2025)

Chapter

Outlook

While multimodal information retrieval has several exciting applications and a high potential for impact on important problems, there are several challenges associated with the information that lives on the in...

Man Luo, Tejas Gokhale, Neeraj Varshney… in Advances in Multimodal Information Retriev… (2025)

Chapter

Introduction

In this book, our emphasis is on multimodal information retrieval, specifically concentrating on text and image data. The traditional unimodal systems, limited to a single type of data, often fall short of cap...

Man Luo, Tejas Gokhale, Neeraj Varshney… in Advances in Multimodal Information Retriev… (2025)

Book

Advances in Multimodal Information Retrieval and Generation

Man Luo, Tejas Gokhale… in Synthesis Lectures on Computer Vision (2025)

Chapter

Multimodal Information Retrieval

In today’s rapidly evolving digital landscape, the wealth of available information has expanded beyond the boundaries of traditional text-based content. With the proliferation of multimedia platforms and data ...

Man Luo, Tejas Gokhale, Neeraj Varshney… in Advances in Multimodal Information Retriev… (2025)

Chapter

Retrieval Augmented Modeling

Till this point in our book, we have discussed the fundamental principles of information retrieval, exploring its key elements, and various approaches to achieving effective retrieval, including multimodal ret...

Man Luo, Tejas Gokhale, Neeraj Varshney… in Advances in Multimodal Information Retriev… (2025)

Article

A stabilized Crank-Nicolson virtual element method for the unsteady Navier-Stokes problems with high Reynolds number

This paper studies a stabilized virtual element method for the unsteady Navier-Stokes problems on polygonal meshes. Using “equal-order” virtual elements in space and the Crank-Nicolson scheme in time, we give ...

Yang Li, Yanhong Bai, Minfu Feng in Numerical Algorithms (2024)

Article

Conv-ViT fusion for improved handwritten Arabic character classification

An essential aspect of pattern recognition pertains to handwriting recognition, particularly in languages with diverse character styles like Arabic. Arabic characters present a challenge due to their varied wr...

Sarra Rouabhi, Abdennour Azerine, Redouane Tlemsani… in Signal, Image and Video Processing (2024)

Article

Open Access

Performance evaluation of Word2vec accelerators exploiting spatial and temporal parallelism on DDR/HBM-based FPGAs

Word embedding is a technique for representing words as vectors in a way that captures their semantic and syntactic relationships. The processing time of one of the most popular word embedding technique Word2v...

Hasitha Muthumala Waidyasooriya, Masanori Hariyama in The Journal of Supercomputing (2024)

Article

A learning-based efficient query model for blockchain in internet of medical things

This paper proposes a learning-based model for the resource-constrained edge nodes in the blockchain-enabled Internet of Medical Things (IoMT) systems to realize efficient querying. Three layers are designed i...

Dayu Jia, Guanghong Yang, Min Huang, Junchang Xin… in The Journal of Supercomputing (2024)

Article

Open Access

PanAf20K: A Large Video Dataset for Wild Ape Detection and Behaviour Recognition

We present the PanAf20K dataset, the largest and most diverse open-access annotated video dataset of great apes in their natural environment. It comprises more than 7 million frames across

Otto Brookes, Majid Mirmehdi, Colleen Stephens… in International Journal of Computer Vision (2024)

Article

FPGA-based SoC design for real-time facial point detection using deep convolutional neural networks with dynamic partial reconfiguration

Deep convolutional neural networks (DCNNs) have been mainly powerful and important artificial intelligence techniques, which are exploited in various computer vision applications, such as facial point detectio...

Safa Teboulbi, Seifeddine Messaoud… in Signal, Image and Video Processing (2024)

Article

MS-HRNet: multi-scale high-resolution network for human pose estimation

Human pose estimation has important applications in medical diagnosis (such as early diagnosis of autism in children and assisting with the diagnosis of Parkinson’s disease), human-computer interaction, animat...

Yanxia Wang, Renjie Wang, Hu Shi, Dan Liu in The Journal of Supercomputing (2024)

Article

CBNet: A Plug-and-Play Network for Segmentation-Based Scene Text Detection

Recently, segmentation-based methods are quite popular in scene text detection, which mainly contain two steps: text kernel segmentation and expansion. However, the segmentation process only considers each pix...

Xi Zhao, Wei Feng, Zheng Zhang, Jingjing Lv… in International Journal of Computer Vision (2024)

Article

Enhancing image steganalysis via integrated reinforcement learning and dilated convolution techniques

In the wake of unparalleled expansion in digital communication platforms, the imperative to bolster security and privacy measures has escalated. Within this landscape, image steganalysis emerges as a pivotal d...

Yuan Sun in Signal, Image and Video Processing (2024)

Article

\(H^{1}\)-norm error analysis of a robust ADI method on graded mesh for three-dimensional subdiffusion problems

This work proposes a robust ADI scheme on graded mesh for solving three-dimensional subdiffusion problems. The Caputo fractional derivative is discretized by L1 scheme, where the graded mesh is used to elimina...

Ziyi Zhou, Haixiang Zhang, Xuehua Yang in Numerical Algorithms (2024)

Article

Open Access

Signifiers for conveying and exploiting affordances: from human-computer interaction to multi-agent systems

The ecological psychologist James J. Gibson defined the notion of affordances to refer to what action possibilities environments offer to animals. In this paper, we show how (artificial) agents can discover an...

Jérémy Lemée, Danai Vachtsevanou… in Annals of Mathematics and Artificial Intel… (2024)

Article

Publisher Correction: Improving query processing in blockchain systems by using a multi-level sharding mechanism

Alemeh Matani, Amir Sahafi, Ali Broumandnia in The Journal of Supercomputing (2024)
