671,993 Result(s)


Book Series

Synthesis Lectures on Computer Vision
Chapter

Transformer-Driven Models for Language, Vision, and Multimodality

In this chapter, we will learn about the modeling and learning techniques that drive multimodal applications. We will focus specifically on the recent advances in transformer-based modeling for natural languag...

Man Luo, Tejas Gokhale, Neeraj Varshney… in Advances in Multimodal Information Retriev… (2025)
Chapter

Multimodal Content Generation

In this chapter, we will review the advances that are being made in this new field of multimodal content generation and also discuss several challenges associated with this emerging technology. First, we will ...

Man Luo, Tejas Gokhale, Neeraj Varshney… in Advances in Multimodal Information Retriev… (2025)
Chapter

Outlook

While multimodal information retrieval has several exciting applications and a high potential for impact on important problems, there are several challenges associated with the information that lives on the in...

Man Luo, Tejas Gokhale, Neeraj Varshney… in Advances in Multimodal Information Retriev… (2025)
Chapter

Introduction

In this book, our emphasis is on multimodal information retrieval, specifically concentrating on text and image data. The traditional unimodal systems, limited to a single type of data, often fall short of cap...

Man Luo, Tejas Gokhale, Neeraj Varshney… in Advances in Multimodal Information Retriev… (2025)
Book

Advances in Multimodal Information Retrieval and Generation

Man Luo, Tejas Gokhale… in Synthesis Lectures on Computer Vision (2025)
Chapter

Multimodal Information Retrieval

In today’s rapidly evolving digital landscape, the wealth of available information has expanded beyond the boundaries of traditional text-based content. With the proliferation of multimedia platforms and data ...

Man Luo, Tejas Gokhale, Neeraj Varshney… in Advances in Multimodal Information Retriev… (2025)
Chapter

Retrieval Augmented Modeling

Till this point in our book, we have discussed the fundamental principles of information retrieval, exploring its key elements, and various approaches to achieving effective retrieval, including multimodal ret...

Man Luo, Tejas Gokhale, Neeraj Varshney… in Advances in Multimodal Information Retriev… (2025)
Article

Open Access

PanAf20K: A Large Video Dataset for Wild Ape Detection and Behaviour Recognition

We present the PanAf20K dataset, the largest and most diverse open-access annotated video dataset of great apes in their natural environment. It comprises more than 7 million frames across

Otto Brookes, Majid Mirmehdi, Colleen Stephens… in International Journal of Computer Vision (2024)

Article

CBNet: A Plug-and-Play Network for Segmentation-Based Scene Text Detection

Recently, segmentation-based methods are quite popular in scene text detection, which mainly contain two steps: text kernel segmentation and expansion. However, the segmentation process only considers each pix...

Xi Zhao, Wei Feng, Zheng Zhang, Jingjing Lv… in International Journal of Computer Vision (2024)
Article

The multi-criteria evaluation of research efforts based on ETL software: from business intelligence approach to big data and semantic approaches

Many industries and academia have devoted a lot of effort and money to creating and/or using good extract-transform-load (ETL) software suitable for their data analysis purposes since it is considered a key to...

Chaimae Boulahia, Hicham Behja, Mohammed Reda Chbihi Louhdi… in Evolutionary Intelligence (2024)
Article

Open Access

Signifiers for conveying and exploiting affordances: from human-computer interaction to multi-agent systems

The ecological psychologist James J. Gibson defined the notion of affordances to refer to what action possibilities environments offer to animals. In this paper, we show how (artificial) agents can discover an...

Jérémy Lemée, Danai Vachtsevanou… in Annals of Mathematics and Artificial Intel… (2024)

Article

A general framework for improving cuckoo search algorithms with resource allocation and re-initialization

Cuckoo search (CS) has currently become one of the most favorable meta-heuristic algorithms (MHAs). In this article, a simple yet effective framework is proposed for CS algorithms to reinforce their performanc...

Qiangda Yang, Yongxu Chen, Jie Zhang… in International Journal of Machine Learning … (2024)
Article

Open Access

Unsupervised Point Cloud Representation Learning by Clustering and Neural Rendering

Data augmentation has contributed to the rapid advancement of unsupervised learning on 3D point clouds. However, we argue that data augmentation is not ideal, as it requires a careful application-dependent sel...

Guofeng Mei, Cristiano Saltori, Elisa Ricci… in International Journal of Computer Vision (2024)

Article

Surrogate-assisted evolutionary optimisation: a novel blueprint and a state of the art survey

Surrogate-Assisted Evolutionary Optimisation algorithms are a specialized brand of optimisers developed to undertake problems with computationally expensive fitness functions. These algorithms work by building...

Mohammed Imed Eddine Khaldi, Amer Draa in Evolutionary Intelligence (2024)
Article

Tensor discriminant analysis on grassmann manifold with application to video based human action recognition

Representing videos as linear subspaces on Grassmann manifolds has made great strides in action recognition problems. Recent studies have explored the convenience of discriminant analysis by making use of Gras...

Cagri Ozdemir, Randy C. Hoover, Kyle Caudle… in International Journal of Machine Learning … (2024)
Article

Open Access

An algorithmic debugging approach for belief-desire-intention agents

Debugging agent systems can be rather difficult. It is often noted as one of the most time-consuming tasks during the development of cognitive agents. Algorithmic (or declarative) debugging is a semi-automatic...

Tobias Ahlbrecht in Annals of Mathematics and Artificial Intelligence (2024)

Article

ConDA: state-based data augmentation for context-dependent text-to-SQL

The context-dependent text-to-SQL task has profound real-world implications, as it facilitates users in extracting knowledge from vast databases, which allows users to acquire the information interactively for...

Dingzirui Wang, Longxu Dou, Wanxiang Che… in International Journal of Machine Learning … (2024)
Article

Annotation-Free Human Sketch Quality Assessment

As lovely as bunnies are, your sketched version would probably not do them justice (Fig. 1). This paper recognises this very problem and studies sketch quality assessment for the first time—letting you find these...

Lan Yang, Kaiyue Pang, Honggang Zhang… in International Journal of Computer Vision (2024)
Article

Open Set Recognition in Real World

Open set recognition (OSR) constitutes a critical endeavor within the domain of computer vision, frequently deployed in applications, such as autonomous driving and medical imaging recognition. Existing OSR me...

Zhen Yang, Jun Yue, Pedram Ghamisi… in International Journal of Computer Vision (2024)
