Search Results - Springer

Article

A new virtual interpolation technology with range as object

Virtual interpolation technology can be applied to direction-of-arrival (DOA) estimation as a preprocessing technique to achieve the DOA estimation for any array. In order to solve the angle-sensitive problem ...

Tao Li, Yunxiu Yang, Wendong Chen, Qin Shu in Signal, Image and Video Processing (2024)

Article

A full-detection association tracker with confidence optimization for real-time multi-object tracking

Multi-object tracking (MOT) aims to obtain trajectories with unique identifiers for multiple objects in a video stream. In current approaches, confidence thresholds were frequently used to perform multi-stage ...

Youyu Liu, Xiangxiang Zhou, Zhendong Zhang, Yi Li… in Journal of Real-Time Image Processing (2024)

Article

DML-YOLOv8-SAR image object detection algorithm

Given the challenges posed by noise and varying target scales in SAR images, conventional convolutional neural networks often underperform in SAR image detection. To address this, this paper introduces a novel...

Shuguang Zhao, Ronghao Tao, Fengde Jia in Signal, Image and Video Processing (2024)

Article

Using improved YOLO V5s to recognize tomatoes in a continuous working environment

In the continuous working environment of the picking robots, factors such as illumination change, camera hardware, the movement of the picking robots, and image background interference have a great impact on t...

Guohua Gao, Ciyin Shuai, Shuangyou Wang, Tao Ding in Signal, Image and Video Processing (2024)

Article

A hardware-friendly logarithmic quantization method for CNNs and FPGA implementation

Convolutional Neural Networks (CNNs) have been widely used in various fields due to their high accuracy and efficiency. The performance of CNNs is mainly affected by the computing capability, memory bandwidth,...

Tao Jiang, Ligang Xing, **ming Yu, Junchao Qian in Journal of Real-Time Image Processing (2024)

Article

Shuff-BiseNet: a dual-branch segmentation network for pavement cracks

In order to accurately obtain the shape and size of pavement cracks, analyze the severity of pavement cracks, avoid deterioration of the situation, and take timely measures, we proposed a dual-branch structure...

Haiqun Wang, Bingnan Wang, Tao Zhao in Signal, Image and Video Processing (2024)

Article

Research on image caption generation method based on multi-modal pre-training model and text mixup optimization

In recent years, multi-modal pre-training models have demonstrated remarkable cross-modal representation capabilities, catalyzing the rapid evolution of multi-modal downstream tasks, particularly in image capt...

Jing-Tao Sun, Xuan Min in Signal, Image and Video Processing (2024)

Article

YOLO-MTG: a lightweight YOLO model for multi-target garbage detection

With wide adoption of deep learning technology in AI, intelligent garbage detection has become a hot research topic. However, existing datasets currently used for garbage detection rarely involve multi-catego...

Zhongyi Xia, Houkui Zhou, Huimin Yu, Haoji Hu… in Signal, Image and Video Processing (2024)

Article

An effective masked transformer network for image denoising

The rising popularity of employing deep learning networks for image denoising can be observed over the past decade. Typically, their exceptional performance is rooted in their ability to learn the mapping from...

Shaoping Xu, Nan Xiao, Wuyong Tao, Changfei Zhou… in Signal, Image and Video Processing (2024)

Article

Human risky behaviour recognition during ladder climbing based on multi-modal feature fusion and adaptive graph convolutional network

Human falls during ladder climbing are typically instantaneous, making the timely and accurate determination of security risks during ladder climbing a challenging engineering issue. A skeleton-based behaviour...

Wenrui Zhu, Donghui Shi, Rui Cheng, Ruifeng Huang… in Signal, Image and Video Processing (2024)

Article

Boosting image denoising effect via low-level noise injection

In the past decade, supervised denoising models trained on large datasets have demonstrated impressive performance in image denoising due to their superior denoising effect. However, these models lack flexibil...

Jian Xiao, Xiaohui Cheng, Shaoping Xu, Wuyong Tao… in Signal, Image and Video Processing (2024)

Article

Accurate and real-time visual detection algorithm for environmental perception of USVS under all-weather conditions

Owing to the intricate and ever-changing nature of the marine environment, traditional marine survey methods are subject to numerous limitations. Unmanned surface vehicles (USVs) have gained significant popula...

Kaiyuan Dong, Tao Liu, Zhen Shi, Yang Zhang in Journal of Real-Time Image Processing (2024)

Article

Particle recognition and shape parameter detection based on deep learning

The size and shape parameters of sand particles are closely related to their geophysical and geomechanical properties. It is challenging to accurately identify sand particles and calculate their shape paramete...

Xuan Li, Zhou Yang, Xinyu Tao, Xiaojie Wang… in Signal, Image and Video Processing (2024)

Chapter and Conference Paper

Semi-End-to-End Nested Named Entity Recognition from Speech

There are two approaches for Named Entity Recognition (NER) from speech: two-step pipeline and End-to-End (E2E). In the pipeline approach, cascading errors are inevitable. In the E2E approach, its annotation m...

Min Zhang, XiaoSong Qiao, Yanqing Zhao, Chang Su… in Man-Machine Speech Communication (2024)

Chapter and Conference Paper

IvyGPT: InteractiVe Chinese Pathway Language Model in Medical Domain

General large language models (LLMs) such as ChatGPT have shown remarkable success. However, such LLMs have not been widely adopted for medical purposes, due to poor accuracy and inability to provide medical a...

Rongsheng Wang, Yaofei Duan, ChanTong Lam, Jiexin Chen… in Artificial Intelligence (2024)

Chapter and Conference Paper

Detecting Software Vulnerabilities Based on Hierarchical Graph Attention Network

Detecting software vulnerabilities is a crucial part of software security. At present, the most commonly used methods are to train supervised classification or regression models from the source code to detect ...

Wenlin Xu, Tong Li, **song Wang, Tao Fu, Yahui Tang in Artificial Intelligence (2024)

Chapter and Conference Paper

Within- and Between-Class Sample Interpolation Based Supervised Metric Learning for Speaker Verification

Metric learning aims to pull together the samples belonging to the same class and push apart those from different classes in embedding space. Existing methods may suffer from inadequate and low-quality sample ...

Jian-Tao Zhang, Hao-Yu Song, Wu Guo, Yan Song… in Man-Machine Speech Communication (2024)

Chapter and Conference Paper

TST: Time-Sparse Transducer for Automatic Speech Recognition

End-to-end model, especially Recurrent Neural Network Transducer (RNN-T), has achieved great success in speech recognition. However, transducer requires a great memory footprint and computing time when process...

Xiaohui Zhang, Mangui Liang, Zhengkun Tian, Jiangyan Yi… in Artificial Intelligence (2024)

Chapter and Conference Paper

YueGraph: A Prototype for Yue Opera Lineage Review Based on Knowledge Graph

Yue opera, as one of the representatives of China’s intangible cultural heritage, embodies a profound regional history and folk art. This paper focuses on utilizing knowledge graphs to promote research and pre...

Song** Yang, Fuxiang Fu, Chenxi Zhu, Hao Zeng, Youbing Zhao… in Artificial Intelligence (2024)

Chapter and Conference Paper

MetaVSR: A Novel Approach to Video Super-Resolution for Arbitrary Magnification

Video super-resolution is a pivotal task that involves the recovery of high-resolution video frames from their low-resolution counterparts, possessing a multitude of applications in real-world scenarios. Withi...

Zixuan Hong, Weipeng Cao, Zhiwu Xu, Zhenru Chen, ** Tao, Zhong Ming… in MultiMedia Modeling (2024)

149 Result(s)

