  1. No Access

    Article

    OV-DAR: Open-Vocabulary Object Detection and Attributes Recognition

    In this paper, we endeavor to localize all potential objects in an image and infer their visual categories, attributes, and shapes, even in instances where certain objects have not been encompassed in the mode...

    Keyan Chen, Xiaolong Jiang, Haochen Wang in International Journal of Computer Vision (2024)

  2. No Access

    Article

    OV-VIS: Open-Vocabulary Video Instance Segmentation

    Conventionally, the goal of Video Instance Segmentation (VIS) is to segment and categorize objects in videos from a closed set of training categories, lacking the generalization ability to handle novel categor...

    Haochen Wang, Cilin Yan, Keyan Chen in International Journal of Computer Vision (2024)

  3. No Access

    Chapter and Conference Paper

    A Novel Ensemble Approach for Click-Through Rate Prediction Based on Factorization Machines and Gradient Boosting Decision Trees

    Click-Through Rate (CTR) prediction is a significant technique in the field of computational advertising; its accuracy directly affects companies' profits and user experience. Achieving great ability of general...

    Xiaochen Wang, Gang Hu, Haoyang Lin, Jiayu Sun in Web and Big Data (2019)

  4. No Access

    Chapter and Conference Paper

    Spectral Tilt Estimation for Speech Intelligibility Enhancement Using RNN Based on All-Pole Model

    Speech intelligibility enhancement is extremely meaningful for successful speech communication in noisy environments. Several methods based on Lombard effect are used to increase intelligibility. In those meth...

    Rui Zhang, Ruimin Hu, Gang Li, Xiaochen Wang in MultiMedia Modeling (2019)

  5. No Access

    Chapter and Conference Paper

    Head Related Transfer Function Interpolation Based on Aligning Operation

    Head related transfer function (HRTF) is the main technique of binaural synthesis, which is used to reconstruct spatial sound image, and the HRTF data only can be obtained by measurement. A high resolution HRT...

    Tingzhao Wu, Ruimin Hu, Xiaochen Wang in Advances in Multimedia Information Process… (2016)

  6. No Access

    Chapter and Conference Paper

    Research on Perception Sensitivity of Elevation Angle in 3D Sound Field

    The development of virtual reality and three-dimensional (3D) video inspired the concern about 3D audio. 3D audio aims at reconstructing the spatial information of original signals, the spatial perception sens...

    Yafei Wu, Xiaochen Wang, Cheng Yang, Ge Gao in Advances in Multimedia Information Process… (2016)

  7. No Access

    Chapter and Conference Paper

    Real-Life Voice Activity Detection Based on Audio-Visual Alignment

    Voice activity detection (VAD) is a technology to identify whether the persons in multimedia are speaking. Most of the research efforts focused on utilizing audio and visual information to implement voice acti...

    ** Wang, Chao Liang, Xiaochen Wang in Advances in Multimedia Information Process… (2015)

  8. No Access

    Chapter and Conference Paper

    Simplification of 3D Multichannel Sound System Based on Multizone Soundfield Reproduction

    Home sound environments are becoming increasingly important to the entertainment and audio industries. Compared with single zone soundfield reproduction, 3D spatial multizone soundfield reproduction is a more ...

    Bowei Fang, Xiaochen Wang, Song Wang in Advances in Multimedia Information Process… (2015)

  9. No Access

    Chapter and Conference Paper

    3D Panning Based Sound Field Enhancement Method for Ambisonics

    When a conventional first-order Ambisonics system uses four loudspeakers with platonic solid layout to reconstruct sound field, the 3D acoustic field effect is limited. A new signal distribution method is propose...

    Song Wang, Ruimin Hu, Shihong Chen in Advances in Multimedia Information Process… (2015)

  10. No Access

    Chapter and Conference Paper

    Multichannel Simplification Based on Deviation of Loudspeaker Positions

    People hope to achieve a good impression of three-dimensional (3D) spatial sound with fewer loudspeakers at home. The present method simplified the number of loudspeakers based on the minimum area enclosed by ...

    Dengshi Li, Ruimin Hu, in Advances in Multimedia Information Process… (2015)

  11. No Access

    Chapter and Conference Paper

    Low Bitrates Audio Bandwidth Extension Using a Deep Auto-Encoder

    Modern audio coding technologies apply methods of bandwidth extension (BWE) to efficiently represent audio data at low bitrates. An established method is the well-known spectral band replication (SBR) that can...

    Lin Jiang, Ruimin Hu, Xiaochen Wang in Advances in Multimedia Information Process… (2015)

  12. No Access

    Chapter and Conference Paper

    Physical Properties of Sound Field Based Estimation of Phantom Source in 3D

    3D spatial sound effects can be achieved by amplitude panning with several loudspeakers, which can produce the auditory event of phantom source at arbitrary location with loudspeakers at arbitrary locations in...

    Shanfa Ke, Xiaochen Wang, Li Gao in Advances in Multimedia Information Process… (2015)

  13. No Access

    Chapter and Conference Paper

    Multi-channel Object-Based Spatial Parameter Compression Approach for 3D Audio

    To improve the spatial precision of three-dimensional (3D) audio, the bit rates of spatial parameters are increased sharply. This paper presents a spatial parameters compression approach to decrease the bit ra...

    Cheng Yang, Ruimin Hu, Liuyue Su in Advances in Multimedia Information Process… (2015)

  14. No Access

    Chapter and Conference Paper

    Reduction of Multichannel Sound System Based on Spherical Harmonics

    In order to meet people’s demand for 3D audio at home, it is a critical problem to recreate a 3D spatial sound field with few loudspeakers. In this paper, we introduce an L- to (L-1)-channel reduction method base...

    Shanshan Yang, Xiaochen Wang, Dengshi Li in Advances in Multimedia Information Process… (2014)

  15. No Access

    Chapter and Conference Paper

    Automatic Multichannel Simplification with Low Impacts on Sound Pressure at Ears

    People hope to use a minimum arrangement of loudspeakers to reproduce the experience of film 3D sound at home. Although Ando’s conversion can convert n- to m-channel sound system by maintaining the sound pr...

    Dengshi Li, Ruimin Hu, in Advances in Multimedia Information Process… (2014)