Fusing steering kernel guided filtering with U-NET ConvLSTM for elevated video quality enhancement

Chourasia, Sachin; Patel, Prabhat; Jain, Prashant Kumar

doi:10.1007/s11042-024-19636-4

Published: 19 June 2024

Multimedia Tools and Applications (2024)

Abstract

Video quality enhancement is a fundamental task in video processing that aims to improve visual quality and restore detail in degraded video sequences. This paper presents a novel approach that integrates guided image filtering with the U-Net Convolutional Long Short-Term Memory (ConvLSTM) architecture to enhance video quality. The steering kernel guided filtering (SKGF) technique is applied as a preprocessing step to individual frames. It uses guidance information from a high-quality reference frame to filter out noise and enhance details, improving the overall quality of each frame. To capture both spatial and temporal information in the video, we employ the U-Net ConvLSTM architecture. The U-Net component is an encoder-decoder structure that captures spatial features and preserves fine details through skip connections. The ConvLSTM layers model temporal dependencies, which allows temporal coherence and motion information to be restored in the enhanced video. In our experimental analysis, we assess the proposed approach on a benchmark dataset designed for video quality enhancement tasks. The results demonstrate that guided image filtering combined with the U-Net ConvLSTM architecture achieves significant improvements in visual quality, restoration of fine details, and preservation of temporal coherence compared with existing methods.
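
To make the preprocessing idea concrete, the following is a minimal sketch of the basic guided image filter (He, Sun and Tang) on a single grayscale frame, written in NumPy. It is not the steering kernel weighted variant that the abstract names, and it is not the authors' implementation. The function names `box_mean` and `guided_filter`, the window radius `r`, and the regularization `eps` are assumptions chosen for illustration.

```python
import numpy as np


def box_mean(img, r):
    """Mean over a (2r+1) x (2r+1) window, replicating edge pixels at the borders."""
    k = 2 * r + 1
    padded = np.pad(img, r, mode="edge")
    # Integral image with a leading row and column of zeros.
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    h, w = img.shape
    total = ii[k:k + h, k:k + w] - ii[:h, k:k + w] - ii[k:k + h, :w] + ii[:h, :w]
    return total / (k * k)


def guided_filter(guide, src, r=4, eps=1e-2):
    """Filter `src` using `guide` as the reference frame (plain guided filter).

    Assumption: both inputs are 2-D float arrays of the same shape in [0, 1].
    """
    I = guide.astype(np.float64)
    p = src.astype(np.float64)
    mean_I = box_mean(I, r)
    mean_p = box_mean(p, r)
    var_I = box_mean(I * I, r) - mean_I ** 2
    cov_Ip = box_mean(I * p, r) - mean_I * mean_p
    # Local linear model q = a * I + b fitted in each window.
    a = cov_Ip / (var_I + eps)
    b = mean_p - a * mean_I
    # Average the coefficients of all windows that cover each pixel.
    return box_mean(a, r) * I + box_mean(b, r)


# Illustrative usage on synthetic data (not the paper's dataset).
rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))    # hypothetical reference frame
noisy = clean + 0.1 * rng.standard_normal(clean.shape)  # hypothetical degraded frame
filtered = guided_filter(clean, noisy)
```

In a pipeline like the one described, each filtered frame would then be passed to the spatio-temporal network. A larger `eps` smooths more strongly, while a smaller one follows the guide's edges more closely.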

Data availability

Yes.

Code availability

No.

Funding

No.

Author information

Authors and Affiliations

University Institute of Technology, Rajiv Gandhi Proudyogiki Vishwavidyalaya, Bhopal, India
Sachin Chourasia, Prabhat Patel & Prashant Kumar Jain

Corresponding author

Correspondence to Sachin Chourasia.

Ethics declarations

Ethics approval

Yes.

Consent to participate

Yes.

Consent for publication

Yes.

Conflicts of interest

No.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chourasia, S., Patel, P. & Jain, P.K. Fusing steering kernel guided filtering with U-NET ConvLSTM for elevated video quality enhancement. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-19636-4

Received: 25 August 2023
Revised: 08 April 2024
Accepted: 07 June 2024
Published: 19 June 2024
DOI: https://doi.org/10.1007/s11042-024-19636-4
