Fusing steering kernel guided filtering with U-NET ConvLSTM for elevated video quality enhancement

Multimedia Tools and Applications

Abstract

Video quality enhancement is a fundamental task in video processing that aims to improve visual quality and restore detail in degraded video sequences. This paper presents an approach that integrates guided image filtering with the U-Net Convolutional Long Short-Term Memory (ConvLSTM) architecture for video quality enhancement. Steering kernel guided filtering (SKGF) is applied as a preprocessing step to the individual frames of the video: it uses guidance information from a high-quality reference frame to suppress noise and enhance detail, improving the quality of each frame. To capture both spatial and temporal information in the video, we employ the U-Net ConvLSTM architecture. The U-Net component is an encoder-decoder structure that captures spatial features and preserves fine details through skip connections, while the ConvLSTM layers model temporal dependencies, restoring temporal coherence and motion information in the enhanced video. We evaluate the proposed approach on a benchmark dataset designed for video quality enhancement tasks. The results show that guided image filtering combined with the U-Net ConvLSTM architecture achieves significant improvements in visual quality, restoration of fine details, and preservation of temporal coherence compared with existing methods.
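To make the preprocessing idea concrete, the following is a minimal sketch of the basic guided image filter (the local linear model of He et al.), which the steering kernel variant builds on. It is not the authors' SKGF implementation: the steering kernel weighting is omitted, and the function names `box_mean` and `guided_filter`, the window radius `r`, and the regularizer `eps` are illustrative assumptions. It operates on a single grayscale frame held in a NumPy array.

```python
import numpy as np


def box_mean(img, r):
    """Mean over a (2r+1) x (2r+1) window, clipped at the image border."""
    h, w = img.shape
    # Integral image with a leading row and column of zeros.
    ii = np.zeros((h + 1, w + 1), dtype=np.float64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    # Window bounds per row and per column (half-open intervals).
    y0 = np.clip(np.arange(h) - r, 0, h)
    y1 = np.clip(np.arange(h) + r + 1, 0, h)
    x0 = np.clip(np.arange(w) - r, 0, w)
    x1 = np.clip(np.arange(w) + r + 1, 0, w)
    total = ii[y1][:, x1] - ii[y0][:, x1] - ii[y1][:, x0] + ii[y0][:, x0]
    area = (y1 - y0)[:, None] * (x1 - x0)[None, :]
    return total / area


def guided_filter(guide, src, r=2, eps=1e-2):
    """Plain guided filter (assumption: no steering kernel weighting).

    In each window the output is modeled as q = a * guide + b, with a and b
    chosen by regularized least squares against src.
    """
    I = guide.astype(np.float64)
    p = src.astype(np.float64)
    mean_I = box_mean(I, r)
    mean_p = box_mean(p, r)
    var_I = box_mean(I * I, r) - mean_I ** 2
    cov_Ip = box_mean(I * p, r) - mean_I * mean_p
    a = cov_Ip / (var_I + eps)   # close to 1 at strong edges, close to 0 in flat areas
    b = mean_p - a * mean_I
    # Average the coefficients of all windows that cover each pixel.
    return box_mean(a, r) * I + box_mean(b, r)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.zeros((32, 32))
    clean[:, 16:] = 1.0                       # synthetic step edge
    noisy = clean + rng.normal(0.0, 0.05, clean.shape)
    out = guided_filter(noisy, noisy, r=2, eps=1e-2)
    print("flat-region std before:", noisy[:, :10].std())
    print("flat-region std after: ", out[:, :10].std())
```

In the demo the frame guides itself; in the setting the abstract describes, `guide` would instead be the high-quality reference frame and `src` the degraded frame. A larger `eps` smooths more aggressively, at the cost of softer edges.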

Data availability

Yes.

Code availability

No.

Funding

No.

Author information

Corresponding author

Correspondence to Sachin Chourasia.

Ethics declarations

Ethics approval

Yes.

Consent to participate

Yes.

Consent for publication

Yes.

Conflicts of interest

No.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chourasia, S., Patel, P. & Jain, P.K. Fusing steering kernel guided filtering with U-NET ConvLSTM for elevated video quality enhancement. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-19636-4

  • DOI: https://doi.org/10.1007/s11042-024-19636-4
