Abstract
Video quality enhancement is a fundamental task in video processing that aims to improve visual quality and restore detail in degraded video sequences. This paper presents a novel approach that integrates steering kernel guided filtering (SKGF) with a U-Net Convolutional Long Short-Term Memory (ConvLSTM) architecture. SKGF is applied as a preprocessing step to the individual frames of the video: it uses guidance information from a high-quality reference frame to suppress noise and enhance details, improving the overall quality of each frame. To capture both spatial and temporal information in the video, we employ the U-Net ConvLSTM architecture. The U-Net component is an encoder-decoder structure that captures spatial features and preserves fine details through skip connections, while the ConvLSTM layers model temporal dependencies, restoring temporal coherence and motion information in the enhanced video. We evaluate the proposed approach on a benchmark dataset designed for video quality enhancement tasks. The results demonstrate that guided filtering combined with the U-Net ConvLSTM architecture achieves significant improvements in visual quality, restoration of fine details, and preservation of temporal coherence compared with existing methods.
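To make the preprocessing idea concrete, the following is a minimal sketch of the basic single-channel guided filter (He et al.), in which a guidance frame steers the smoothing of a degraded frame. It is not the steering kernel variant (SKGF) used in the paper, which additionally weights the local model with steering kernels. The function names `box_mean` and `guided_filter`, the window radius `r`, and the regularizer `eps` are illustrative assumptions, not values taken from the paper.

```python
import numpy as np


def box_mean(img, r):
    """Mean over a (2r+1) x (2r+1) window, using edge padding and an integral image."""
    k = 2 * r + 1
    h, w = img.shape
    padded = np.pad(img.astype(np.float64), r, mode="edge")
    # Integral image with a leading row and column of zeros.
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    window_sum = (ii[k:k + h, k:k + w] - ii[:h, k:k + w]
                  - ii[k:k + h, :w] + ii[:h, :w])
    return window_sum / (k * k)


def guided_filter(guide, src, r=2, eps=1e-3):
    """Basic guided filter: fit src ~ a * guide + b in each local window.

    guide : 2-D array, the (assumed) high-quality guidance frame
    src   : 2-D array of the same shape, the degraded frame to filter
    r     : window radius (illustrative default)
    eps   : regularizer; larger values give stronger smoothing (illustrative default)
    """
    guide = guide.astype(np.float64)
    src = src.astype(np.float64)

    mean_i = box_mean(guide, r)
    mean_p = box_mean(src, r)
    var_i = box_mean(guide * guide, r) - mean_i * mean_i
    cov_ip = box_mean(guide * src, r) - mean_i * mean_p

    # Per-window linear coefficients.
    a = cov_ip / (var_i + eps)
    b = mean_p - a * mean_i

    # Average the coefficients of all windows covering each pixel.
    return box_mean(a, r) * guide + box_mean(b, r)


# Toy usage: a clean step edge guides the filtering of a noisy copy of itself.
rng = np.random.default_rng(0)
clean = np.zeros((32, 32))
clean[:, 16:] = 1.0
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
filtered = guided_filter(clean, noisy, r=2, eps=1e-3)
```

In this toy setup the output follows the edges of the guidance image while averaging out noise in flat regions, which is the behaviour the abstract attributes to the guided preprocessing step. In a video pipeline the guidance would be a reference frame rather than the clean ground truth, and each colour channel would be handled separately or with the multi-channel form of the filter.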
Data availability
Yes.
Code availability
No.
Funding
No.
Ethics declarations
Ethics approval
Yes.
Consent to participate
Yes.
Consent for publication
Yes.
Conflicts of interest
No.
About this article
Cite this article
Chourasia, S., Patel, P. & Jain, P.K. Fusing steering kernel guided filtering with U-NET ConvLSTM for elevated video quality enhancement. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-19636-4
DOI: https://doi.org/10.1007/s11042-024-19636-4