
Fast Ultra High-Definition Video Deblurring via Multi-scale Separable Network

Published in: International Journal of Computer Vision

Abstract

Although significant progress has been made in image and video deblurring, much less attention has been paid to processing ultra-high-definition (UHD) videos (e.g., 4K resolution). In this work, we propose a novel deep model for fast and accurate UHD video deblurring (UHDVD). The proposed UHDVD is built on a depth-wise separable-patch architecture that operates with a multi-scale integration scheme, achieving a large receptive field without increasing the number of generic convolutional layers and kernels. Additionally, we adopt a temporal feature attention module to effectively exploit the temporal correlation between video frames and obtain clearer restored images. We design an asymmetrical encoder–decoder architecture with residual channel-spatial attention blocks to improve accuracy while keeping the network depth appropriately small. Consequently, the proposed UHDVD achieves real-time performance on 4K videos at 30 fps. To train the proposed model, we build a new dataset comprising 4K blurry videos and corresponding sharp frames captured with three different smartphones. Extensive experimental results show that our network performs favorably against state-of-the-art methods on the proposed 4K dataset and on existing 720p and 2K benchmarks in terms of accuracy, speed, and model size.
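The abstract names the two ingredients that make 4K-rate inference plausible: depth-wise separable convolutions and residual channel-spatial attention blocks. The PyTorch sketch below illustrates both ideas. It is a minimal illustration under our own assumptions (module names, channel counts, and the squeeze-and-excitation-style gating are ours), not the authors' UHDVD implementation.

```python
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    """A per-channel (depth-wise) k x k convolution followed by a 1 x 1
    point-wise convolution, approximating a standard convolution at a
    fraction of the cost."""

    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


class ResidualChannelSpatialAttention(nn.Module):
    """Residual block gated first along channels, then along spatial
    positions; an illustrative stand-in for the paper's attention block."""

    def __init__(self, ch, reduction=8):
        super().__init__()
        self.body = nn.Sequential(
            DepthwiseSeparableConv(ch, ch), nn.ReLU(inplace=True),
            DepthwiseSeparableConv(ch, ch),
        )
        # Channel attention: global pooling -> bottleneck MLP -> per-channel gate.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch // reduction, ch, 1), nn.Sigmoid(),
        )
        # Spatial attention: a single-channel gate over spatial positions.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(ch, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x):
        f = self.body(x)
        f = f * self.channel_gate(f)   # broadcast over H x W
        f = f * self.spatial_gate(f)   # broadcast over channels
        return x + f                   # residual connection


if __name__ == "__main__":
    block = ResidualChannelSpatialAttention(ch=32)
    feat = torch.randn(1, 32, 270, 480)  # e.g., a downscaled 4K feature map
    print(block(feat).shape)             # torch.Size([1, 32, 270, 480])
```

A depth-wise separable convolution factors one k x k convolution over all channel pairs into a per-channel k x k convolution plus a 1 x 1 mixing convolution, cutting multiply-adds by roughly a factor of k squared when the channel count is large; this is what lets such blocks be stacked in a multi-scale encoder–decoder without breaking a real-time budget.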




Notes

  1. 4KRD dataset: https://drive.google.com/drive/folders/19bjJLMgQkwIAQaZYvsUhEVaxzJQFwhHF?usp=sharing (one way to fetch the folder programmatically is sketched below).
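The sketch below shows one way to download the dataset folder. It assumes the third-party gdown package (pip install gdown) and a hypothetical output directory name 4KRD; Google Drive folder downloads can be rate-limited or capped, so downloading via a browser remains the fallback.

```python
# Sketch: fetch the 4KRD dataset folder listed in the note above.
# Assumes the third-party `gdown` package: pip install gdown
import gdown

URL = "https://drive.google.com/drive/folders/19bjJLMgQkwIAQaZYvsUhEVaxzJQFwhHF?usp=sharing"
gdown.download_folder(url=URL, output="4KRD", quiet=False)
```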


Acknowledgements

This research was funded in part by the National Natural Science Foundation of China (NSFC) under Grants 62322216 and 62172409, and in part by the Shenzhen Science and Technology Program (Grants RCYX20221008092849068, JCYJ20220818102012025, and JCYJ20220530145209022).

Author information


Corresponding author

Correspondence to Ming-Hsuan Yang.

Additional information

Communicated by Chen Change Loy.


About this article


Cite this article

Ren, W., Deng, S., Zhang, K. et al. Fast Ultra High-Definition Video Deblurring via Multi-scale Separable Network. Int J Comput Vis 132, 1817–1834 (2024). https://doi.org/10.1007/s11263-023-01958-9

