
Fast Ultra High-Definition Video Deblurring via Multi-scale Separable Network

Published in: International Journal of Computer Vision

Abstract

Although significant progress has been made in image and video deblurring, much less attention has been paid to processing ultra-high-definition (UHD) videos (e.g., 4K resolution). In this work, we propose a novel deep model for fast and accurate UHD video deblurring (UHDVD). The proposed UHDVD is built on a depth-wise separable-patch architecture that operates with a multi-scale integration scheme, achieving a large receptive field without increasing the number of generic convolutional layers and kernels. Additionally, we adopt a temporal feature attention module to effectively exploit the temporal correlation between video frames and obtain clearer restored images. We design an asymmetrical encoder–decoder architecture with residual channel-spatial attention blocks to improve accuracy while keeping the network depth appropriately small. Consequently, the proposed UHDVD achieves real-time performance on 4K videos at 30 fps. To train the proposed model, we build a new dataset comprising 4K blurry videos and corresponding sharp frames captured with three different smartphones. Extensive experimental results show that our network performs favorably against state-of-the-art methods on the proposed 4K dataset and on existing 720p and 2K benchmarks in terms of accuracy, speed, and model size.
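The abstract names the two ingredients that make 4K-rate inference plausible: depth-wise separable convolutions and residual channel-spatial attention blocks. The PyTorch sketch below illustrates both ideas. It is a minimal illustration under our own assumptions (module names, channel counts, and the squeeze-and-excitation-style gating are ours), not the authors' UHDVD implementation.

```python
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    """A per-channel (depth-wise) k x k convolution followed by a 1 x 1
    point-wise convolution, approximating a standard convolution at a
    fraction of the cost."""

    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


class ResidualChannelSpatialAttention(nn.Module):
    """Residual block gated first along channels, then along spatial
    positions; an illustrative stand-in for the paper's attention block."""

    def __init__(self, ch, reduction=8):
        super().__init__()
        self.body = nn.Sequential(
            DepthwiseSeparableConv(ch, ch), nn.ReLU(inplace=True),
            DepthwiseSeparableConv(ch, ch),
        )
        # Channel attention: global pooling -> bottleneck MLP -> per-channel gate.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch // reduction, ch, 1), nn.Sigmoid(),
        )
        # Spatial attention: a single-channel gate over spatial positions.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(ch, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x):
        f = self.body(x)
        f = f * self.channel_gate(f)   # broadcast over H x W
        f = f * self.spatial_gate(f)   # broadcast over channels
        return x + f                   # residual connection


if __name__ == "__main__":
    block = ResidualChannelSpatialAttention(ch=32)
    feat = torch.randn(1, 32, 270, 480)  # e.g., a downscaled 4K feature map
    print(block(feat).shape)             # torch.Size([1, 32, 270, 480])
```

A depth-wise separable convolution factors one k x k convolution over all channel pairs into a per-channel k x k convolution plus a 1 x 1 mixing convolution, cutting multiply-adds by roughly a factor of k squared when the channel count is large; this is what lets such blocks be stacked in a multi-scale encoder–decoder without breaking a real-time budget.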




Notes

  1. 4KRD dataset: https://drive.google.com/drive/folders/19bjJLMgQkwIAQaZYvsUhEVaxzJQFwhHF?usp=sharing (one way to fetch the folder programmatically is sketched below).
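The sketch below shows one way to download the dataset folder. It assumes the third-party gdown package (pip install gdown) and a hypothetical output directory name 4KRD; Google Drive folder downloads can be rate-limited or capped, so downloading via a browser remains the fallback.

```python
# Sketch: fetch the 4KRD dataset folder listed in the note above.
# Assumes the third-party `gdown` package: pip install gdown
import gdown

URL = "https://drive.google.com/drive/folders/19bjJLMgQkwIAQaZYvsUhEVaxzJQFwhHF?usp=sharing"
gdown.download_folder(url=URL, output="4KRD", quiet=False)
```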


Acknowledgements

This research was funded in part by the National Natural Science Foundation of China (NSFC) under Grants 62322216 and 62172409, and in part by the Shenzhen Science and Technology Program (Grants RCYX20221008092849068, JCYJ20220818102012025, and JCYJ20220530145209022).

Author information


Corresponding author

Correspondence to Ming-Hsuan Yang.

Additional information

Communicated by Chen Change Loy.


About this article


Cite this article

Ren, W., Deng, S., Zhang, K. et al. Fast Ultra High-Definition Video Deblurring via Multi-scale Separable Network. Int J Comput Vis 132, 1817–1834 (2024). https://doi.org/10.1007/s11263-023-01958-9

