
Hybrid Warping Fusion for Video Frame Interpolation

International Journal of Computer Vision

Abstract

Video frame interpolation aims to synthesize new intermediate frames between existing ones, an important task in video enhancement. A classic direction in this field is the flow-based approach, which estimates motion in the form of optical flow, warps the frames, and synthesizes the final result. In this work, we explicitly investigate the warping step and propose a way to combine the strengths of both forward and backward warping. Our method, named HWFI, introduces hybrid warping fusion for frame interpolation. We also incorporate edge information explicitly in our pipeline and employ channel attention in our synthesis network. Compared to the latest state-of-the-art method, which uses only forward warping, our method produces results of higher quality, especially in edge regions. Extensive experiments show that our method obtains the best results qualitatively and quantitatively on multiple benchmark datasets.
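The two warping directions fused by this approach behave quite differently, and the sketch below contrasts them. It is a minimal PyTorch illustration, not the authors' implementation: backward warping samples the source image with grid_sample, so every output pixel gets a value but occluded regions are ghosted; forward warping splats source pixels to their flowed positions, keeping motion boundaries sharp at the cost of holes and collisions. The flow convention (channel 0 horizontal, channel 1 vertical, in pixels) and the simple averaging of colliding splats are assumptions chosen for brevity; published forward-warping interpolators typically use softer schemes such as softmax splatting.

    # Minimal sketch (not the paper's code) of backward vs. forward warping.
    import torch
    import torch.nn.functional as F

    def backward_warp(img, flow):
        # img: (B, C, H, W); flow: (B, 2, H, W), target->source displacement in pixels.
        b, _, h, w = img.shape
        ys, xs = torch.meshgrid(
            torch.arange(h, dtype=img.dtype, device=img.device),
            torch.arange(w, dtype=img.dtype, device=img.device),
            indexing="ij")
        x = xs[None] + flow[:, 0]                      # sampling x-coordinates
        y = ys[None] + flow[:, 1]                      # sampling y-coordinates
        grid = torch.stack((2 * x / (w - 1) - 1,       # normalize to [-1, 1]
                            2 * y / (h - 1) - 1), dim=-1)
        # Every output pixel receives a value; occluded regions get ghosted content.
        return F.grid_sample(img, grid, align_corners=True)

    def forward_warp(img, flow):
        # img: (B, C, H, W); flow: (B, 2, H, W), source->target displacement in pixels.
        b, c, h, w = img.shape
        ys, xs = torch.meshgrid(
            torch.arange(h, device=img.device),
            torch.arange(w, device=img.device),
            indexing="ij")
        # Nearest-neighbor splat targets, clamped to the image border.
        tx = (xs[None] + flow[:, 0]).round().long().clamp(0, w - 1)
        ty = (ys[None] + flow[:, 1]).round().long().clamp(0, h - 1)
        idx = (ty * w + tx).view(b, 1, -1).expand(-1, c, -1)     # flat target indices
        colors = torch.zeros(b, c, h * w, device=img.device, dtype=img.dtype)
        hits = torch.zeros(b, 1, h * w, device=img.device, dtype=img.dtype)
        colors.scatter_add_(2, idx, img.reshape(b, c, -1))       # accumulate splats
        hits.scatter_add_(2, idx[:, :1], torch.ones_like(hits))  # count collisions
        # Average colliding splats; pixels with zero hits remain holes (zeros).
        return (colors / hits.clamp(min=1)).view(b, c, h, w)

A hybrid scheme can then, for instance, take forward-warped pixels where they exist and fall back to the backward-warped estimate in the holes; how the two warps are actually fused, and how edge information and channel attention enter the synthesis network, is the subject of the paper itself.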



Author information


Corresponding author

Correspondence to Yu Li.

Additional information

Communicated by Shaodi You.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (mp4 24209 KB)

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Li, Y., Zhu, Y., Li, R. et al. Hybrid Warping Fusion for Video Frame Interpolation. Int J Comput Vis 130, 2980–2993 (2022). https://doi.org/10.1007/s11263-022-01683-9

