Machine Learning for Multiscale Video Coding

Gashnikov, M. V.

doi:10.3103/S1060992X23030037


Published: 25 September 2023

Volume 32, pages 189–196, (2023)

Optical Memory and Neural Networks


Abstract

This study applies machine learning algorithms to the multiscale coding of digital video sequences. A machine-learning-based digital image coder is generalized to the coding of video sequences. To this end, we propose an algorithm that accounts for the interdependence of video frames by means of linear regression. The generalized coder uses a multiscale representation of video frames, three-dimensional neural-network interpolation of the levels of this multiscale representation, and a generative adversarial network that replaces homogeneous regions of a video frame with synthetic video data. Block diagrams illustrate both the method for coding the entire video and the method for coding individual video frames. A formal description of how inter-frame correlation is taken into account is given. Numerical experiments are carried out on real video sequences. The experimental results suggest that the algorithm is promising for video coding and processing.
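The abstract states that inter-frame dependency is handled by linear regression, but the exact formulation is not available in this preview. The following is only a minimal sketch of the general idea, under the assumption that the current frame is predicted from a single previous frame with one global gain and offset, fitted by least squares, so that only the prediction residual remains to be coded. The function names `fit_frame_regression` and `prediction_residual` and the toy frames are hypothetical and do not come from the article.

```python
def fit_frame_regression(prev_frame, cur_frame):
    """Least-squares fit of cur ~= a * prev + b over all pixels.

    Assumption: one global gain `a` and offset `b` per frame pair;
    the article's actual regression model may differ.
    """
    xs = [p for row in prev_frame for p in row]
    ys = [p for row in cur_frame for p in row]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # Flat previous frame: fall back to predicting the mean.
        return 0.0, mean_y
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


def prediction_residual(prev_frame, cur_frame, a, b):
    """Difference between the current frame and its regression prediction."""
    return [
        [c - round(a * p + b) for p, c in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(prev_frame, cur_frame)
    ]


# Toy 2x3 frames (hypothetical data): the current frame is the
# previous frame brightened by 2 grey levels.
prev_frame = [[10, 20, 30], [40, 50, 60]]
cur_frame = [[12, 22, 32], [42, 52, 62]]

a, b = fit_frame_regression(prev_frame, cur_frame)
residual = prediction_residual(prev_frame, cur_frame, a, b)
print(a, b, residual)
```

When consecutive frames are strongly correlated, the residual is small and cheaper to code than the frame itself, which is the usual motivation for this kind of prediction step.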




Funding

The research was supported by the Russian Science Foundation, project no. 22‑21‑00662.

Author information

Authors and Affiliations

Samara National Research University, 443086, Samara, Russia
M. V. Gashnikov


Corresponding author

Correspondence to M. V. Gashnikov.

Ethics declarations

The author declares that he has no conflicts of interest.

About this article

Cite this article

Gashnikov, M.V. Machine Learning for Multiscale Video Coding. Opt. Mem. Neural Networks 32, 189–196 (2023). https://doi.org/10.3103/S1060992X23030037


Received: 16 February 2023
Revised: 30 May 2023
Accepted: 02 June 2023
Published: 25 September 2023
Issue Date: September 2023
DOI: https://doi.org/10.3103/S1060992X23030037

Key words:
