Abstract
In this paper, an integrated lossless audio codec with an improved compression ratio is proposed. It combines traditional methods with a convolutional neural network (CNN) architecture without losing any audio information. The method applies adaptive arithmetic encoding, followed by a binary weighted tree based transform and then a CNN. The first level of compression uses adaptive arithmetic encoding to process the audio samples block by block, reducing the size of the data to the number of blocks. Subsequently, the arithmetic-encoded values of every block are encoded by the binary weighted tree based encoding, which estimates a dynamic weighted path and transforms the arithmetic-encoded data into an equivalent binary pattern. The binary stream is further compressed into latent space representations using the proposed CNN architecture. The simulation results are analysed using various statistical and robustness characteristics and compared with other existing methods.
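The first stage described above can be illustrated with a minimal sketch of adaptive arithmetic coding applied to a single block. It is not the authors' implementation: the function names, the tiny sample alphabet, the use of exact rational arithmetic, and the choice of one coded value per block are all assumptions made for illustration, and the binary weighted tree and CNN stages are not covered.

```python
from fractions import Fraction


def _interval(counts, total, symbol):
    """Return the cumulative [low, high) probability bounds of `symbol`."""
    cum = 0
    for s in sorted(counts):
        if s == symbol:
            return Fraction(cum, total), Fraction(cum + counts[s], total)
        cum += counts[s]
    raise ValueError(f"symbol {symbol!r} is not in the alphabet")


def encode_block(block, alphabet):
    """Encode one block of samples into a single rational number in [0, 1)."""
    counts = {s: 1 for s in alphabet}  # adaptive model starts uniform
    total = len(counts)
    low, width = Fraction(0), Fraction(1)
    for sym in block:
        lo, hi = _interval(counts, total, sym)
        # Narrow the current interval to the sub-interval of this symbol.
        low, width = low + width * lo, width * (hi - lo)
        counts[sym] += 1  # update the model after coding the symbol
        total += 1
    return low + width / 2  # any value inside the final interval works


def decode_block(value, length, alphabet):
    """Recover `length` samples by replaying the same adaptive model."""
    counts = {s: 1 for s in alphabet}
    total = len(counts)
    low, width = Fraction(0), Fraction(1)
    out = []
    for _ in range(length):
        target = (value - low) / width  # position inside current interval
        for sym in sorted(counts):
            lo, hi = _interval(counts, total, sym)
            if lo <= target < hi:
                break
        out.append(sym)
        low, width = low + width * lo, width * (hi - lo)
        counts[sym] += 1
        total += 1
    return out


# Hypothetical block of quantized samples over a five-symbol alphabet.
block = [3, -1, 3, 3, 0, -1]
alphabet = [-1, 0, 1, 2, 3]
code = encode_block(block, alphabet)
restored = decode_block(code, len(block), alphabet)
```

Because the decoder rebuilds the same frequency counts as the encoder, no probability table has to be transmitted, and the round trip is exact. A practical codec would replace the unbounded fractions with fixed-precision integer arithmetic and renormalization.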
Data Availability
Data sharing not applicable to this article as no datasets were generated or analysed during the current study.
Funding
No funds, grants, or other support was received.
Author information
Contributions
Both authors confirm responsibility for the following: study conception and design, data collection, analysis and interpretation of results, and manuscript preparation.
Ethics declarations
Conflicts of interest
All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Debnath, A., Mondal, U. Lossless audio codec based on CNN, weighted tree and arithmetic encoding (LACCWA). Multimed Tools Appl 83, 48737–48759 (2024). https://doi.org/10.1007/s11042-023-17393-4
DOI: https://doi.org/10.1007/s11042-023-17393-4