
Lossless audio codec based on CNN, weighted tree and arithmetic encoding (LACCWA)

Multimedia Tools and Applications

Abstract

This paper proposes an integrated lossless audio codec that improves the compression ratio by combining traditional coding methods with a convolutional neural network (CNN) architecture, without losing any audio information. The method applies adaptive arithmetic encoding, then a binary weighted tree based transform, and finally a CNN. The first level of compression uses adaptive arithmetic encoding to process audio samples block by block, reducing the data to one encoded value per block. Subsequently, the arithmetic-encoded value of each block is encoded by the binary weighted tree based encoding, which estimates a dynamic weighted path and transforms the arithmetic-encoded data into an equivalent binary pattern. The binary stream is further compressed into latent space representations using the proposed CNN architecture. The simulation results are analysed using various statistical and robustness characteristics and compared with other existing methods.
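The first stage described above, block-wise adaptive arithmetic encoding, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the small alphabet, the uniform starting counts, and the use of exact rationals (`fractions.Fraction`) instead of a fixed-precision bit stream are all assumptions made for readability. It shows only how each block of samples can collapse to a single value that decodes back losslessly.

```python
from fractions import Fraction

# Illustrative sketch only: names, alphabet size and block size are assumptions,
# not details taken from the paper.

def encode_block(block, alphabet_size):
    """Encode one block of symbols (ints in [0, alphabet_size)) as one rational."""
    counts = [1] * alphabet_size          # adaptive model starts uniform
    low, width = Fraction(0), Fraction(1)
    for s in block:
        total = sum(counts)
        cum = sum(counts[:s])             # cumulative count below symbol s
        low += width * Fraction(cum, total)
        width *= Fraction(counts[s], total)
        counts[s] += 1                    # adapt the model after coding s
    return low + width / 2                # any point inside the final interval

def decode_block(value, length, alphabet_size):
    """Invert encode_block, given the block length and the same initial model."""
    counts = [1] * alphabet_size
    low, width = Fraction(0), Fraction(1)
    out = []
    for _ in range(length):
        total = sum(counts)
        target = (value - low) / width    # position inside the current interval
        cum = 0
        for s, c in enumerate(counts):
            if target < Fraction(cum + c, total):
                break
            cum += c
        out.append(s)
        low += width * Fraction(cum, total)
        width *= Fraction(counts[s], total)
        counts[s] += 1
    return out

def encode_samples(samples, block_size, alphabet_size):
    """Split samples into blocks and return one encoded value per block."""
    blocks = [samples[i:i + block_size] for i in range(0, len(samples), block_size)]
    return [(encode_block(b, alphabet_size), len(b)) for b in blocks]

# Usage: eight toy samples in two blocks yield two encoded values.
samples = [3, 3, 2, 3, 0, 1, 1, 1]
codes = encode_samples(samples, block_size=4, alphabet_size=4)
decoded = [s for value, n in codes for s in decode_block(value, n, 4)]
```

A practical codec would use integer range arithmetic with renormalisation rather than unbounded rationals; the exact-fraction form is chosen here only because it makes the lossless round trip easy to verify.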


Data Availability

Data sharing is not applicable to this article, as no datasets were generated or analysed during the current study.


Funding

No funds, grants, or other support was received.


Contributions

Both authors confirm responsibility for the following: study conception and design, data collection, analysis and interpretation of results, and manuscript preparation.

Corresponding author

Correspondence to Uttam Kr. Mondal.

Ethics declarations

Conflicts of interest

All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Debnath, A., Mondal, U. Lossless audio codec based on CNN, weighted tree and arithmetic encoding (LACCWA). Multimed Tools Appl 83, 48737–48759 (2024). https://doi.org/10.1007/s11042-023-17393-4


  • DOI: https://doi.org/10.1007/s11042-023-17393-4
