Part of the book series: Advanced Information and Knowledge Processing (AI&KP)

The goal of this chapter is to provide basic notions about digital audio processing technologies. These are applied in many everyday products such as phones, radio and television, videogames, CD players, etc. Although the spectrum of applications is wide, the main problems to be addressed in order to manipulate digital sound are essentially three: acquisition, representation and storage. Acquisition is the process of converting the physical phenomenon we call sound into a form suitable for digital processing; representation is the problem of extracting from the sound the information necessary to perform a specific task; and storage is the problem of reducing the number of bits necessary to encode the acoustic signals.




Copyright information

© 2008 Springer

About this chapter

Cite this chapter

(2008). Audio Acquisition, Representation and Storage. In: Machine Learning for Audio, Image and Video Analysis. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-84800-007-0_2


  • DOI: https://doi.org/10.1007/978-1-84800-007-0_2

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-84800-006-3

  • Online ISBN: 978-1-84800-007-0

  • eBook Packages: Computer Science (R0)
