Audio Acquisition, Representation and Storage

doi:10.1007/978-1-84800-007-0_2

Part of the book series: Advanced Information and Knowledge Processing (AI&KP)


The goal of this chapter is to provide basic notions about digital audio processing technologies. These technologies are used in many everyday products, such as telephones, radio and television sets, video games, CD players, and cellular phones. Although the spectrum of applications is wide, the main problems to be addressed in order to manipulate digital sound are essentially three: acquisition, representation and storage. Acquisition is the process of converting the physical phenomenon we call sound into a form suitable for digital processing; representation is the problem of extracting from the sound the information necessary to perform a specific task; and storage is the problem of reducing the number of bits necessary to encode acoustic signals.




About this chapter

Cite this chapter

(2008). Audio Acquisition, Representation and Storage. In: Machine Learning for Audio, Image and Video Analysis. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-84800-007-0_2


DOI: https://doi.org/10.1007/978-1-84800-007-0_2
Publisher Name: Springer, London
Print ISBN: 978-1-84800-006-3
Online ISBN: 978-1-84800-007-0
eBook Packages: Computer Science (R0)
