Auditory-Unit Formation

The Perceptual Structure of Sound

Part of the book series: Current Research in Systematic Musicology ((CRSM,volume 11))

Abstract

The previous chapters described how incoming acoustic information is distributed over a large array of auditory filters. This information generally does not originate from a single sound source but can come from several, as at a cocktail party, in a bustling restaurant, or in a marketplace. In these circumstances, the auditory system must determine which sound components come from which sound source, and reassemble the auditory information in such a way that the listener can meaningfully interpret the auditory environment in terms of auditory events such as footsteps, closing doors, human speech, and musical melodies. This process is called auditory scene analysis. This chapter describes its first stage, the formation of auditory units. An auditory unit is a sound to which the auditory system attributes one perceived moment of occurrence, indicated by its beat. In speech, the auditory units will in general be identified with the syllables. It will be argued that the beat of a syllable is formed by the cluster of onsets defined by its first consonants and its vowel. In music, the separate tones of the melodies are the auditory units. Their beats are naturally formed by the onsets of the frequency components that make up the tones. Since every auditory unit has a beat, auditory units can be counted, which is why they are called units. This chapter describes the organizing principles operating in the process of auditory-unit formation. It is argued that the attributes of the auditory units, such as their timbre, loudness, and pitch, emerge from the process of auditory-unit formation and are thus defined by the auditory information that contributes to the formation of these units.

Author information

Correspondence to Dik J. Hermes.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Hermes, D.J. (2023). Auditory-Unit Formation. In: The Perceptual Structure of Sound. Current Research in Systematic Musicology, vol 11. Springer, Cham. https://doi.org/10.1007/978-3-031-25566-3_4
