Abstract
The previous chapters described how incoming acoustic information is distributed over a large array of auditory filters. This information generally does not originate from a single sound source but from multiple sources, as at a cocktail party, in a bustling restaurant, or in a marketplace. In these circumstances, the auditory system must work out which sound components come from which source, and reassemble the auditory information in such a way that the listener can meaningfully interpret the auditory environment in terms of auditory events such as footsteps, closing doors, human speech, and musical melodies. This is called auditory scene analysis. This chapter describes the first stage of this process, the formation of auditory units. An auditory unit is a sound to which the auditory system attributes one perceived moment of occurrence, indicated by its beat. In speech, the auditory units will in general be identified with the syllables that constitute the speech. It will be argued that the beat of a syllable is formed by the cluster of onsets defined by its first consonants and its vowel. In music, the separate tones of the melodies are the auditory units. Their beats are naturally formed by the onsets of the frequency components that make up the tones. Since every auditory unit has a beat, auditory units can be counted, which is why they are called units. The chapter then sets out the organizing principles operating in the process of auditory-unit formation. It is argued that the attributes of auditory units, such as their timbre, loudness, and pitch, emerge from the process of auditory-unit formation and are thus defined by the auditory information that contributes to the formation of these units.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Hermes, D.J. (2023). Auditory-Unit Formation. In: The Perceptual Structure of Sound. Current Research in Systematic Musicology, vol 11. Springer, Cham. https://doi.org/10.1007/978-3-031-25566-3_4
DOI: https://doi.org/10.1007/978-3-031-25566-3_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25565-6
Online ISBN: 978-3-031-25566-3
eBook Packages: Physics and Astronomy (R0)