Audiovisual perception of interrupted speech by nonnative listeners

Attention, Perception, & Psychophysics

Abstract

The purpose of the present study was to examine the influence of visual cues in audiovisual perception of interrupted speech by nonnative English listeners and to identify the role of working memory, long-term memory retrieval, and vocabulary knowledge in audiovisual perception by nonnative listeners. The participants included 31 Mandarin-speaking English learners between 19 and 41 years of age. The perceptual stimuli were noise-filled periodically interrupted AzBio and QuickSIN sentences with or without visual cues that showed a male speaker uttering the sentences. In addition to sentence recognition, the listeners completed a semantic fluency task, verbal (operation span) and visuospatial (symmetry span) working memory tasks, and two vocabulary knowledge tests (Vocabulary Level Test and Lexical Test for Advanced Learners of English). The results revealed significantly better speech recognition in the audiovisual condition than in the audio-only condition, but the magnitude of visual benefit was substantially attenuated for sentences that had limited semantic context. The listeners’ vocabulary size in English played a key role in the restoration of missing speech information and audiovisual integration in the perception of interrupted speech. Meanwhile, the listeners’ verbal working memory capacity played an important role in audiovisual integration, especially for the difficult stimuli with limited semantic context.

Code availability

Not applicable

Funding

No funding was received for conducting this study.

Author information

Contributions

All authors contributed to the study conception and design. Material preparation was performed by N.N. and B.M. Data collection and analysis were performed by J.Y. The first draft of the manuscript was written by J.Y., and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Jing Yang.

Ethics declarations

Conflicts of interest

The authors have no relevant financial or nonfinancial interests to disclose.

There are no conflicts of interest, financial, or otherwise.

Ethics approval

Approval was obtained from the ethics committee of the University of Wisconsin-Milwaukee (IRB No. 19.A.194).

Consent to participate

Informed consent was obtained from all individual participants included in the study.

Consent for publication

The authors affirm that human research participants provided informed consent for publication of the de-identified data.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open practices statement

The data generated for this study are available upon reasonable request, and the experiment was not preregistered.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, J., Nagaraj, N.K. & Magimairaj, B.M. Audiovisual perception of interrupted speech by nonnative listeners. Atten Percept Psychophys (2024). https://doi.org/10.3758/s13414-024-02909-3

  • DOI: https://doi.org/10.3758/s13414-024-02909-3
