Abstract
The present study examined the influence of visual cues on the audiovisual perception of interrupted speech by nonnative English listeners and sought to identify the roles of working memory, long-term memory retrieval, and vocabulary knowledge in that perception. The participants included 31 Mandarin-speaking English learners between 19 and 41 years of age. The perceptual stimuli were noise-filled, periodically interrupted AzBio and QuickSIN sentences presented with or without visual cues that showed a male speaker uttering the sentences. In addition to sentence recognition, the listeners completed a semantic fluency task, verbal (operation span) and visuospatial (symmetry span) working memory tasks, and two vocabulary knowledge tests (Vocabulary Level Test and Lexical Test for Advanced Learners of English). The results revealed significantly better speech recognition in the audiovisual condition than in the audio-only condition, but the magnitude of the visual benefit was substantially attenuated for sentences that had limited semantic context. The listeners’ vocabulary size in English played a key role in the restoration of missing speech information and in audiovisual integration during the perception of interrupted speech. Meanwhile, the listeners’ verbal working memory capacity played an important role in audiovisual integration, especially for the difficult stimuli with limited semantic context.
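To make the phrase "noise-filled periodically interrupted" concrete, the following is a minimal sketch of how such a stimulus could be constructed. It is illustrative only and is not the authors' procedure: the function name `interrupt_with_noise`, the interruption rate, the duty cycle, the Gaussian noise, and the RMS-matched noise level are all assumptions, since the abstract does not report these settings.

```python
import math
import random


def interrupt_with_noise(signal, sample_rate, rate_hz=2.5, duty_cycle=0.5, seed=0):
    """Gate `signal` on and off periodically and fill the gaps with noise.

    `rate_hz` and `duty_cycle` are hypothetical defaults chosen for
    illustration; they are not taken from the study.
    """
    if not signal:
        return []
    rng = random.Random(seed)
    period = sample_rate / rate_hz   # samples per on/off cycle
    on_len = period * duty_cycle     # samples of speech kept in each cycle
    # Assumption: Gaussian noise whose standard deviation matches the signal RMS.
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    out = []
    for i, x in enumerate(signal):
        if (i % period) < on_len:
            out.append(x)                     # speech-on portion is kept
        else:
            out.append(rng.gauss(0.0, rms))   # silent gap is replaced by noise
    return out


# Usage: a one-second 440 Hz tone stands in for a recorded sentence.
tone = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
gated = interrupt_with_noise(tone, sample_rate=16000)
```

The output has the same length as the input, so the gated audio stays time-aligned with any accompanying video of the speaker.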
Code availability
Not applicable
Funding
No funding was received for conducting this study.
Author information
Contributions
All authors contributed to the study conception and design. Material preparation was performed by N.N. and B.M. Data collection and analysis were performed by J.Y. The first draft of the manuscript was written by J.Y., and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflicts of interest
The authors have no relevant financial or nonfinancial interests to disclose.
There are no conflicts of interest, financial or otherwise.
Ethics approval
Approval was obtained from the ethics committee of the University of Wisconsin-Milwaukee (IRB No. 19.A.194).
Consent to participate
Informed consent was obtained from all individual participants included in the study.
Consent for publication
The authors affirm that human research participants provided informed consent for publication of the de-identified data.
Additional information
Open practices statement
The data generated for this study are available upon reasonable request, and the experiment was not preregistered.
About this article
Cite this article
Yang, J., Nagaraj, N.K. & Magimairaj, B.M. Audiovisual perception of interrupted speech by nonnative listeners. Atten Percept Psychophys (2024). https://doi.org/10.3758/s13414-024-02909-3
DOI: https://doi.org/10.3758/s13414-024-02909-3