A Semi-automatic Error Retrieval Method for Uncovering Collocation Errors from a Large Learner Corpus

Exploring Collocation Errors in a Large Learner Corpus with a Semi-automatic Retrieval Method

  • Original Paper
  • Published in English Teaching & Learning

Abstract

Previous studies on ESL/EFL learners’ verb-noun (V-N) miscollocations have shed some light on common miscollocation types and possible causes. However, barriers to further understanding of learners’ difficulties still exist, such as the limited amount of learner data generated from small corpora and the labor-intensive process of manually retrieving collocational errors. To provide researchers with a more efficient retrieval method, this study proposed the use of the Sketch-Diff function in the Sketch Engine (SKE) platform to semi-automatically retrieve collocation errors in large learner corpora. To test the feasibility of this semi-automatic retrieval method, a 7.4-million-word EFL learner corpus was investigated with Sketch-Diff, and 4541 tokens of common miscollocations were identified. Analysis of these miscollocations revealed that most errors were verb-based and often caused by negative transfer from the learners’ L1, undergeneralization (e.g., ignorance of L2 syntactic rules), and approximation (e.g., the misuse of near-synonyms, hyper-/hyponyms, antonyms, and lexemes with similar sound/form). This study demonstrates that using Sketch-Diff to retrieve V-N miscollocations from a large learner corpus is both feasible and efficient. This method can be applied to other languages to further deepen our understanding of L2 learners’ difficulties in collocation acquisition.

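Chinese Abstract (English translation)

Previous studies of verb-noun (V-N) miscollocations produced by ESL/EFL learners have revealed common error types and their possible causes. However, such studies have mostly relied on labor-intensive collocation retrieval methods to examine errors in small corpora, which makes it difficult for researchers to apply these methods to large corpora. To provide researchers with a more efficient method for retrieving collocation errors, this paper advocates using the Sketch-Diff function of the Sketch Engine corpus platform to retrieve collocation errors semi-automatically from large learner corpora. To test the feasibility of this semi-automatic retrieval method, Sketch-Diff was used to examine miscollocations in a 7.4-million-word learner corpus, and 4,541 tokens of common V-N miscollocations were identified. The analysis indicates that most errors involved misuse of the verb and were mainly caused by negative transfer from the learners’ L1, undergeneralization (e.g., ignorance of L2 syntactic rules), and approximation (e.g., misuse of near-synonyms, hyper-/hyponyms, antonyms, and words similar in form or sound). The results show that Sketch-Diff can effectively retrieve V-N collocation errors from a large learner corpus in a semi-automatic way, and the paper recommends applying this method to other languages to further deepen our understanding of ESL/EFL learners’ difficulties in collocation acquisition.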

Notes

  1. The formula of logDice score and its features are presented in Rychlý [35].

  2. In the context of World Englishes (WE), some people might question the suitability of using native speakers as the norm for deciding the acceptability of V-N collocations produced by language learners. Some might argue that learners’ atypical collocations (i.e., collocations that are rarely or never used by native speakers) could still be considered acceptable as long as their meanings are comprehensible. In the context of English writing, however, this might not be the case. As pointed out by Matsuda and Matsuda [28], many teachers apply a stricter standard to students’ written production and tend to assign lower scores to writing assignments with language features that deviate from the native norm. Students themselves also expect more corrective markings from their teachers on their written texts. Since the current study aims to uncover English learners’ collocation use in writing, we argue that deviant collocations identified through comparison with native speakers’ written texts should be considered problematic expressions that need to be addressed for the better teaching and learning of English writing.
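
Note 1 refers to the logDice association score. As a minimal sketch, assuming the commonly cited definition logDice = 14 + log2(2·f_xy / (f_x + f_y)) from Rychlý [35], the score could be computed from raw frequencies as follows. The function name and the toy counts are hypothetical and chosen only for illustration; they are not taken from the study's data or from the Sketch Engine software.

```python
import math


def log_dice(f_xy: int, f_x: int, f_y: int) -> float:
    """Return the logDice score, assuming 14 + log2(2 * f_xy / (f_x + f_y)).

    f_xy: frequency of the verb and noun co-occurring in the V-N relation
    f_x:  frequency of the verb in that relation
    f_y:  frequency of the noun in that relation
    """
    if f_xy <= 0:
        raise ValueError("f_xy must be positive; the score is undefined otherwise")
    if f_xy > min(f_x, f_y):
        raise ValueError("a pair cannot be more frequent than either of its words")
    return 14 + math.log2(2 * f_xy / (f_x + f_y))


# Hypothetical toy counts (not from the paper): 2 * 50 / 800 = 0.125 = 2**-3
example_score = log_dice(f_xy=50, f_x=400, f_y=400)  # 14 - 3 = 11.0

# The theoretical maximum of 14 is reached when the two words always co-occur.
max_score = log_dice(f_xy=10, f_x=10, f_y=10)  # 14 + log2(1) = 14.0
```

Because the score is built from relative rather than absolute frequencies, it is generally described as comparable across corpora of different sizes, which is the property a learner-versus-reference comparison would rely on.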

References

  1. Aitchison, J. (1987). Words in the mind. Hoboken: Blackwell Publishing.

  2. al-Hassnawi, H. (2017). Analysis of collocational errors in selected Iraqi published papers. Adab Al-Kufa, 31, 9–24.

  3. Altenberg, B. (1993). Recurrent verb-complement constructions in the London-Lund Corpus. English language corpora: design, analysis and exploitation, 227–245.

  4. Bahns, J., & Eldaw, M. (1993). Should we teach EFL students collocations? System, 21(1), 101–114.

  5. Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Longman grammar of spoken and written English. London: Longman.

  6. Brown, D. F. (1974). Advanced vocabulary teaching: the problem of collocation. RELC Journal, 5(2), 1–11.

  7. Channell, J. (1981). Applying semantic theory to vocabulary teaching. English Language Teaching Journal, 35(2), 115–122.

  8. Chen, M. H. (2008). A study of English collocation competence of college students in Taiwan. Master’s thesis, National Taiwan University of Science and Technology, Taiwan.

  9. Chen, H. J. H. (2011). Developing and evaluating a web-based collocation retrieval tool for EFL students and teachers. Computer assisted language learning, 24(1), 59–76.

  10. Chorbwhan, R., & McLellan, J. (2016). First language transfer and the acquisition of English collocations by Thai learners. Southeast Asia: A Multidisciplinary Journal, 16, 16–27.

  11. Nattinger, J. R., & DeCarrico, J. S. (1992). Lexical phrases and language teaching. Oxford: Oxford University Press.

  12. Ellis, N. C. (1996). Sequencing in SLA. Studies in second language acquisition, 18(1), 91–126.

  13. Fan, M. (2009). An exploratory study of collocational use by ESL students–a task based approach. System, 37(1), 110–123.

  14. Gitsaki, C. (1999). Second language lexical acquisition: a study of the development of collocational knowledge. Maryland: International Scholars Publications.

  15. Granger, S. (1998). Prefabricated patterns in advanced EFL writing: collocations and lexical phrases. In A. P. Cowie (Ed.), Phraseology: theory, analysis and applications (pp. 145–160). Oxford: OUP.

  16. Hong, A. L., Rahim, H. A., Hua, T. K., & Salehuddin, K. (2011). Collocations in Malaysian English learners’ writing: a corpus-based error analysis. 3L: language, linguistics, literature®, 17(SI), 31–44.

  17. James, C. (1998). Errors in language learning and use: exploring error analysis. Abingdon: Routledge.

  18. Juknevičienė, R. (2008). Collocations with high-frequency verbs in learner English: Lithuanian learners vs native speakers. Kalbotyra, 59(3), 119–127.

  19. Laufer, B., & Waldman, T. (2011). Verb-noun collocations in second language writing: a corpus analysis of learners’ English. Language learning, 61(2), 647–672.

  20. Lewis, M. (2000). Teaching collocation: further development in the lexical approach. London: Language Teaching Publications.

  21. Li, C. C. (2005). A study of collocational error types in ESL/EFL college learners’ writing. Master’s thesis, Ming Chuan University, Taiwan.

  22. Lien, H. Y. (2003). The effects of collocation instruction on the reading comprehension of Taiwanese college students. PhD dissertation, Indiana University of Pennsylvania, US.

  23. Liu, C. P. (1999). An analysis of collocational errors in EFL writings. In The proceedings of the eighth international symposium on English teaching (pp. 483–494). Taipei: Crane Publishing Co., Ltd.

  24. Liu, L. E. (2002). A corpus-based semantic investigation of verb-noun miscollocations in Taiwan learners’ English. Master’s thesis, Tamkang University, Taiwan.

  25. Manning, C. D., & Schütze, H. (1999). Foundations of statistical natural language processing. Cambridge: MIT Press.

  26. Marco, M. J. L. (2011). Exploring atypical verb+noun combinations in learner technical writing. International Journal of English Studies, 11(2), 77–95.

  27. Martinez, R., & Schmitt, N. (2012). A phrasal expressions list. Applied Linguistics, 33(3), 299–320.

  28. Matsuda, A., & Matsuda, P. K. (2010). World Englishes and the teaching of writing. Tesol Quarterly, 44(2), 369–374.

  29. Men, H. (2016). Vocabulary increase and collocation learning. Shanghai: Shanghai Jiao Tong University Press.

  30. Nesselhauf, N. (2003). The use of collocations by advanced learners of English and some implications for teaching. Applied Linguistics, 24(2), 223–242.

  31. Nesselhauf, N. (2005). Collocations in a learner corpus. Amsterdam: John Benjamins.

  32. Nguyen, T. M. H., & Webb, S. (2017). Examining second language receptive knowledge of collocation and factors that affect learning. Language Teaching Research, 21(3), 298–320.

  33. Paquot, M., & Granger, S. (2012). Formulaic language in learner corpora. Annual Review of Applied Linguistics, 32, 130–149.

  34. Pawley, A., & Syder, F. H. (1983). Two puzzles for linguistic theory: nativelike selection and nativelike fluency. In J. C. Richards & R. W. Schmidt (Eds.), Language and communication (pp. 191–225). London: Longman.

  35. Rychlý, P. (2008). A lexicographer-friendly association score. In RASLAN (pp. 6–9).

  36. Schmitt, N. (2010). Researching vocabulary: a vocabulary research manual. Basingstoke: Palgrave Macmillan.

  37. Sinclair, J. (1991). Corpus, concordance, collocation. Hong Kong: Oxford University Press.

  38. Stenson, N. (1983). Induced errors. In B. W. Robinett & J. Schachter (Eds.), Second language learning: contrastive analysis, error analysis and related aspects (pp. 256–271). Ann Arbor: University of Michigan Press.

  39. Van Rooy, B., & Schäfer, L. (2002). The effect of learner errors on POS tag errors during automatic POS tagging. Southern African linguistics and applied language studies, 20(4), 325–335.

  40. Wang, C. J. (2001). A study of the English collocational competence of English majors in Taiwan. Master’s thesis, Fu Jen Catholic University, Taiwan.

  41. Wang, Y., & Shaw, P. (2008). Transfer and universality: collocation use in advanced Chinese and Swedish learner English. ICAME journal, 32, 201–232.

  42. Woolard, G. (2000). Collocation-encouraging learner independence. In M. Lewis (Ed.), Teaching collocation: further developments in the lexical approach (pp. 28–46). Hove: Language Teaching Publications.

  43. Wray, A. (2002). Formulaic language and the lexicon. Cambridge: Cambridge University Press.

  44. Wray, A. (2008). Formulaic language: pushing the boundaries. Oxford: Oxford University Press.

  45. Wu, W. S. (1996). Lexical collocations: one way to make passive vocabulary active. In The Proceedings of the 11th Conference on English Teaching and Learning in the Republic of China (pp. 461–480).

  46. Zhang, X. (2017). Effects of receptive-productive integration tasks and prior knowledge of component words on L2 collocation development. System, 66, 156–167.

  47. Zhang, Y., & Gao, Y. (2006). A CLEC-based study of collocation acquisition by Chinese English language learners. CELEA Journal, 29(4), 28–35.

  48. Zhang, W. Z., & Yang, S. (2009). An analysis of V-N collocation errors in CLEC. Journal of PLA University of Foreign Languages, 32(2), 39–44.

Funding

This work was financially supported by the “Chinese Language and Technology Center” of National Taiwan Normal University (NTNU) from The Featured Areas Research Center Program within the framework of the Higher Education Sprout Project by the Ministry of Education (MOE) in Taiwan.

Author information

Corresponding author

Correspondence to Howard Hao-Jan Chen.

About this article

Cite this article

Yang, C.TY., Chen, H.HJ., Liu, CY. et al. A Semi-automatic Error Retrieval Method for Uncovering Collocation Errors from a Large Learner Corpus. English Teaching & Learning 44, 1–19 (2020). https://doi.org/10.1007/s42321-019-00037-y

  • DOI: https://doi.org/10.1007/s42321-019-00037-y
