Writing up a Corpus-Linguistic Paper

A Practical Handbook of Corpus Linguistics

Abstract

In this chapter, we briefly characterize what we consider the best and most common structure for empirical corpus-linguistic papers. We first introduce the four major parts of a corpus-linguistic paper: “Introduction”, “Methods”, “Results”, and “Discussion”. Because the nature of corpus data and corpus techniques makes two of these sections very field-specific, we then focus on the “Methods” and “Results” sections of a typical quantitative corpus-linguistic paper. Our recommendations span the research cycle, from describing the data to analyzing the dataset and reporting the results of statistical tests.

Notes

  1. This is also a means of bringing credit and recognition to all those involved in corpus compilation.

  2. See Gries (in press) for more information about how to carry out the tasks of retrieval and annotation discussed above.

References

  • American Psychological Association. (2010). Publication manual of the American Psychological Association (6th ed.). Washington, DC: American Psychological Association.

  • Berez-Kroeker, A., Gawne, L., Kung, S., et al. (2017). Reproducible research in linguistics: A position statement on data citation and attribution in our field. Linguistics, 56(1), 1–18.

  • BNC Consortium. (2001). The British National Corpus, version 2 (BNC World). Distributed by Oxford University Computing Services on behalf of the BNC Consortium. http://www.natcorp.ox.ac.uk/. Accessed 30 August 2019.

  • Branco, A., Cohen, K. B., Vossen, P., Ide, N., & Calzolari, N. (2017). Replicability and reproducibility of research results for human language technology: Introducing an LRE special section. Language Resources and Evaluation, 51(1), 1–5.

  • Cleveland, W., & McGill, R. (1985). Graphical perception and graphical methods for analyzing scientific data. Science, 229(4716), 828–833.

  • Fox, J. (2003). Effect displays in R for generalised linear models. Journal of Statistical Software, 8(15), 1–27.

  • Fox, J., & Hong, J. (2009). Effect displays in R for multinomial and proportional-odds logit models: Extensions to the effects package. Journal of Statistical Software, 32(1), 1–24.

  • Fuoli, M., & Hommerberg, C. (2015). Optimising transparency, reliability and replicability: Annotation principles and inter-coder agreement in the quantification of evaluation expressions. Corpora, 10(3), 315–349.

  • Gries, S. Th. (2013). Statistics for linguistics with R (2nd rev. & ext. ed.). Boston/New York: De Gruyter Mouton.

  • Gries, S. Th. (2016a). Variationist analysis: Variability due to random effects and autocorrelation. In P. Baker & J. A. Egbert (Eds.), Triangulating methodological approaches in corpus linguistic research (pp. 108–123). New York: Routledge, Taylor and Francis.

  • Gries, S. Th. (2016b). Quantitative corpus linguistics with R (2nd rev. & ext. ed.). New York & London: Routledge, Taylor & Francis Group.

  • Gries, S. Th. (in press). Managing synchronic corpus data with the British National Corpus (BNC). In A. L. Berez-Kroeker, B. McDonnell, E. Koller, & L. Collister (Eds.), MIT open handbook of linguistic data management. Cambridge, MA: The MIT Press.

  • Kuhn, M., & Johnson, K. (2013). Applied predictive modeling. Berlin/New York: Springer.

  • Loewen, S., & Plonsky, L. (2015). An A-Z of applied linguistics research methods. New York: Palgrave.

  • Marsden, E., Mackey, A., & Plonsky, L. (2016). The IRIS repository: Advancing research practice and methodology. In A. Mackey & E. Marsden (Eds.), Advancing methodology and practice: The IRIS repository of instruments for research into second languages (pp. 1–21). New York: Routledge.

  • Paquot, M., & Plonsky, L. (2017). Quantitative research methods and study quality in learner corpus research. International Journal of Learner Corpus Research, 3(1), 61–94.

  • Plonsky, L. (2013). Study quality in SLA: An assessment of designs, analyses, and reporting practices in quantitative L2 research. Studies in Second Language Acquisition, 35(4), 655–687.

  • Porte, G. (2012). Replication research in applied linguistics. Cambridge: Cambridge University Press.

  • Schmid, H. (1994). Probabilistic part-of-speech tagging using decision trees. Proceedings of international conference on new methods in language processing, Manchester, UK.

  • Spooren, W., & Degand, L. (2010). Coding coherence relations: Reliability and validity. Corpus Linguistics and Linguistic Theory, 6(2), 241–266.

  • Tufte, E. (2001). The visual display of quantitative information (2nd ed.). Cheshire, CT: Graphics Press.

  • Wilkinson, L., & The Task Force on Statistical Inference. (1999). Statistical methods in psychology journals. American Psychologist, 54(8), 594–604.

  • Wulff, S., Gries, S. Th., & Lester, N. A. (2018). Optional that in complementation by German and Spanish learners: Where and how German and Spanish learners differ from native speakers. In A. Tyler, L. Huan, & H. Jan (Eds.), What does applied cognitive linguistics look like? Answers from the L2 classroom and SLA studies (pp. 97–118). Berlin & Boston: De Gruyter Mouton.

  • Zuur, A. F., Ieno, E. N., & Elphick, C. S. (2010). A protocol for data exploration to avoid common statistical problems. Methods in Ecology and Evolution, 1(1), 3–14.

Author information

Corresponding author

Correspondence to Stefan Th. Gries.

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this chapter

Gries, S.T., Paquot, M. (2020). Writing up a Corpus-Linguistic Paper. In: Paquot, M., Gries, S.T. (eds) A Practical Handbook of Corpus Linguistics. Springer, Cham. https://doi.org/10.1007/978-3-030-46216-1_26
