Search Results
-
Construction of Amharic information retrieval resources and corpora
The development of information retrieval systems and natural language processing tools has been made possible for many natural languages because of...
-
Brazilian Portuguese corpora for teaching and translation: the CoMET project
This paper starts with an overview of corpora available for Brazilian Portuguese to subsequently focus mainly on the CoMET Project developed at the...
-
Syntactic annotation for Portuguese corpora: standards, parsers, and search interfaces
In the last two decades, four Portuguese syntactically annotated corpora were built along the lines initially defined for the Penn Parsed Historical...
-
Spoken Corpora of Slavic Languages
Spoken corpora are collections of transcribed and annotated audio and/or video recordings of languages or language varieties. The aim of this paper...
-
Corpora and Translation Education: Advances and Challenges
This edited book covers a range of topics related to the use of corpora in translation education, including their standing in corpus-based...
-
COVID-19 Corpora
The COVID-19 pandemic has had a profound effect on all aspects of society. As a component of this society, the academic and scientific community is...
-
Investigating Appraisal and the Language of Evaluation in Fake News Corpora
The present corpus study, which is grounded in Appraisal Theory, investigates evaluative language use in fake news in English. The primary aim is to...
-
Investigating interoperable event corpora: limitations of reusability of resources and portability of models
Studies on the applicability of heterogeneous semantically interoperable corpora are rare. We investigate to what extent reusability (both of systems...
-
Building the Leeds Monolingual and Parallel Legal Corpora of Arabic and English Countries’ Constitutions: Methods, Challenges and Solutions
Arabic corpora have existed since the last decade of the past century. Although they are constantly increasing, more advanced tools and...
-
The ParlaMint corpora of parliamentary proceedings
This paper presents the ParlaMint corpora containing transcriptions of the sessions of the 17 European national parliaments with half a billion...
-
Two Sepedi-English code-switched speech corpora
We report on the development of two reference corpora for the analysis of Sepedi-English code-switched speech in the context of automatic speech...
-
Data Acquisition and Other Technical Challenges in Learner Corpora and Translation Learner Corpora
Learner corpora and translation corpora are attracting more and more attention from the language education community as well as theorists of second...
-
Strategies for the Analysis of Large Social Media Corpora: Sampling and Keyword Extraction Methods
In the context of the COVID-19 pandemic, social media platforms such as Twitter have been of great importance for users to exchange news, ideas, and...
-
Style: Text, Cognition and Corpora
In this chapter, we discuss the notion of style and the analytical approaches used by stylisticians to shed light on how style is created in texts....
-
Making Sense of Large Social Media Corpora: Keywords, Topics, Sentiment, and Hashtags in the Coronavirus Twitter Corpus
This open access book offers a comprehensive overview of available techniques and approaches to explore large social media corpora, using as an...
-
Corpus tools for parallel corpora of theatre plays: an introduction to TAligner and ACM-theatre
Software tools are of vital importance in corpus-based research, but they can also lead to restrictions on the type of supported corpora and the...
-
Develop corpora and methods for cross-lingual text reuse detection for English Urdu language pair at lexical, syntactical, and phrasal levels
In recent years, Cross-Lingual Text Reuse Detection (CLTRD) has attracted the attention of the research community because large digital repositories...
-
Exploring the Leipzig Corpora Collection in the LSP Classroom: A Data-Driven Approach
The collection and analysis of large text corpora in different languages assume a fundamental role in an increasingly complex globalized world,...
-
Corpora compilation for prosody-informed speech processing
Research on speech technologies necessitates spoken data, which is usually obtained through read recorded speech, and specifically adapted to the...