Search Results
-
Multi-layered semantic annotation and the formalisation of annotation schemas for the investigation of modality in a Latin corpus
This paper stems from the project A World of Possibilities. Modal pathways over an extra-long period of time: the diachrony of modality in the Latin...
-
Annotation of scientific uncertainty using linguistic patterns
Scientific uncertainty is an integral part of the research process and inherent to the construction of new knowledge. In this paper, we investigate...
-
A comprehensive examination of emoji usage in Mexican Spanish WhatsApp corpus: a mixed-methods Linguistic approach
The surge of emojis in computer-mediated communication (CMC) since 2011 presents a significant analytical challenge across various disciplines, such...
-
Linguistic annotation of Byzantine book epigrams
In this paper, we explore the feasibility of developing a part-of-speech tagger for not-normalised, Byzantine Greek epigrams. Hence, we compared...
-
NewsCom-TOX: a corpus of comments on news articles annotated for toxicity in Spanish
In this article, we present the NewsCom-TOX corpus, a new corpus manually annotated for toxicity in Spanish. NewsCom-TOX consists of 4359 comments in...
-
A corpus of Persian literary text
Persian poetry has profoundly affected all periods of Persian literature and the literature of other countries as well. It is a fundamental vehicle...
-
Automatic annotation method of VR speech corpus based on artificial intelligence
With the rapid development of the Internet and artificial intelligence, the demand for data annotation becomes more and more urgent. In order to meet...
-
Temporal Relations at the Sentence and Text Genre Level: The Role of Linguistic Cueing and Non-linguistic Biases—An Annotation Study of a Bilingual Corpus
This study investigates the role of non-linguistic biases in the obligatory (verb tenses) and optional (discourse connectives) linguistic marking for...
-
Slovenian parliamentary corpus siParl
Parliamentary debates represent an essential part of democratic discourse and provide insights into various socio-demographic and linguistic...
-
FinnSentiment: a Finnish social media corpus for sentiment polarity annotation
Sentiment analysis and opinion mining are essential tasks with many prominent application areas, e.g., when researching popular opinions on products...
-
Cross-linguistically consistent semantic and syntactic annotation of child-directed speech
Corpora of child speech and child-directed speech (CDS) have enabled major contributions to the study of child language acquisition, yet semantic...
-
Design and construction of Guayaquil radio speech corpus (CHARG)
The present paper aims to describe the process of creating CHARG—Corpus de Habla Radiofónica de Guayaquil (the Guayaquil Radiophonic Speech Corpus)....
-
Using Semi-automatic Annotation Platform to Create Corpus for Argumentative Zoning
Argumentative Zoning (AZ) is a tool to extract salient information from scientific texts for further Natural Language Processing (NLP) tasks, e.g....
-
A Corpus of Quotation Element Annotation for Chinese Novels: Construction, Extraction and Application
Quotations or dialogues are important for literary works, like novels. In the famous Jin Yong’s novels, about a half of all sentences contain...
-
A morphologically annotated longitudinal corpus of spoken Czech child–adult interactions
The paper presents a longitudinal corpus of transcribed spontaneous child–adult interactions in Czech. It consists of 99,388 tokens in 42,103...
-
The Visual Language Research Corpus (VLRC): an annotated corpus of comics from Asia, Europe, and the United States
The Visual Language Research Corpus (VLRC) is a dataset of annotations of 376 stories from comics from the United States, northwestern Europe, and...
-
Syntactic annotation for Portuguese corpora: standards, parsers, and search interfaces
In the last two decades, four Portuguese syntactically annotated corpora were built along the lines initially defined for the Penn Parsed Historical...
-
Analyzing learner language: the case of the Hebrew Learner Essay Corpus
We present the Hebrew Learner Essay Corpus (HELEECS): an annotated corpus of Hebrew language argumentative essays authored by prospective...
-
The Najdi Arabic Corpus: a new corpus for an underrepresented Arabic dialect
This paper presents a new corpus for a dialect of Arabic spoken in the central region of Saudi Arabia: the Najdi Arabic Corpus. This is the first...
-
Evolving linguistic divergence on polarizing social media
Language change is influenced by many factors, but often starts from synchronic variation, where multiple linguistic patterns or forms coexist, or...