Search Results
-
Social media emotions annotation guide (SMEmo): Development and initial validity
The proper measurement of emotion is vital to understanding the relationship between emotional expression in social media and other factors, such as...
-
Towards the development of an automated robotic storyteller: comparing approaches for emotional story annotation for non-verbal expression via body language
Storytelling is a long-established tradition and listening to stories is still a popular leisure activity. Caused by technization, storytelling media...
-
Overview of Linguistic Information
This chapter provides a brief overview of linguistic information, and its relevance to clinical text processing. The contents of this chapter will be...
-
Corpus Construction for Historical Newspapers: A Case Study on Public Meeting Corpus Construction Using OCR Error Correction
Large text corpora are indispensable for natural language processing. However, in various fields such as literature and humanities, many documents to...
-
Register identification from the unrestricted open Web using the Corpus of Online Registers of English
This article examines the automatic identification of Web registers, that is, text varieties such as news articles and reviews. Most studies have...
-
The interactive Leipzig Corpus Miner: An extensible and adaptable text analysis tool for content analysis
We present the interactive Leipzig Corpus Miner (iLCM), which is the result of the development of an integrated research environment for the analysis...
-
Corpus-Based Analysis of Lexical Features of Mongolian Language Policy Text
Like other policy texts, language policy texts also need policy text analysis. Leveraging a corpus of 100 policy documents, this study investigates...
-
Introduction: Researching Corpus Pragmatics in Irish English
This Introduction to the Special Issue describes the research background to Irish English Corpus Pragmatics. It also gives a brief overview of the...
-
Corpus tools for parallel corpora of theatre plays: an introduction to TAligner and ACM-theatre
Software tools are of vital importance in corpus-based research, but they can also lead to restrictions on the type of supported corpora and the...
-
A Chinese Dialogue Corpus Annotated with Dialogue Act
This chapter will introduce a Chinese dialogue corpus with annotated dialogue acts and users’ intent, which contains 5026 multi-turn and multiplayer...
-
I beg to differ: how disagreement is handled in the annotation of legal machine learning data sets
Legal documents, like contracts or laws, are subject to interpretation. Different people can have different interpretations of the very same...
-
Hong Kong Corpus of Chinese Sentence and Passage Reading
Recent years have witnessed a mushrooming of reading corpora that have been built by means of eye tracking. This article showcases the Hong Kong...
-
Building translator-oriented English-Arabic physics glossary from domain corpus
Recent growth in scientific web-content makes it easier for translators to get the most popular equivalent of a scientific term. However, this...
-
Cognitive and social well-being in older adulthood: The CoSoWELL corpus of written life stories
This paper presents the Cognitive and Social WELL-being (CoSoWELL) project that consists of two components. One is a large corpus of narratives...
-
Emergent Pragmatic Conventions in Spoken ELF Corpus Data: Micro-Diachronic Analysis of Inclusive vs. Exclusive Multilingual Practices
This article examines multilingual practices as an example of emergent pragmatic conventions in three Transient International Groups (TIGs) using...
-
Automatic consistency assurance for literature-based gene ontology annotation
Background: Literature-based gene ontology (GO) annotation is a process where expert curators use uniform expressions to describe gene functions...
-
Elaboration of a new framework for fine-grained epidemiological annotation
Event-based surveillance (EBS) gathers information from a variety of data sources, including online news articles. Unlike the data from formal...
-
of Multimodal English Corpus Based on Functional Software
Multimode nursing English database is established based on audio-visual data, which can reflect the comprehensive application of nursing English. It...
-
Study of Chinese Words in Diachronic Corpus of Newspaper
This research examines the vocabulary used in Chinese newspapers using a diachronic corpus spanning 77 years, from 1872 to 1949. The Zipfian...
-
Identifying stance in legislative discourse: a corpus-driven study of data protection laws
Mirroring public ideologies and value systems in legislative discourse, stance not only functions as a powerful instrument for legislators to balance...