Search Page | SpringerLink

Investigating interoperable event corpora: limitations of reusability of resources and portability of models

Studies on the applicability of heterogeneous semantically interoperable corpora are rare. We investigate to what extent reusability (both of systems...

Tommaso Caselli, Johan Bos in Language Resources and Evaluation

Article Open access 26 February 2023

Multiple annotation for biodiversity: developing an annotation framework among biology, linguistics and text technology

Biodiversity information is contained in countless digitized and unprocessed scholarly texts. Although automated extraction of these data has been...

Andy Lücking, Christine Driller, ... Alexander Mehler in Language Resources and Evaluation

Article Open access 04 August 2021

The CLARIN infrastructure as an interoperable language technology platform for SSH and beyond

CLARIN is a European Research Infrastructure Consortium developing and providing a federated and interoperable platform to support scientists in the...

A. Branco, M. Eskevich, ... C. Zinn in Language Resources and Evaluation

Article Open access 12 June 2023

Beyond lexical frequencies: using R for text analysis in the digital humanities

This paper presents a combination of R packages—user contributed toolkits written in a common core programming language—to facilitate the humanistic...

Taylor Arnold, Nicolas Ballier, ... Lauren Tilton in Language Resources and Evaluation

Article 08 April 2019

TEI-friendly annotation scheme for medieval named entities: a case on a Spanish medieval corpus

Medieval documents are a rich source of historical data. Performing named-entity recognition (NER) on this genre of texts can provide us with...

Elena Álvarez-Mellado, María Luisa Díez-Platas, ... Elena González-Blanco in Language Resources and Evaluation

Article Open access 27 February 2021

From Original Sources to Linguistic Analysis: Tools and Datasets for the Investigation of Multilingualism in Medieval English

This chapter presents an outline of some of the different types of digital datasets and tools that are currently available to help researchers in the...

Carola Trips, Peter A. Stokes in Medieval English in a Multilingual Context

Chapter 2023

Entity normalization in a Spanish medical corpus using a UMLS-based lexicon: findings and limitations

Entity normalization is a common strategy to resolve ambiguities by mapping all the synonym mentions to a single concept identifier in standard...

Pablo Báez, Leonardo Campillos-Llanos, ... Jocelyn Dunstan in Language Resources and Evaluation

Article 02 July 2024

A flexible tool for a qualia-enriched FrameNet: the FrameNet Brasil WebTool

In this paper we present a database management and annotation tool for running an enriched FrameNet database, the FrameNet Brasil WebTool. We...

Tiago Timponi Torrent, Ely Edison da Silva Matos, ... Vanessa Maria Ramos Lopes Paiva in Language Resources and Evaluation

Article 22 January 2024

Evaluating the FAIRness of Scientific Data Repositories

Evaluation of FAIRness of scientific data repositories is a growing concern. FAIRness means making data compatible with FAIR data principles (M. D....

Paulo V. C. Amaral, Frederico Alan de Oliveira Cruz, Sérgio Manuel Serra da Cruz in Digital Humanities Looking at the World

Chapter 2024

Conducting a Multivocal Systematic Literature Review About Compliance with the Brazilian Law for General Data Protection

A Multivocal Systematic Literature Review (MSLR) is a form of Systematic Literature Review (SLR) that includes gray literature in addition to formal...

Roberta Cláudia de Jesus Bordalo, Hugo do Val F. Fernandes, Mônica Ferreira da Silva in Digital Humanities Looking at the World

Chapter 2024

Chinese Language Resources Through One-Third of a Century

This chapter provides a comprehensive overview of the co-development of Chinese language resources and Chinese language processing in the past three...

Chu-Ren Huang in Chinese Language Resources

Chapter 2023

Treebanking user-generated content: a UD based overview of guidelines, corpora and unified recommendations

This article presents a discussion on the main linguistic phenomena which cause difficulties in the analysis of user-generated texts found on the web...

Manuela Sanguinetti, Cristina Bosco, ... Amir Zeldes in Language Resources and Evaluation

Article Open access 20 February 2022

Multi-layered semantic annotation and the formalisation of annotation schemas for the investigation of modality in a Latin corpus

This paper stems from the project A World of Possibilities. Modal pathways over an extra-long period of time: the diachrony of modality in the Latin...

Helena Bermúdez-Sabel, Francesca Dell’Oro, Paola Marongiu in Language Resources and Evaluation

Article 06 January 2024

The ParlaMint corpora of parliamentary proceedings

This paper presents the ParlaMint corpora containing transcriptions of the sessions of the 17 European national parliaments with half a billion...

Tomaž Erjavec, Maciej Ogrodniczuk, ... Darja Fišer in Language Resources and Evaluation

Article Open access 02 February 2022

A multilingual, multimodal dataset of aggression and bias: the ComMA dataset

In this paper, we discuss the development of a multilingual dataset annotated with a hierarchical, fine-grained tagset marking different types of...

Ritesh Kumar, Shyam Ratan, ... Akanksha Bansal in Language Resources and Evaluation

Article 16 November 2023

LexO: an open-source system for managing OntoLex-Lemon resources

The adoption of Semantic Web technologies and the Linked Data paradigm has been driven by the need to ensure the construction of resources that are...

Andrea Bellandi in Language Resources and Evaluation

Article 27 June 2021

Using a Moodle-Based Digital Escape Room to Train Competent EMI Lecturers and Instructors in a Multilingual Environment

Professional development for teachers in English medium instruction (EMI) universities is challenging in a multilingual and multicultural...

Na Li, Xiaojun Zhang in Multilingual Education Yearbook 2023

Chapter 2023

Finnish parliament ASR corpus

Public sources like parliament meeting recordings and transcripts provide ever-growing material for the training and evaluation of automatic speech...

Anja Virkkunen, Aku Rouhe, ... Mikko Kurimo in Language Resources and Evaluation

Article Open access 27 March 2023

Democratizing neural machine translation with OPUS-MT

This paper presents the OPUS ecosystem with a focus on the development of open machine translation models and tools, and their integration into...

Jörg Tiedemann, Mikko Aulamo, ... Sami Virpioja in Language Resources and Evaluation

Article Open access 13 December 2023

Syntactic annotation for Portuguese corpora: standards, parsers, and search interfaces

In the last two decades, four Portuguese syntactically annotated corpora were built along the lines initially defined for the Penn Parsed Historical...

Pablo Faria, Charlotte Galves, Catarina Magro in Language Resources and Evaluation

Article 26 December 2023
