Search Results
-
A study on methods for revising dependency treebanks: in search of gold
Reliably annotated corpora are a valuable resource for Natural Language Processing, which justifies the search for methods...
-
SCTB-V2: the 2nd version of the Chinese treebank in the scientific domain
Word segmentation, part-of-speech (POS) tagging, and syntactic parsing are three fundamental analysis tasks for Chinese language processing,...
-
Treebanking user-generated content: a UD based overview of guidelines, corpora and unified recommendations
This article presents a discussion on the main linguistic phenomena which cause difficulties in the analysis of user-generated texts found on the web...
-
Resources for Turkish dependency parsing: introducing the BOUN Treebank and the BoAT annotation tool
In this paper, we introduce the resources that we developed for Turkish dependency parsing, which include a novel manually annotated treebank (BOUN...
-
Two languages, one treebank: building a Turkish–German code-switching treebank and its challenges
This paper presents the SAGT Turkish–German code-switching treebank, and observations and annotation challenges we encountered during its...
-
Universal Dependencies for Mandarin Chinese
This article presents a Universal Dependency (UD) annotation scheme for Mandarin Chinese, as well as the current UD Chinese HK treebank. Our focus is...
-
Spoken Spanish PoS tagging: gold standard dataset
The development of a benchmark for part-of-speech (PoS) tagging of spoken dialectal European Spanish is presented, which will serve as the foundation...
-
Resources for Turkish natural language processing: A critical survey
This paper presents a comprehensive survey of corpora and lexical resources available for Turkish. We review a broad range of resources, focusing on...
-
From extended chunking to dependency parsing using traditional Arabic grammar
We describe in this paper the adopted approach combining a phrase structure grammar and dependency rules to develop AlkhalilPArser. The general...
-
Development and evaluation of an Urdu treebank (CLE-UTB) and a statistical parser
A number of natural language processing tools for Urdu have been developed in the past few years for word segmentation, part of...
-
Linguistic annotation of Byzantine book epigrams
In this paper, we explore the feasibility of developing a part-of-speech tagger for not-normalised, Byzantine Greek epigrams. Hence, we compared...
-
Arbobanko - A Treebank for Esperanto
In this paper we describe and evaluate Arbobanko, a syntactic treebank for the artificial language Esperanto, as well as methods and tools used to...
-
Cross-Framework Evaluation for Portuguese POS Taggers and Parsers
This work compares POS and parsing systems for the Portuguese language. We analyse available features, tagsets, and compare the results of POS...
-
Training and evaluation of vector models for Galician
This paper presents a large and systematic assessment of distributional models for Galician. To this end, we have first trained and evaluated static...
-
Chinese Language Resources Through One-Third of a Century
This chapter provides a comprehensive overview of the co-development of Chinese language resources and Chinese language processing in the past three...
-
The Construction of a Chinese Semantic Dependency Graph Bank
Semantic dependency parsing is a deep semantic analysis task based on large-scale and canonically annotated corpora. This chapter will present a new...
-
Chinese Language Resources: A Comprehensive Compendium
This chapter will present a collective effort to compile a comprehensive repository of accessible Chinese language resources that can be used online,...
-
A Semi-supervised Approach for Chinese Noun Phrase Chunking
This chapter addresses Chinese noun phrase chunking with special reference to nominalizations based on a semi-supervised approach. It uses YamCha, a...