Search Results - Springer


Reference Work Entry At a glance

Correction to: Language, Script, and Font Recognition

Owing to an unfortunate oversight, the second author, Niladri Sekhar Dash, was missing from the initially published HTML version of this chapter. He has now been added.

Umapada Pal, Niladri Sekhar Dash in Handbook of Document Image Processing and Recognition (2014)

Reference Work Entry In depth

Language, Script, and Font Recognition

Automatic identification of a language within a text document containing multiple scripts and fonts is a challenging task, as it is not only linked with the shape, size, and style of the characters and symbols...

Umapada Pal, Niladri Sekhar Dash in Handbook of Document Image Processing and Recognition (2014)
Chapter and Conference Paper

A System for Recognition of Named Entities in Odia Text Corpus Using Machine Learning Algorithm

This paper presents a novel approach to recognize named entities in Odia corpus. The development of a NER system for Odia using Support Vector Machine is a challenging task in intelligent computing. NER aims a...

Bishwa Ranjan Das, Srikanta Patnaik… in Computational Intelligence in Data Mining … (2015)
Chapter and Conference Paper

Development of Odia Language Corpus from Modern News Paper Texts: Some Problems and Issues

In this paper, we have tried to describe the details about the strategies and methods we have adapted to design and develop a digital Odia corpus of newspaper texts. We have also attempted to identify the scop...

Bishwa Ranjan Das, Srikanta Patnaik… in Intelligent Computing, Communication and D… (2015)
Book

The WordNet in Indian Languages

Niladri Sekhar Dash, Pushpak Bhattacharyya… (2017)
Chapter

Language-specific Synsets and Challenges in Synset Linkage in Urdu WordNet

The Urdu WordNet is being developed following the process used to develop the Hindi WordNet by using the Expansion Approach. This paper, in the first part, presents some of our experiences that we gathered in ...

Rizwanur Rahman, Mazhar Mehdi Hussain… in The WordNet in Indian Languages (2017)
Chapter

Problems in Translating Hindi Synsets into the Bangla WordNet

In this chapter, I have made an attempt to look into the problems and challenges I have faced in developing the Bangla synsets that will stand as conceptual equivalents for the Hindi synsets used in the IndoWo...

Niladri Sekhar Dash in The WordNet in Indian Languages (2017)
Chapter

Defining Language-Specific Synsets in IndoWordNet: Some Theoretical and Practical Issues

A WordNet is a digital network of semantically linked words, which are organized around the notion of synsets of a language. A synset is a set of synonyms with same part-of-speech (mostly), which are potential...

Niladri Sekhar Dash in The WordNet in Indian Languages (2017)
Chapter and Conference Paper

Application of TF-IDF Feature for Categorizing Documents of Online Bangla Web Text Corpus

This paper explores the use of standard features as well as machine learning approaches for categorizing Bangla text documents of online Web corpus. The TF-IDF feature with dimensionality reduction technique (...

Ankita Dhar, Niladri Sekhar Dash, Kaushik Roy in Intelligent Engineering Informatics (2018)
Chapter and Conference Paper

Categorization of Bangla Web Text Documents Based on TF-IDF-ICF Text Analysis Scheme

With the rapid growth and huge availability of digital text data, automatic text categorization or classification is a comparatively more effective solution in organizing and managing textual information. It i...

Ankita Dhar, Niladri Sekhar Dash, Kaushik Roy in Social Transformation – Digital Way (2018)
Chapter

Features of a Corpus

Defining the characteristic features of a corpus, in general, has been an issue of great debate for decades. Due to diversities involved in the types of text used for corpus generation, identification of featu...

Niladri Sekhar Dash, S. Arulmozi in History, Features, and Typology of Language Corpora (2018)
Chapter

Pre-digital Corpora (Part 2)

Following the footsteps of the previous chapter (Chap. 9), in this chapter, we have presented a short description of the process of corpus generation and utilization in ...

Niladri Sekhar Dash, S. Arulmozi in History, Features, and Typology of Language Corpora (2018)
Chapter

Nature of Data

It is always difficult to define the nature of language data since language texts often possess multiple properties, due to which the nature of a particular text may overlap with that of another. However, sinc...

Niladri Sekhar Dash, S. Arulmozi in History, Features, and Typology of Language Corpora (2018)
Chapter

Digital Text Corpora (Part 2)

The generation of text corpora is not confined to a few widely privileged languages such as English, French, German or Spanish. Many lesser-known and under-privileged languages are also emerging with corpora o...

Niladri Sekhar Dash, S. Arulmozi in History, Features, and Typology of Language Corpora (2018)
Chapter

Nature of Text Application

In this chapter, we have sketched out how language corpora can be classified based on the nature of the application of texts at various domains of linguistics and language technology. We have argued that a ‘pa...

Niladri Sekhar Dash, S. Arulmozi in History, Features, and Typology of Language Corpora (2018)
Chapter

Utilization of Language Corpora

Even after nearly 70 years, the staunch supporters of the generative genre still like to argue that linguistics is a branch of intuition and introspection where corpora, as a showcase of empirical language dat...

Niladri Sekhar Dash, S. Arulmozi in History, Features, and Typology of Language Corpora (2018)
Chapter

Web Text Corpus

The World Wide Web is viewed as a useful linguistic resource since it is a unique linguistic world that is full of surprising linguistic data and information. It is the largest store of texts in existence that...

Niladri Sekhar Dash, S. Arulmozi in History, Features, and Typology of Language Corpora (2018)
Chapter

Definition of ‘Corpus’

Understanding the concept of ‘corpus’ has been one of the challenging issues in corpus linguistics in recent times. Language users are often confused with the concept, and as a result of this, they sometimes c...

Niladri Sekhar Dash, S. Arulmozi in History, Features, and Typology of Language Corpora (2018)
Chapter

Genre of Text

Classification of corpus based on genre is a difficult theoretical exercise which is carried out in this chapter. In this chapter, we have first justified why it is necessary to classify corpora based on certa...

Niladri Sekhar Dash, S. Arulmozi in History, Features, and Typology of Language Corpora (2018)
Chapter

Digital Text Corpora (Part 1)

The history of digital text corpus generation and usage presents an interesting narrative. It shows how technology has brought about a resurgence in the discipline of linguistics, which was otherwise turning i...

Niladri Sekhar Dash, S. Arulmozi in History, Features, and Typology of Language Corpora (2018)

51 Result(s)

