Multilingual Information Access in South Asian Languages
Second International Workshop, FIRE 2010, Gandhinagar, India, February 19-21, 2010 and Third International Workshop, FIRE 2011, Bombay, India, December 2-4, 2011, Revised Selected Papers
Article
Interlingua and transfer-based approaches to machine translation have long been in use in competing and complementary ways. The former proves economical in situations where translation among multiple languages ...
Chapter and Conference Paper
Measuring semantic nearness of documents is important for accurate information retrieval, automated text categorization and classification. Inspired by the observation that text documents contain semantically coh...
Article
Wordnets, which are repositories of lexical semantic knowledge containing semantically linked synsets and lexically linked words, are indispensable for work on computational linguistics and natural language pr...
Chapter and Conference Paper
In this paper, we present our Hindi to English and Marathi to English CLIR systems developed as part of our participation in the CLEF 2007 Ad-Hoc Bilingual task. We take a query translation based approach usin...
Chapter and Conference Paper
It is well known that pseudo-relevance feedback (PRF) improves the retrieval performance of Information Retrieval (IR) systems in general. However, a recent study by Cao et al. [3] has shown that a non-negligib...
Chapter and Conference Paper
This paper aims to present a way of storing Sanskrit Verbal roots in a proposed Sanskrit WordNet. The synsets of verbal roots are proposed to be created using all the available dhātupāṭhas. While doing so, it ...
Chapter and Conference Paper
Glosses and examples are essential components of computational lexical databases such as WordNet. These two components of the lexical database can be used in building domain ontologies, semantic relation...
Chapter and Conference Paper
Stemming is considered crucial in many NLP and IR applications. In the absence of any linguistic information, stemming is a challenging task. Stemming of words using suffixes of a language as linguistic inform...
Chapter and Conference Paper
In this paper, we present a novel approach to identify feature specific expressions of opinion in product reviews with different features and mixed emotions. The objective is realized by identifying a set of pote...
Chapter and Conference Paper
This paper describes a weakly supervised system for sentiment analysis in the movie review domain. The objective is to classify a movie review into a polarity class, positive or negative, based on those sentences...
Chapter and Conference Paper
Recently there has been a lot of interest in Cross Language Sentiment Analysis (CLSA) using Machine Translation (MT) to facilitate Sentiment Analysis in resource deprived languages. The idea is to use the anno...
Chapter
Languages of the world, though different, share structures and vocabulary. Today’s NLP depends crucially on annotation which, however, is costly, needing expertise, money and time. Most languages in the world ...
Chapter
Gujarati WordNet is built from the Hindi WordNet using the expansion approach. This paper presents experiences of building Gujarati WordNet. Various crucial issues relating to synset generation and linkage as ...
Chapter and Conference Paper
This paper reports on the development of an automatic clustering technique which builds upon the search capability of a self-organizing multi-objective differential evolutionary approach. The algorit...
Chapter
Marathi is the language spoken primarily by the native people of Maharashtra, a state in the Indian subcontinent. There are about 90 million people who speak Marathi worldwide. It is the oldest of the Indo-Aryan r...
Chapter
India is a multilingual country where machine translation and cross-lingual search are highly relevant problems. These problems require large resources—such as WordNets and lexicons—of high quality and coverag...
Chapter
Sentiment lexicons and datasets represent the knowledge base that lies at the foundation of a SA system. In its simplest form, a sentiment lexicon is a repository of words/phrases labelled with sentiment. Simi...
Chapter
In a multilingual country such as India, machine translation and cross-lingual search are highly relevant problems. The WordNets, as crucial linguistic resources, play the most dominant role in the field of tex...