An investigation into feature construction to assist word sense disambiguation

Specia, Lucia; Srinivasan, Ashwin; Joshi, Sachindra; Ramakrishnan, Ganesh; das Graças Volpe Nunes, Maria

doi:10.1007/s10994-009-5114-x

Published: 12 June 2009

Volume 76, pages 109–136, (2009)

Machine Learning

Lucia Specia¹,
Ashwin Srinivasan²,
Sachindra Joshi²,
Ganesh Ramakrishnan³ &
Maria das Graças Volpe Nunes⁴

Abstract

Identifying the correct sense of a word in context is crucial for many tasks in natural language processing (machine translation is an example). State-of-the-art methods for Word Sense Disambiguation (WSD) build models using hand-crafted features that usually capture shallow linguistic information. Complex background knowledge, such as semantic relationships, is typically either not used, or used in a specialised manner, due to the limitations of the feature-based modelling techniques used. On the other hand, empirical results from the use of Inductive Logic Programming (ILP) systems have repeatedly shown that they can use diverse sources of background knowledge when constructing models. In this paper, we investigate whether this ability of ILP systems could be used to improve the predictive accuracy of models for WSD. Specifically, we examine the use of a general-purpose ILP system as a method to construct a set of features using semantic, syntactic and lexical information. This feature-set is then used by a common modelling technique in the field (a support vector machine) to construct a classifier for predicting the sense of a word. In our investigation we examine one-shot and incremental approaches to feature-set construction applied to monolingual and bilingual WSD tasks. The monolingual tasks use 32 verbs and 85 verbs and nouns (in English) from the SENSEVAL-3 and SemEval-2007 benchmarks, while the bilingual WSD task consists of 7 highly ambiguous verbs in translating from English to Portuguese. The results are encouraging: the ILP-assisted models show substantial improvements over those that simply use shallow features. In addition, incremental feature-set construction appears to identify smaller and better sets of features. Taken together, the results suggest that the use of ILP with diverse sources of background knowledge provides a way of making substantial progress in the field of WSD.
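
To make the idea of ILP-constructed features concrete, the following is a minimal sketch of how a set of clauses could be turned into a Boolean feature vector for a conventional classifier. It is an illustration only, not the system used in the paper: the predicate names (`has_rel`, `has_pos`, `has_bag`), the example facts and the three clauses are all hypothetical, and the clauses are written by hand here rather than found by an ILP search.

```python
from typing import Callable, List, Set, Tuple

# One example = the set of ground facts known about one occurrence of the
# target word. All facts and predicate names below are hypothetical.
Fact = Tuple[str, ...]
Example = Set[Fact]
Feature = Callable[[Example], bool]


def has_fact(*fact: str) -> Feature:
    """Literal that holds when the given ground fact is present in the example."""
    return lambda example: fact in example


def conjunction(*literals: Feature) -> Feature:
    """Clause body: holds only when every literal holds for the example."""
    return lambda example: all(literal(example) for literal in literals)


def propositionalise(examples: List[Example], features: List[Feature]) -> List[List[int]]:
    """Map each example to a 0/1 vector with one position per clause."""
    return [[1 if feature(ex) else 0 for feature in features] for ex in examples]


# Hand-written stand-ins for clauses that an ILP system might construct,
# mixing syntactic, lexical and semantic conditions.
features: List[Feature] = [
    conjunction(has_fact("has_rel", "subj", "person"),
                has_fact("has_pos", "next", "noun")),
    conjunction(has_fact("has_bag", "money")),
    conjunction(has_fact("has_rel", "subj", "person"),
                has_fact("has_bag", "river")),
]

examples: List[Example] = [
    {("has_rel", "subj", "person"), ("has_pos", "next", "noun"), ("has_bag", "money")},
    {("has_rel", "subj", "person"), ("has_bag", "river")},
]

vectors = propositionalise(examples, features)
print(vectors)
```

Each row of `vectors` is a fixed-length binary representation of one example, which is the form a support vector machine or any other feature-based learner expects as input. The sense labels and the classifier itself are left out of the sketch.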

Author information

Authors and Affiliations

Xerox Research Centre Europe, 6 Chemin de Maupertuis, Meylan, 38240, France
Lucia Specia
IBM India Research Laboratory, 4-C, Institutional Area, Vasant Kunj, New Delhi, 110 070, India
Ashwin Srinivasan & Sachindra Joshi
Department of Computer Science & Engineering, IIT Bombay, Powai, Mumbai, India
Ganesh Ramakrishnan
ICMC—Universidade de São Paulo, Trabalhador São-Carlense, 400, São Carlos, 13560-970, Brazil
Maria das Graças Volpe Nunes

Corresponding author

Correspondence to Sachindra Joshi.

Additional information

Editors: Filip Zelezny and Nada Lavrac.

A.S. is also an Adjunct Professor at the Department of Computer Science and Engineering, University of New South Wales, and a Visiting Professor at the Computing Laboratory, University of Oxford.

About this article

Cite this article

Specia, L., Srinivasan, A., Joshi, S. et al. An investigation into feature construction to assist word sense disambiguation. Mach Learn 76, 109–136 (2009). https://doi.org/10.1007/s10994-009-5114-x

Received: 19 November 2008
Revised: 02 April 2009
Accepted: 17 April 2009
Published: 12 June 2009
Issue Date: July 2009
DOI: https://doi.org/10.1007/s10994-009-5114-x
