Abstract
Supersense tagging is the problem of assigning a word its corresponding semantic supersense tag (e.g., Phenomenon, Act), typically on the basis of syntactic information and annotated corpora. We instead rely on semantic information, because Korean has a relatively flexible syntactic structure and lacks annotated corpora. To construct an automatic sense tagging system for Korean, we use as semi-supersenses the first- and second-level classes of Sejong’s Noun Semantic Class System. We employ a hybrid approach consisting of three phases: one morphological matching phase and two semantic matching phases. The morphological phase is based on suffix pattern matching, which assigns a compound word to the class that contains its suffix word. One of the two semantic matching phases is based on concept similarity in WordNet, and the other on term similarity in a term matrix reduced by singular value decomposition (SVD). Both semantic phases use a weighted k-Nearest Neighbor classifier but differ in their similarity metrics. In our experiments, 79,103 unknown words were extracted from the 225,779 nouns in a syntactically tagged corpus, and 98% of the unknown words were addressed by our hybrid method.
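As a rough illustration of the phase structure described above, the following Python sketch chains a suffix-matching step with a similarity-weighted k-Nearest Neighbor fallback. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the English toy words, the class assignments, and the two-dimensional vectors are all hypothetical. Cosine similarity stands in for both of the paper's similarity metrics, and the WordNet lookup and the SVD reduction are omitted (the vectors are assumed to be already reduced).

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy lexicon: suffix word -> semantic class (not from the paper).
SUFFIX_CLASS = {"disease": "Phenomenon", "work": "Act"}

# Hypothetical labeled vectors, assumed to come from an already reduced term matrix.
LABELED = [
    ([1.0, 0.0], "Act"),
    ([0.9, 0.1], "Act"),
    ([0.0, 1.0], "Phenomenon"),
]


def match_suffix(word, suffix_class):
    """Return the class of the longest known proper suffix of `word`, or None."""
    for start in range(1, len(word)):  # longest suffix first
        suffix = word[start:]
        if suffix in suffix_class:
            return suffix_class[suffix]
    return None


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def weighted_knn(query, labeled, similarity, k=3):
    """Let each of the k most similar neighbors vote with weight = similarity."""
    scored = [(similarity(query, vec), label) for vec, label in labeled]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = defaultdict(float)
    for score, label in scored[:k]:
        votes[label] += score
    return max(votes, key=votes.get)


def tag(word, vector, suffix_class, labeled, k=3):
    """Try the morphological phase first, then fall back to the semantic phase."""
    label = match_suffix(word, suffix_class)
    if label is not None:
        return label
    return weighted_knn(vector, labeled, cosine, k)
```

Passing the similarity function as a parameter mirrors the abstract's point that the two semantic phases share one classifier but use different metrics; a WordNet-based concept similarity could be supplied in place of `cosine` without changing `weighted_knn`. The weighting assumes non-negative similarities, which holds for the non-negative toy vectors used here.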
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, YB., Lee, JK., Kim, YS. (2012). A Multi-phase Semi-supersense Tagging of Korean Unknown Nouns. In: Lee, G., Howard, D., Ślęzak, D., Hong, Y.S. (eds) Convergence and Hybrid Information Technology. ICHIT 2012. Communications in Computer and Information Science, vol 310. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32692-9_97
DOI: https://doi.org/10.1007/978-3-642-32692-9_97
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32691-2
Online ISBN: 978-3-642-32692-9
eBook Packages: Computer Science, Computer Science (R0)