Abstract
A key task in Web-based learning is retrieving research articles from a specific domain. Data mining techniques and semantic web technologies can be used to retrieve the documents relevant to a user. In this work we extract author-supplied keywords from a collection of computer science articles, since these keywords have a stronger influence on the topic of an article than other words do. For each keyword, a term weight is computed using fuzzy logic based on three criteria: whether the keyword maps to a concept in the domain ontology, the keyword's frequency in the title, and the keyword's frequency in the abstract. Using the domain ontology, the keywords of each document and their term weights are represented hierarchically as XML documents, which are then clustered. The experimental results show that the proposed technique yields better precision and recall than some existing approaches.
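The three-criterion fuzzy term weighting described above can be illustrated with a minimal sketch. The abstract does not give the membership functions, the rule base, or the defuzzification method, so everything below is an assumption made for illustration: the function names, the linear "low"/"high" memberships, the five rules, and their output weights are hypothetical and are not the authors' actual design. Inputs are assumed to be pre-normalized to the range [0, 1].

```python
def high(x: float) -> float:
    """Membership in the fuzzy set 'high': a linear ramp clipped to [0, 1] (assumed shape)."""
    return min(1.0, max(0.0, x))


def low(x: float) -> float:
    """Membership in the fuzzy set 'low', taken as the complement of 'high'."""
    return 1.0 - high(x)


def term_weight(ontology_match: float, title_freq: float, abstract_freq: float) -> float:
    """Combine the three criteria into one weight in [0, 1].

    Fuzzy AND is min, fuzzy OR is max. Each rule pairs a firing strength
    with a crisp output weight; the result is their weighted average.
    The rules and output weights are illustrative assumptions.
    """
    o, t, a = ontology_match, title_freq, abstract_freq
    rules = [
        (min(high(o), high(t)), 1.0),               # ontology concept and frequent in title
        (min(high(o), high(a)), 0.8),               # ontology concept and frequent in abstract
        (min(high(o), low(t), low(a)), 0.5),        # ontology concept only
        (min(low(o), max(high(t), high(a))), 0.4),  # frequent, but no ontology concept
        (min(low(o), low(t), low(a)), 0.1),         # none of the criteria hold
    ]
    total_strength = sum(strength for strength, _ in rules)
    if total_strength == 0.0:
        return 0.0
    return sum(strength * weight for strength, weight in rules) / total_strength


if __name__ == "__main__":
    # A keyword that maps to an ontology concept and appears in the title and abstract
    print(term_weight(1.0, 1.0, 1.0))
    # A keyword that satisfies none of the criteria
    print(term_weight(0.0, 0.0, 0.0))
```

Under these assumed rules, a keyword that satisfies all three criteria receives a weight near the top of the range, and one that satisfies none receives a weight near the bottom. The resulting weights could then be attached to keyword elements in the per-document XML representation before clustering.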
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Periakaruppan, R., Nadarajan, R. (2015). Automatic Clustering of Research Articles Using Domain Ontology and Fuzzy Logic. In: Chiu, D., et al. Advances in Web-Based Learning – ICWL 2013 Workshops. ICWL 2013. Lecture Notes in Computer Science, vol 8390. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46315-4_5
DOI: https://doi.org/10.1007/978-3-662-46315-4_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-46314-7
Online ISBN: 978-3-662-46315-4
eBook Packages: Computer Science, Computer Science (R0)