Discovering Conceptual Page Hierarchy of a Web Site from User Traversal History

Chen, Xia; Li, Minqiang; Zhao, Wei; Chen, Ding-Yi

doi:10.1007/11527503_64


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3584)

Included in the following conference series:

International Conference on Advanced Data Mining and Applications


Abstract

A Web site generally covers a wide range of topics and provides information for users with different access interests and goals. This information is not randomly scattered but organized under a hierarchy encoded in the hyperlink structure of the Web site, which is intended to shape users’ mental models of how the information is organized. At the same time, user traversals over hyperlinks between Web pages can reveal semantic relationships between those pages. Unfortunately, the link structure of a Web site, which represents the Web designer’s expectations of visitors, may differ considerably from the organization that visitors to the site expect. Discovering the conceptual page hierarchy from the users’ perspective can help webmasters gain insight into the real relationships among the Web pages and refine the link structure of the Web site to facilitate effective user navigation. In this paper, we propose a method to generate a conceptual page hierarchy of a Web site on the basis of user traversal history. We use maximal forward references to model users’ traversal behavior over the underlying link hierarchy of a Web site. We then build a weighted directed graph to represent the inter-relationships between Web pages. Finally, we apply a “Maximum Spanning Tree” (MST) algorithm to generate a conceptual page hierarchy of the Web site. We demonstrate the effectiveness of our approach with a preliminary experiment on real-world Web data.
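
The three steps named in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not say how edge weights are defined or which spanning-tree algorithm is used, so the code assumes that an edge weight is the number of times a link appears in the maximal forward references, and it substitutes a simple Prim-style greedy growth from an assumed root page for the paper's MST step. All function names are hypothetical.

```python
from collections import defaultdict
import heapq


def maximal_forward_references(session):
    """Split one session (ordered page ids) into maximal forward references.

    A revisit of a page already on the current path is treated as a backward
    move: it ends the current forward path, which is emitted only if the
    previous move was a forward one.
    """
    path, refs, moving_forward = [], [], False
    for page in session:
        if page in path:
            if moving_forward:
                refs.append(list(path))
            # Backtrack: cut the path back to the revisited page.
            del path[path.index(page) + 1:]
            moving_forward = False
        else:
            path.append(page)
            moving_forward = True
    if moving_forward and path:
        refs.append(list(path))
    return refs


def build_graph(sessions):
    """Weighted directed graph as {(u, v): weight}.

    Assumption: weight = how often link u -> v occurs across all
    maximal forward references.
    """
    weights = defaultdict(int)
    for session in sessions:
        for ref in maximal_forward_references(session):
            for u, v in zip(ref, ref[1:]):
                weights[(u, v)] += 1
    return dict(weights)


def greedy_hierarchy(weights, root):
    """Grow a tree from `root`, always attaching the heaviest available edge.

    Prim-style greedy stand-in for the paper's MST step; on directed graphs
    it is a heuristic, not a guaranteed maximum spanning arborescence.
    Returns {page: parent}, with the root mapped to None.
    """
    out = defaultdict(list)
    for (u, v), w in weights.items():
        out[u].append((w, v))
    parent = {root: None}
    heap = [(-w, root, v) for w, v in out[root]]
    heapq.heapify(heap)
    while heap:
        _, u, v = heapq.heappop(heap)
        if v in parent:
            continue  # v was already attached through a heavier edge
        parent[v] = u
        for w, nxt in out[v]:
            if nxt not in parent:
                heapq.heappush(heap, (-w, v, nxt))
    return parent


# One hypothetical session; each letter stands for a page.
session = list("ABCDCBEGHGWAOUOV")
print(["".join(r) for r in maximal_forward_references(session)])
# -> ['ABCD', 'ABEGH', 'ABEGW', 'AOU', 'AOV']
print(greedy_hierarchy(build_graph([session]), root="A"))
```

With a single session the resulting tree simply mirrors the paths that were walked; the hierarchy becomes informative only when many sessions compete to decide each page's parent. An exact maximum spanning arborescence on a directed graph would call for an algorithm such as Chu-Liu/Edmonds instead of the greedy step used here.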




Author information

Authors and Affiliations

School of Electronic and Information Engineering, Tianjin University, Tianjin, 300072, P.R. China
Xia Chen & Wei Zhao
School of Management, Tianjin University, Tianjin, 300072, P.R. China
Minqiang Li
School of Information Technology and Electrical Engineering, University of Queensland, QLD, 4072, Australia
Ding-Yi Chen


Editor information

Editors and Affiliations

School of Information Technology and Electrical Engineering, The University of Queensland, 4072, Brisbane, Queensland, Australia
Xue Li
The State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 430072, Wuhan, China
Shuliang Wang
School of ITEE, The Univ of Queensland, St. Lucia, 4072, QLD, Australia
Zhao Yang Dong


Cite this paper

Chen, X., Li, M., Zhao, W., Chen, D.-Y. (2005). Discovering Conceptual Page Hierarchy of a Web Site from User Traversal History. In: Li, X., Wang, S., Dong, Z.Y. (eds) Advanced Data Mining and Applications. ADMA 2005. Lecture Notes in Computer Science, vol 3584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527503_64


DOI: https://doi.org/10.1007/11527503_64
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27894-8
Online ISBN: 978-3-540-31877-4
eBook Packages: Computer Science, Computer Science (R0)
