Discovering Conceptual Page Hierarchy of a Web Site from User Traversal History

  • Conference paper
Advanced Data Mining and Applications (ADMA 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3584)

Abstract

A Web site generally contains a wide range of topics that provide information for users with different access interests and goals. This information is not randomly scattered, but organized under a hierarchy encoded in the hyperlink structure of the Web site. The hierarchy is intended to mold the user's mental model of how the information is organized. On the other hand, user traversals over hyperlinks between Web pages can reveal semantic relationships between these pages. Unfortunately, the link structure of a Web site, which represents the Web designer's expectations of visitors, may be quite different from the organization that visitors to the site expect. Discovering the conceptual page hierarchy from the user's angle can help webmasters gain insight into the real relationships among the Web pages and refine the link structure of the Web site to facilitate effective user navigation. In this paper, we propose a method to generate a conceptual page hierarchy of a Web site on the basis of user traversal history. We use maximal forward references to model users' traversal behavior over the underlying link hierarchy of a Web site. We then build a weighted directed graph to represent the inter-relationships between Web pages. Finally, we apply a "Maximum Spanning Tree" (MST) algorithm to generate a conceptual page hierarchy of the Web site. We demonstrate the effectiveness of our approach by conducting a preliminary experiment based on real-world Web data.
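The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: a session is taken to be an ordered list of page identifiers, an edge weight is taken to be the number of times a transition appears in the maximal forward references, and the tree is grown with a Prim-style greedy heuristic from a chosen root page. That heuristic always returns a tree but is not guaranteed to be an optimal maximum spanning arborescence on a directed graph. All function names are hypothetical.

```python
from collections import defaultdict


def maximal_forward_references(session):
    """Split one session (an ordered list of pages) into maximal forward references."""
    path, refs, moving_forward = [], [], False
    for page in session:
        if page in path:
            # Backward reference: emit the current path if the previous move
            # was forward, then back up to the revisited page.
            if moving_forward:
                refs.append(list(path))
            path = path[: path.index(page) + 1]
            moving_forward = False
        else:
            path.append(page)
            moving_forward = True
    if moving_forward and path:
        refs.append(list(path))
    return refs


def build_weighted_graph(sessions):
    """Count page-to-page transitions over all maximal forward references.

    Assumption: edge weight = transition count; the abstract does not
    specify how the weights are defined.
    """
    weights = defaultdict(int)
    for session in sessions:
        for ref in maximal_forward_references(session):
            for src, dst in zip(ref, ref[1:]):
                weights[(src, dst)] += 1
    return dict(weights)


def greedy_page_hierarchy(weights, root):
    """Grow a tree from root, repeatedly adding the heaviest edge that leaves it.

    Returns a dict mapping each reached page to its parent (root maps to None).
    This is a Prim-style heuristic, not an exact arborescence algorithm.
    """
    parent = {root: None}
    while True:
        best = None
        for (src, dst), w in weights.items():
            if src in parent and dst not in parent:
                if best is None or w > best[0]:
                    best = (w, src, dst)
        if best is None:
            return parent
        parent[best[2]] = best[1]


if __name__ == "__main__":
    session = list("ABCDCBEGHGWAOUOV")
    graph = build_weighted_graph([session])
    print(greedy_page_hierarchy(graph, "A"))
```

Pages that are unreachable from the chosen root are simply left out of the returned mapping, so the choice of root (for example, the site's home page) matters in this sketch.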

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, X., Li, M., Zhao, W., Chen, DY. (2005). Discovering Conceptual Page Hierarchy of a Web Site from User Traversal History. In: Li, X., Wang, S., Dong, Z.Y. (eds) Advanced Data Mining and Applications. ADMA 2005. Lecture Notes in Computer Science, vol 3584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527503_64

  • DOI: https://doi.org/10.1007/11527503_64

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-27894-8

  • Online ISBN: 978-3-540-31877-4

  • eBook Packages: Computer Science, Computer Science (R0)
