Mining Maximal Frequently Changing Subtree Patterns from XML Documents

Chen, Ling; Bhowmick, Sourav S.; Chia, Liang-Tien

doi:10.1007/978-3-540-30076-2_7

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3181)

Included in the following conference series:

International Conference on Data Warehousing and Knowledge Discovery

Abstract

Because online information is dynamic, XML documents typically evolve over time. Changes to the data values or structure of an XML document may exhibit particular patterns. In this paper, we focus on the sequence of changes to the structure of an XML document to find out which subtrees in the XML structure frequently change together; we call such sets of subtrees Frequently Changing Subtree Patterns (FCSPs). To keep the discovered patterns concise, we further define the problem of mining maximal FCSPs. We develop an algorithm derived from FP-growth to mine the set of maximal FCSPs. Experimental results show that our algorithm is substantially faster than the naive algorithm and that it scales well with respect to the size of the XML structure.
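
To make the idea of a maximal pattern concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes that each change between two consecutive document versions has already been reduced to a set of changed subtree identifiers, and that a set of subtrees counts as "frequently changing together" when it appears in at least a `min_support` fraction of those change sets. The paper's own definition of an FCSP may use different measures. The function names, the threshold, and the sample data are all hypothetical, and the enumeration is brute force, in the spirit of a naive baseline rather than the FP-growth-derived method the abstract describes.

```python
from itertools import combinations


def frequent_cochange_sets(deltas, min_support):
    """Map each frequently co-changing subtree set to its support.

    deltas: list of sets, one per version-to-version change, holding the
            identifiers of the subtrees that changed (assumed input format).
    min_support: minimum fraction of deltas that must contain the whole set
                 (hypothetical threshold, not the paper's metric).
    """
    n = len(deltas)
    items = sorted(set().union(*deltas)) if deltas else []
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            # Count the deltas in which every subtree of the candidate changed.
            count = sum(1 for delta in deltas if candidate <= delta)
            if count / n >= min_support:
                frequent[candidate] = count / n
                found_any = True
        if not found_any:
            # Support cannot grow when a set gets larger, so stop here.
            break
    return frequent


def maximal_sets(frequent):
    """Keep only the sets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < t for t in frequent)]


# Hypothetical change history: four deltas over subtrees named by their root tag.
deltas = [
    {"book", "author", "price"},
    {"book", "author"},
    {"book", "author", "price"},
    {"review"},
]
frequent = frequent_cochange_sets(deltas, min_support=0.5)
maximal = maximal_sets(frequent)
print(len(frequent), "frequent sets,", len(maximal), "maximal")
```

In this toy history seven subtree sets meet the threshold, yet a single maximal set, {author, book, price}, summarizes all of them, which is the conciseness argument the abstract makes for mining maximal FCSPs.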

Author information

Authors and Affiliations

School of Computer Engineering, Nanyang Technological University, Singapore 639798, Singapore
Ling Chen, Sourav S. Bhowmick & Liang-Tien Chia

Editor information

Editors and Affiliations

Graduate School of Informatics, Kyoto University, Yoshida-Honmachi, 606-8501, Sakyo, Kyoto, Japan
Yahiko Kambayashi
I.B.M. India Research Lab, India
Mukesh Mohania
Institute for Application Oriented Knowledge Processing (FAW), Johannes Kepler University Linz, Austria
Wolfram Wöß

Cite this paper

Chen, L., Bhowmick, S.S., Chia, LT. (2004). Mining Maximal Frequently Changing Subtree Patterns from XML Documents. In: Kambayashi, Y., Mohania, M., Wöß, W. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2004. Lecture Notes in Computer Science, vol 3181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30076-2_7

DOI: https://doi.org/10.1007/978-3-540-30076-2_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22937-7
Online ISBN: 978-3-540-30076-2
eBook Packages: Springer Book Archive
