Abstract
With the rapid growth of computational biology and e-commerce applications, high-dimensional data has become very common. Mining high-dimensional data is therefore an urgent problem of great practical importance. However, it poses some unique challenges, including (1) the curse of dimensionality and, more crucially, (2) the meaningfulness of the similarity measure in high-dimensional space. In this chapter, we present several state-of-the-art techniques for analyzing high-dimensional data, e.g., frequent pattern mining, clustering, and classification, and discuss how these methods deal with the challenges of high dimensionality.
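The second challenge can be made concrete with a small experiment. The following is a minimal sketch, not taken from the chapter: it assumes points drawn uniformly from the unit cube and Euclidean distance, and the function name `relative_contrast` and its parameters are hypothetical. It measures how far apart the nearest and farthest neighbors of a query point are, relative to the nearest distance.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for one random query point.

    Assumption for illustration only: points are uniform in [0, 1]^dim
    and similarity is measured by Euclidean distance.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # Print the contrast for a few dimensionalities.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

Under these assumptions the contrast shrinks as the dimensionality grows, so the nearest and farthest neighbors become hard to tell apart. Real data sets are rarely uniform, so the effect may be weaker or stronger in practice.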
Copyright information
© 2005 Springer Science+Business Media, Inc.
About this chapter
Cite this chapter
Wang, W., Yang, J. (2005). Mining High-Dimensional Data. In: Maimon, O., Rokach, L. (eds) Data Mining and Knowledge Discovery Handbook. Springer, Boston, MA. https://doi.org/10.1007/0-387-25465-X_37
DOI: https://doi.org/10.1007/0-387-25465-X_37
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-24435-8
Online ISBN: 978-0-387-25465-4
eBook Packages: Computer Science, Computer Science (R0)