Abstract
Clustering high-dimensional data is challenging for traditional clustering methods. Spectral clustering is one of the most popular methods for clustering high-dimensional data, and the similarity matrix plays an important role in it. Recently, sparse representation coefficients have been used to construct the similarity matrix for spectral clustering, taking the cosine similarity between each pair of coefficient vectors, with promising results. However, sparse representation overemphasizes the role of \( \ell_{1} \)-norm sparsity and ignores the role of collaborative representation, which makes its computational cost very high. In this paper, we propose a spectral clustering method whose similarity matrix is constructed from collaborative representation coefficient vectors. Extensive experiments show that the proposed method is highly competitive in terms of both computational cost and clustering performance.
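The pipeline outlined in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract gives no formulas, so the coding step is assumed to be an \( \ell_{2} \)-regularized least-squares fit of each sample over all the others, negative cosine values are assumed to be clipped to zero, and the clustering step is assumed to follow the usual normalized spectral embedding plus k-means. All function names and the parameter `lam` are hypothetical.

```python
import numpy as np

def collaborative_coefficients(X, lam=0.1):
    """Code each sample over all other samples with an l2 (ridge) penalty.

    X has shape (n_samples, n_features). Row i of the result holds the
    coefficients of sample i; the self-coefficient C[i, i] stays zero.
    The ridge form and lam are assumptions, not taken from the paper.
    """
    n = X.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        others = np.delete(np.arange(n), i)
        D = X[others].T                          # dictionary: columns are the other samples
        gram = D.T @ D + lam * np.eye(n - 1)     # regularized Gram matrix
        C[i, others] = np.linalg.solve(gram, D.T @ X[i])
    return C

def cosine_similarity_matrix(C):
    """Cosine similarity between coefficient vectors, clipped to [0, 1] (assumption)."""
    unit = C / np.maximum(np.linalg.norm(C, axis=1, keepdims=True), 1e-12)
    W = np.clip(unit @ unit.T, 0.0, 1.0)
    W = (W + W.T) / 2                            # enforce exact symmetry
    np.fill_diagonal(W, 0.0)                     # no self-similarity
    return W

def spectral_embedding(W, k):
    """Top-k eigenvectors of D^{-1/2} W D^{-1/2}, with rows scaled to unit length."""
    d = np.maximum(W.sum(axis=1), 1e-12)
    L = W / np.sqrt(np.outer(d, d))
    L = (L + L.T) / 2
    _, vecs = np.linalg.eigh(L)                  # eigenvalues in ascending order
    Y = vecs[:, -k:]
    return Y / np.maximum(np.linalg.norm(Y, axis=1, keepdims=True), 1e-12)

def kmeans(Y, k, n_iter=50, seed=0):
    """Plain Lloyd iterations with a seeded random start (illustrative only)."""
    rng = np.random.default_rng(seed)
    centers = Y[rng.choice(len(Y), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):              # keep the old center if a cluster empties
                centers[j] = Y[labels == j].mean(axis=0)
    return labels

# Synthetic data: two groups of 10 samples in 50 dimensions (made up for this sketch).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (10, 50)) + 3, rng.normal(0, 1, (10, 50)) - 3])
C = collaborative_coefficients(X)
W = cosine_similarity_matrix(C)
labels = kmeans(spectral_embedding(W, 2), 2)
```

Under these assumptions each coding step is a regularized linear solve with a closed-form solution, whereas an \( \ell_{1} \)-penalized fit needs an iterative solver; this is one plausible reading of the computational advantage the abstract claims.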
Acknowledgement
This work is supported by the National Natural Science Foundation of China (Grant nos. 61474267, 60973153 and 61471169) and by the Collaboration and Innovation Center for Digital Chinese Medicine of the 2011 Project of Colleges and Universities in Hunan Province.
Copyright information
© 2015 Springer International Publishing Switzerland
Cite this paper
Wang, S., Gu, J., Chen, F. (2015). Clustering High-Dimensional Data via Spectral Clustering Using Collaborative Representation Coefficients. In: Huang, DS., Jo, KH., Hussain, A. (eds) Intelligent Computing Theories and Methodologies. ICIC 2015. Lecture Notes in Computer Science, vol 9226. Springer, Cham. https://doi.org/10.1007/978-3-319-22186-1_25
DOI: https://doi.org/10.1007/978-3-319-22186-1_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22185-4
Online ISBN: 978-3-319-22186-1
eBook Packages: Computer Science (R0)