Semi-supervised sparse representation collaborative clustering of incomplete data

Applied Intelligence

Abstract

Sparse subspace clustering (SSC) reveals the structure and distribution of high-dimensional data from an algebraic perspective. It is a two-phase technique: it first computes a sparse representation of the high-dimensional data and then cuts the induced affinity graph. Treating the two phases separately cannot guarantee an optimal or expected clustering result. To address this challenge, this paper proposes subspace representation collaborative clustering (SRCC) for incomplete high-dimensional data. The proposed model integrates sparse subspace representation and clustering into a unified optimization, in which a fuzzy partition matrix serves as a bridge for clustering the extracted sparse representation features of the data. The missing entries are imputed adaptively alongside the two phases. To generalize SRCC to the semi-supervised case, an adjacency matrix of the incomplete data is constructed from ‘Must-link’ and ‘Cannot-link’ constraints. In addition, a semi-supervised indicator matrix is introduced to strengthen the model's capacity to reveal the global and local structures of incomplete data and to improve clustering performance. The resulting model is semi-supervised sparse representation collaborative clustering (S3RCC). Extensive experiments on many real-world benchmark datasets demonstrate that the two proposed models outperform state-of-the-art methods on imputation and clustering of incomplete data.
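
The ‘Must-link’ and ‘Cannot-link’ idea mentioned in the abstract can be illustrated with a small sketch. The preview does not show how the paper actually builds its adjacency matrix, so the encoding below is only an assumption: the function name `pairwise_constraint_matrix`, the use of `-1` to mark unlabeled samples, and the values `+1` / `-1` / `0` for must-link / cannot-link / unknown pairs are all hypothetical choices made for illustration.

```python
import numpy as np

def pairwise_constraint_matrix(labels):
    """Build a symmetric pairwise-constraint matrix from partial labels.

    Assumed (hypothetical) convention, not taken from the paper:
      labels[i] >= 0  -> sample i has a known class label
      labels[i] == -1 -> sample i is unlabeled
    Returned entries:
      +1  must-link   (both labeled, same class)
      -1  cannot-link (both labeled, different classes)
       0  no side information (at least one sample unlabeled, or i == j)
    """
    labels = np.asarray(labels)
    known = labels >= 0
    # Pairs where both samples carry a label.
    both_known = np.outer(known, known)
    # Pairs whose label values coincide.
    same = labels[:, None] == labels[None, :]

    A = np.zeros((labels.size, labels.size), dtype=int)
    A[both_known & same] = 1
    A[both_known & ~same] = -1
    # A sample is not constrained against itself.
    np.fill_diagonal(A, 0)
    return A

# Four samples: two in class 0, one in class 1, one unlabeled.
labels = np.array([0, 0, 1, -1])
A = pairwise_constraint_matrix(labels)
print(A)
```

In this toy case, samples 0 and 1 are must-linked, both are cannot-linked to sample 2, and the unlabeled sample 3 carries no constraints. A matrix of this kind is one common way to pass partial supervision to a clustering objective; how S3RCC weights or combines it with the sparse representation is not specified in the preview.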

Notes

  1. The Yale face dataset: http://cvc.cs.yale.edu/cvc/projects/yalefaces/yalefaces.html

  2. The Extended Yale B (EYB) face dataset: http://www.cad.zju.edu.cn/home/dengcai/Data/FaceData.html

  3. The JAFFE face dataset: http://www.kasrl.org/jaffe.html

  4. The AR face dataset: http://www2.ece.ohio-state.edu/aleix/ARdatabase.html

  5. The COIL20 object dataset: https://www.cs.columbia.edu/CAVE/software/softlib/coil-20.php

  6. The COIL100 object dataset: https://cave.cs.columbia.edu/repository/COIL-100

  7. The Riccardo dataset: https://www.openml.org/search?type=data&status=active&tags.tag=chalearn&id=41161&sort=runs

  8. The Tomlins-2006-v1 dataset: https://schlieplab.org/Static/Supplements/CompCancer/datasets.htm

  9. The Su-2001 dataset: https://schlieplab.org/Static/Supplements/CompCancer/datasets.htm

Acknowledgements

This paper was supported by the National Natural Science Foundation of China with grant 12171115.

Author information

Corresponding author

Correspondence to Tingquan Deng.

Ethics declarations

Competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Deng, T., Wang, J., Jia, Q. et al. Semi-supervised sparse representation collaborative clustering of incomplete data. Appl Intell 53, 31077–31105 (2023). https://doi.org/10.1007/s10489-023-05168-1

  • DOI: https://doi.org/10.1007/s10489-023-05168-1
