Semi-supervised clustering ensemble based on genetic algorithm model

Multimedia Tools and Applications

Abstract

Clustering ensemble can be formulated as a mathematical optimization problem, and genetic algorithms are widely used to solve such problems. However, existing research on clustering ensembles based on the genetic algorithm model has focused mainly on unsupervised approaches and has been limited by parameters such as crossover probability and mutation probability. This paper presents a semi-supervised clustering ensemble based on the genetic algorithm model. The approach uses pairwise constraint information to strengthen the crossover and mutation processes, which improves overall algorithm performance. To validate the approach, extensive comparative experiments were conducted on nine diverse datasets. The results demonstrate the superiority of the proposed algorithm in clustering accuracy and robustness, making it a promising solution for real-world clustering problems.
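The abstract does not specify the operators, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a toy fitness (mean pair-agreement with the base partitions minus a penalty for violated constraints), a mutation that repairs one violated must-link or cannot-link pair, and a plain one-point crossover that ignores label alignment between parents. All function names, the penalty weight, and the example data are hypothetical.

```python
import random
from itertools import combinations

def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree (same vs. different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

def violated(labels, must_link, cannot_link):
    """Return the constraints that a labeling breaks, tagged by type."""
    bad = [("ml", i, j) for i, j in must_link if labels[i] != labels[j]]
    bad += [("cl", i, j) for i, j in cannot_link if labels[i] == labels[j]]
    return bad

def fitness(labels, base, must_link, cannot_link, penalty=1.0):
    """Assumed fitness: mean agreement with base partitions minus a constraint penalty."""
    consensus = sum(rand_index(labels, p) for p in base) / len(base)
    n_constraints = len(must_link) + len(cannot_link)
    frac_bad = len(violated(labels, must_link, cannot_link)) / max(1, n_constraints)
    return consensus - penalty * frac_bad

def guided_mutation(labels, k, must_link, cannot_link, rng):
    """Repair one violated constraint if any exists; otherwise relabel a random point.
    Requires k >= 2 so that a cannot-link pair can be separated."""
    child = list(labels)
    bad = violated(child, must_link, cannot_link)
    if bad:
        kind, i, j = rng.choice(bad)
        if kind == "ml":
            child[j] = child[i]
        else:
            child[j] = rng.choice([c for c in range(k) if c != child[i]])
    else:
        child[rng.randrange(len(child))] = rng.randrange(k)
    return child

def crossover(p1, p2, rng):
    """One-point crossover; label alignment between parents is ignored in this sketch."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]

def evolve(base, k, must_link, cannot_link, generations=30, pop_size=12, seed=0):
    """Toy genetic search over consensus labelings with elitism."""
    rng = random.Random(seed)
    score = lambda ind: fitness(ind, base, must_link, cannot_link)
    pop = [list(p) for p in base]
    while len(pop) < pop_size:
        pop.append(guided_mutation(rng.choice(base), k, must_link, cannot_link, rng))
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]  # the best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            child = crossover(p1, p2, rng)
            children.append(guided_mutation(child, k, must_link, cannot_link, rng))
        pop = elite + children
    return max(pop, key=score)

# Hypothetical example: three base partitions of six points, two constraints.
base = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
]
must_link = [(0, 2)]
cannot_link = [(2, 3)]
best = evolve(base, k=2, must_link=must_link, cannot_link=cannot_link)
print(best, fitness(best, base, must_link, cannot_link))
```

Because the best half of each generation is carried over unchanged, the returned labeling scores at least as well as the best base partition under this assumed fitness. A fuller treatment would align cluster labels across parents before crossover.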

Acknowledgements

This work was supported by the National Natural Science Foundation of China (11961010, 61967004).

Author information

Corresponding author

Correspondence to Xiangli Li.

Ethics declarations

Conflicts of interest

The authors declare no competing interests.

Ethical Approval

This article does not contain any studies with human or animal subjects performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bi, S., Li, X. Semi-supervised clustering ensemble based on genetic algorithm model. Multimed Tools Appl 83, 55851–55865 (2024). https://doi.org/10.1007/s11042-023-17662-2

  • DOI: https://doi.org/10.1007/s11042-023-17662-2
