Combining core points and cluster-level semantic similarity for self-supervised clustering

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Contrastive learning uses data augmentation to guide network training, and it has attracted considerable attention for clustering, object detection, and image segmentation. However, previous studies have ignored the impact of false-negative pairs, which makes the semantic representations of samples in the same cluster dissimilar. Some researchers have attempted to address this problem, but considering only the image level has given unsatisfactory results. To this end, we propose a novel feature extraction algorithm suited to clustering that combines core points and cluster-level semantic similarity to restructure positive and negative pairs. Specifically, the core points, which consist of the n-nearest neighbors of the cluster center, are taken to represent the semantic sample relations of the cluster. This information is used to reconstruct semantic positive and negative pairs so as to maximize intra-cluster similarity and inter-cluster variability. More accurate cluster centers offer a sub-optimal initialization for updating the feature model and the clustering assignment, which are optimized within the expectation-maximization framework. Extensive experiments on six benchmark datasets show promising clustering performance with relatively few training epochs. The proposed method outperforms the best baseline by 4% on CIFAR-100 and by 1.5% on CIFAR-10. The CPCS code is open-sourced at https://github.com/Cappuccino-Sugar/CPCS.
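The core-point idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the CPCS repository for that). The function names `core_points` and `pair_mask`, the use of Euclidean distance, and the +1/-1/0 mask encoding are all assumptions made for illustration. The sketch picks the n members closest to each cluster center as core points, then marks core points of the same cluster as positive pairs and core points of different clusters as negative pairs.

```python
import numpy as np

def core_points(features, centers, labels, n):
    """For each cluster, return the indices of the n members nearest its center.

    Euclidean distance is an assumption; the abstract does not name the metric.
    """
    cores = {}
    for k, center in enumerate(centers):
        members = np.flatnonzero(labels == k)           # samples assigned to cluster k
        dists = np.linalg.norm(features[members] - center, axis=1)
        cores[k] = members[np.argsort(dists)[:n]]       # n nearest members (fewer if the cluster is small)
    return cores

def pair_mask(cores, num_samples):
    """Build a pair matrix: +1 positive, -1 negative, 0 unused (hypothetical encoding)."""
    core_label = np.full(num_samples, -1)               # -1 marks non-core samples
    for k, idx in cores.items():
        core_label[idx] = k
    is_core = core_label >= 0
    both_core = is_core[:, None] & is_core[None, :]
    same_cluster = core_label[:, None] == core_label[None, :]
    mask = np.zeros((num_samples, num_samples), dtype=int)
    mask[both_core & same_cluster] = 1                  # core points of the same cluster
    mask[both_core & ~same_cluster] = -1                # core points of different clusters
    np.fill_diagonal(mask, 0)                           # a sample is not paired with itself
    return mask

# Toy data: two clusters of three 2-D points each.
features = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0],
                     [5.0, 5.0], [5.1, 5.0], [9.0, 9.0]])
labels = np.array([0, 0, 0, 1, 1, 1])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])

cores = core_points(features, centers, labels, n=2)
mask = pair_mask(cores, len(features))
```

In an expectation-maximization style loop, a mask like this would be recomputed after each update of the cluster assignments and then used to weight a contrastive loss. That wiring is left out here because the abstract does not specify it.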

Data availability

The data that support the findings of this study are available at https://github.com/Cappuccino-Sugar/CPCS.

Author information

Corresponding author

Correspondence to Bojun Xie.

About this article

Cite this article

Wang, W., Chen, J., Zhang, X. et al. Combining core points and cluster-level semantic similarity for self-supervised clustering. Int. J. Mach. Learn. & Cyber. 15, 3127–3142 (2024). https://doi.org/10.1007/s13042-023-02084-1

  • DOI: https://doi.org/10.1007/s13042-023-02084-1
