Abstract
Advances in information technology have made data collection much easier. Clustering is an important technique for exploring the structure of data and is used in many fields, such as customer segmentation, image recognition, and social science. In real-world applications, however, datasets often contain noise and outliers that can seriously degrade clustering performance. In addition, clustering results are sensitive to the initial centroids and to the algorithm parameters. To reduce the influence of outliers on clustering results, this study combines the advantages of possibilistic c-means and fuzzy c-ordered means to propose a fuzzy possibilistic c-ordered means (FPCOM) algorithm. To address the determination of parameters and initial centroids, the study combines FPCOM with a sine cosine algorithm (SCA) to improve the clustering results. The proposed algorithm is named SCA-FPCOM. Ten benchmark datasets from the UCI Machine Learning Repository were used to validate the proposed algorithm in terms of the adjusted Rand index and the silhouette coefficient. According to the experimental results, SCA-FPCOM obtains better results than the other algorithms compared.
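As a rough illustration of the search component mentioned in the abstract, the sketch below implements the generic sine cosine algorithm update on a toy objective. It is not the authors' SCA-FPCOM: the sphere objective, the function names `sphere` and `sca_minimize`, and all parameter values (`n_agents`, `n_iters`, `a`, `seed`) are assumptions chosen for the example. In a clustering setting, each agent would presumably encode candidate centroids and parameters, and the objective would be a clustering criterion.

```python
import math
import random


def sphere(x):
    """Toy objective (an assumption for this sketch): sum of squares."""
    return sum(v * v for v in x)


def sca_minimize(objective, dim, bounds, n_agents=20, n_iters=200, a=2.0, seed=0):
    """Generic sine cosine algorithm; not the paper's SCA-FPCOM."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial population inside the search bounds.
    agents = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    best = min(agents, key=objective)[:]
    best_score = objective(best)

    for t in range(n_iters):
        # r1 shrinks linearly from a towards 0 (exploration to exploitation).
        r1 = a - t * (a / n_iters)
        for agent in agents:
            for j in range(dim):
                r2 = rng.uniform(0.0, 2.0 * math.pi)
                r3 = rng.uniform(0.0, 2.0)
                r4 = rng.random()
                step = abs(r3 * best[j] - agent[j])
                # Switch between the sine and cosine updates with equal probability.
                if r4 < 0.5:
                    agent[j] += r1 * math.sin(r2) * step
                else:
                    agent[j] += r1 * math.cos(r2) * step
                # Clamp the coordinate back into the search bounds.
                agent[j] = min(max(agent[j], lo), hi)
            score = objective(agent)
            if score < best_score:
                best, best_score = agent[:], score
    return best, best_score


best, best_score = sca_minimize(sphere, dim=3, bounds=(-10.0, 10.0))
print(best_score)
```

The fixed seed makes the run repeatable; the best-so-far solution is only replaced when an agent strictly improves on it, so the returned score never gets worse over iterations.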
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00500-020-05380-y/MediaObjects/500_2020_5380_Fig1_HTML.png)
Ethics declarations
Conflict of interest
The authors wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome.
Human and animal rights
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Kuo, R.J., Lin, JY. & Nguyen, T.P.Q. An application of sine cosine algorithm-based fuzzy possibilistic c-ordered means algorithm to cluster analysis. Soft Comput 25, 3469–3484 (2021). https://doi.org/10.1007/s00500-020-05380-y
DOI: https://doi.org/10.1007/s00500-020-05380-y