Abstract
Identifying cohesive groups of genes for disease subtypes in a high-dimensional gene expression database is highly challenging, since the subtypes are distinguished by only slight intensity differences between gene expressions. Existing clustering methods are biased by the training dataset when identifying subtypes, and they have returned irrelevant subtypes. This paper therefore introduces an unsupervised fuzzy clustering approach to identify gene subtypes in a breast cancer database. We use a dataset that contains 12,634 genes and 288 samples to find the three available subclasses. To cluster genes of similar intensity in the breast cancer dataset, the proposed fuzzy clustering techniques incorporate a possibilistic approach, intuitionistic fuzzy sets, and kernel functions. The experimental part of this paper shows how notably the proposed clustering method identifies similar gene patterns for common subtypes of breast cancer.
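To make the kernel ingredient of the abstract concrete, the following is a minimal sketch of a textbook kernel fuzzy c-means with a Gaussian kernel. It is not the clustering method proposed in the paper: the possibilistic typicality and intuitionistic hesitation terms are omitted, and the function names (`gaussian_kernel`, `kernel_fcm`), the parameters `m` and `sigma`, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel(X, V, sigma):
    """Pairwise K(x_k, v_i) = exp(-||x_k - v_i||^2 / sigma^2), shape (n, c)."""
    sq_dist = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / sigma**2)


def kernel_fcm(X, init, m=2.0, sigma=1.0, n_iter=100, tol=1e-6):
    """Kernel fuzzy c-means with a Gaussian kernel (illustrative sketch).

    X    : (n, d) array, one row per object to be clustered
    init : (c, d) array of starting prototypes
    Returns memberships U (n, c) and prototypes V (c, d).
    """
    V = np.asarray(init, dtype=float).copy()
    for _ in range(n_iter):
        K = gaussian_kernel(X, V, sigma)
        # Kernel-induced distance 1 - K, clipped to avoid division by zero.
        d = np.clip(1.0 - K, 1e-12, None)
        # Fuzzy membership update; each row is normalised to sum to 1.
        U = d ** (-1.0 / (m - 1.0))
        U /= U.sum(axis=1, keepdims=True)
        # Prototype update: kernel-weighted mean of the data.
        W = (U**m) * K
        V_new = (W.T @ X) / W.sum(axis=0)[:, None]
        shift = np.abs(V_new - V).max()
        V = V_new
        if shift < tol:
            break
    return U, V


# Toy data (not the breast cancer dataset): two well-separated groups
# of 20 profiles with 5 features each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 5)), rng.normal(3.0, 0.3, (20, 5))])

# Start with one prototype taken from each group (hypothetical choice).
U, V = kernel_fcm(X, init=X[[0, -1]], sigma=2.0)
labels = U.argmax(axis=1)  # hard assignment from the fuzzy memberships
```

Each row of `U` holds the degrees to which one object belongs to every cluster, so objects with similar intensity profiles receive similar membership rows. The kernel width `sigma` controls how quickly similarity decays with distance and would need tuning on real expression data.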
Funding
This work was financially supported by DST India and MOST Israel.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
R, K.S., Kashyap, E. & Last, M. Kernel-Based Fuzzy Intuitionistic Possibilistic Clustering: Analyzing High-Dimensional Gene Expression Cancer Database. Data-Enabled Discov. Appl. 4, 4 (2020). https://doi.org/10.1007/s41688-020-00039-x
DOI: https://doi.org/10.1007/s41688-020-00039-x