Abstract
This paper presents a novel strategy for gene selection in cancer classification that applies the Northern Goshawk Algorithm (NGHA) to microarray datasets. The primary objective is to improve the precision of cancer diagnosis by efficiently identifying informative genes in these often-noisy datasets. The proposed method consists of two stages: a filter stage, which uses the Minimum Redundancy Maximum Relevancy (mRMR) method, and a wrapper stage, which combines NGHA with a Support Vector Machine (SVM) classifier. Experimental assessments on two microarray datasets demonstrate the method's accuracy and effectiveness. Comparative evaluations against common gene selection techniques indicate comparable performance on several datasets, with particularly promising results on two datasets. Integrating mRMR with the Northern Goshawk Algorithm therefore offers a promising strategy for gene selection in cancer classification, with the potential to improve diagnostic precision in oncology.
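To make the filter stage concrete, the following is a minimal sketch of greedy mRMR selection. It is not the paper's implementation. It assumes expression values that have already been discretized, uses the mutual-information difference criterion (relevance minus mean redundancy), and runs on invented toy data. The function names `mutual_information` and `mrmr_select` are hypothetical. The wrapper stage (NGHA combined with an SVM) is not shown.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_select(columns, labels, k):
    """Greedy mRMR, difference form: return the indices of k selected columns.

    columns: one list of discrete values per gene (assumed pre-discretized).
    labels:  one class label per sample.
    """
    relevance = [mutual_information(col, labels) for col in columns]
    # Start with the single most relevant gene.
    selected = [max(range(len(columns)), key=lambda j: relevance[j])]
    while len(selected) < min(k, len(columns)):
        best_j, best_score = None, float("-inf")
        for j in range(len(columns)):
            if j in selected:
                continue
            # Mean mutual information with the genes already chosen.
            redundancy = sum(
                mutual_information(columns[j], columns[s]) for s in selected
            ) / len(selected)
            score = relevance[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected


# Toy data (invented for illustration): 8 samples, 4 classes, 4 genes.
labels = [0, 0, 1, 1, 2, 2, 3, 3]
genes = [
    [0, 0, 0, 0, 1, 1, 1, 1],  # gene 0: separates classes {0,1} from {2,3}
    [0, 0, 0, 0, 1, 1, 1, 0],  # gene 1: noisy copy of gene 0 (redundant)
    [0, 1, 0, 1, 0, 1, 0, 1],  # gene 2: unrelated to the class label
    [0, 0, 1, 1, 0, 0, 1, 0],  # gene 3: complementary split, one noisy sample
]
top = mrmr_select(genes, labels, k=2)
print(top)  # [0, 3]
```

In this toy case gene 1 is about as relevant as gene 3 but is penalized for overlapping with gene 0, so the complementary gene 3 is chosen instead. In a two-stage pipeline of the kind the abstract describes, a reduced gene list like `top` would then be passed to the wrapper stage for subset search.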
Data availability
The data used in this research project is publicly available.
Funding
No funding was available for this work.
Author information
Contributions
Abrar Yaqoob played a pivotal role in data analysis, drawing meaningful insights from the gathered information. Their meticulous attention to detail and analytical skills greatly enriched the project. Furthermore, the author asserts that there are no conflicts of interest to declare, ensuring the integrity and impartiality of the research findings.
Ethics declarations
Conflict of interest
The author declares that they have no conflicts of interest to disclose.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yaqoob, A. Combining the mRMR technique with the Northern Goshawk Algorithm (NGHA) to choose genes for cancer classification. Int. j. inf. tecnol. (2024). https://doi.org/10.1007/s41870-024-01849-3
DOI: https://doi.org/10.1007/s41870-024-01849-3