RoughSet based Feature Selection for Prediction of Breast Cancer

Wireless Personal Communications

Abstract

Breast cancer is the deadliest cancer among women worldwide and has the highest mortality rate. Early prediction of breast cancer can improve a patient's chance of survival, so high prediction accuracy is important to avoid misdiagnosis. Machine learning algorithms can contribute to the early prediction and diagnosis of breast cancer. In this study, we use a rough set based feature selector to extract relevant features from the breast cancer feature set and classify them using machine learning algorithms: Decision Tree, Naive Bayes, Support Vector Machine, K-Nearest Neighbor, Logistic Regression, Random Forest, and AdaBoost. The main aim is to predict cancerous breast nodules using rough set driven feature selection and machine learning classification algorithms. The results were evaluated in terms of accuracy, sensitivity, specificity, and positive predictive value. Random forest outperformed all other classifiers, achieving the highest accuracy (95.23%) with the proposed approach.
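
To make the two ideas in the abstract concrete, the following is a minimal sketch of rough set feature selection and of the four reported metrics. It is not the authors' code: the abstract does not say which rough set algorithm was used, so the greedy QuickReduct procedure is assumed here as one common choice. The function names, the toy discretized records, and the confusion-matrix counts are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, labels, attrs):
    """Rough set dependency degree: share of rows in the positive region,
    i.e. rows whose equivalence class carries a single class label."""
    positive = 0
    for block in partition(rows, attrs):
        if len({labels[i] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def quick_reduct(rows, labels):
    """Greedy QuickReduct (assumed variant): add the attribute that raises
    the dependency degree most until it matches that of the full set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    reduct = []
    best = dependency(rows, labels, reduct)
    while best < target:
        candidates = [a for a in all_attrs if a not in reduct]
        gains = {a: dependency(rows, labels, reduct + [a]) for a in candidates}
        choice = max(candidates, key=lambda a: gains[a])
        reduct.append(choice)
        best = gains[choice]
    return reduct


def diagnostic_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, and positive predictive value
    computed from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


# Hypothetical, already-discretized records: three features per nodule.
rows = [(0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 0, 0)]
labels = ["benign", "benign", "malignant", "malignant", "benign", "malignant"]

selected = quick_reduct(rows, labels)
# Made-up counts, not results from the study.
metrics = diagnostic_metrics(tp=40, fp=2, tn=55, fn=3)
print(selected, metrics)
```

In the toy data only the first feature separates the two classes, so the reduct contains that feature alone. In a full pipeline, the selected columns would then be passed to a classifier such as a random forest. Real-valued features would need to be discretized first, because this classical formulation compares attribute values for exact equality.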

Data Availability

The datasets generated during the current study are available from the corresponding author on request.

Code Availability

The Python code used during the current study is available from the corresponding author on request.

Funding

The authors have not disclosed any funding.

Author information

Corresponding author

Correspondence to Hanumanthu Bhukya.

Ethics declarations

Conflict of interest

The authors have not disclosed any competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bhukya, H., Sadanandam, M. RoughSet based Feature Selection for Prediction of Breast Cancer. Wireless Pers Commun 130, 2197–2214 (2023). https://doi.org/10.1007/s11277-023-10378-4

  • DOI: https://doi.org/10.1007/s11277-023-10378-4
