An efficient stacking-based ensemble technique for early heart attack prediction

  • 1235: Supporting Medicine and Healthcare with Multimedia Tools
Multimedia Tools and Applications

Abstract

In many parts of the world, heart disease is the leading cause of death, and preventing or effectively managing it often depends on early detection. Research on using machine learning to estimate the probability of cardiovascular disease has grown significantly. This work investigates cardiovascular disease prediction using several classification methods and stacking as an ensemble technique. The analysis uses records from 1025 patients, each described by 14 clinical attributes (e.g., age, sex, chest pain type, blood pressure, and cholesterol level). The data are first preprocessed by filling in missing values, standardizing numeric features, and encoding categorical features, and are then split into a training set and a test set for model building and evaluation. Six classification methods are used: Logistic Regression, Decision Tree, Random Forest, Extreme Gradient Boosting (XGBoost), Naive Bayes, and K-Nearest Neighbors (KNN). Their performance is assessed with measures including accuracy, precision, recall, and F1-score. Random Forest and Decision Tree both yield an accuracy of 92.68%, with XGBoost a close second at 90.73%. In the second part of the study, stacking is used to improve on the individual classifiers. Stacking is an ensemble method that combines multiple base models by training a meta-classifier on their predictions, with the goal of improving predictive accuracy. The results show that stacking considerably improves on the original classifiers: the stacked model outperforms every individual classifier, with an accuracy of 98.53%.
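The workflow summarized in the abstract (imputation, standardization, encoding, a train/test split, six base classifiers, and a stacked meta-classifier) can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic stand-ins for the clinical records, `GradientBoostingClassifier` is swapped in for XGBoost to avoid an extra dependency, and the logistic-regression meta-classifier, the 80/20 split, and all hyperparameters are assumptions that the abstract does not specify.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in (assumption): 1025 rows, 13 features plus a binary target.
rng = np.random.default_rng(0)
X, y = make_classification(n_samples=1025, n_features=13,
                           n_informative=8, random_state=0)
# Turn the last column into a 4-level categorical code (like a chest pain type).
X[:, -1] = np.digitize(X[:, -1], np.quantile(X[:, -1], [0.25, 0.5, 0.75]))
# Blank out about 2% of entries so the imputation step has something to do.
X[rng.random(X.shape) < 0.02] = np.nan

numeric_cols = list(range(12))
categorical_cols = [12]

# Fill missing values, standardize numeric columns, one-hot encode categories.
preprocess = ColumnTransformer(
    [
        ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                          ("scale", StandardScaler())]), numeric_cols),
        ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                          ("encode", OneHotEncoder(handle_unknown="ignore"))]),
         categorical_cols),
    ],
    sparse_threshold=0,  # always return a dense array (GaussianNB needs one)
)

# Six base classifiers mirroring the families named in the abstract.
base_models = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("dt", DecisionTreeClassifier(random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("gb", GradientBoostingClassifier(random_state=0)),  # stand-in for XGBoost
    ("nb", GaussianNB()),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
]

# The meta-classifier is trained on out-of-fold predictions of the base models.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(max_iter=1000),  # assumed meta-classifier
    cv=5,
)
model = Pipeline([("prep", preprocess), ("stack", stack)])

# Assumed 80/20 stratified split; preprocessing is fitted on training data only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

scores = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Because the data here are synthetic, the printed scores say nothing about the accuracies reported in the abstract. Keeping the preprocessing inside the pipeline means the imputation and scaling statistics are learned from the training set alone, which avoids leaking test-set information into the model.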

Data availability

Data and materials can be provided on request.

Abbreviations

  • IoT: Internet of Things
  • XGBoost: Extreme Gradient Boosting
  • ML: Machine Learning
  • KNN: K-Nearest Neighbors
  • RF: Random Forest
  • LM: Linear Model
  • PCA: Principal Component Analysis
  • CHI: Chi-square
  • CH: Cleveland-Hungarian
  • SVM: Support Vector Machine
  • RFBM: Random Forest Bagging Method
  • FCMIM: Fast Conditional Mutual Information
  • LOSO: Leave-One-Subject-Out
  • HRFLM: Hybrid Random Forest with Linear Model
  • LASSO: Least Absolute Shrinkage and Selection Operator
  • NB: Naive Bayes
  • BN: Bayesian Network
  • MP: Multilayer Perceptron
  • GLM: Generalized Linear Model
  • LR: Logistic Regression
  • DL: Deep Learning
  • DT: Decision Tree
  • GBT: Gradient Boosted Trees
  • ANN: Artificial Neural Network
  • TP: True Positive
  • TN: True Negative
  • FP: False Positive
  • FN: False Negative
  • RBF: Radial Basis Function

Acknowledgements

The authors would like to express their gratitude to the reviewers, who provided valuable and insightful feedback.

Funding

This study received no specific funding from governmental, private, or non-profit bodies.

Author information

Corresponding author

Correspondence to Monu Bhagat.

Ethics declarations

Conflict of interest

The authors declare no conflicts of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bhagat, M., Sharma, A. & Agarwal, P. An efficient stacking-based ensemble technique for early heart attack prediction. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-19293-7

  • DOI: https://doi.org/10.1007/s11042-024-19293-7
