Abstract
Dengue, a mosquito-borne viral disease, is a growing global health concern given its escalating incidence, significant mortality, and severe clinical signs. Despite the severity of the disease, accurately classifying dengue patients remains difficult, even though correct classification is crucial for timely intervention and patient management. In this study, we propose a comprehensive method for classifying 248 dengue-positive and 252 dengue-negative patients using tabular complete blood count (CBC) data from two different hospitals. The method involves rigorous data preprocessing, including data cleansing, statistical analysis, and missing data imputation; missing values in the dataset are handled using the Multivariate Imputation by Chained Equations (MICE) algorithm. Feature ranking and selection techniques are used to identify the key characteristics among the CBC parameters and demographic variables. Thirteen classical machine learning (ML) models were trained with 5-fold cross-validation, and finally a stacking-based meta-classifier was trained using the three top-performing models for dengue patient identification, along with a nomogram-based scoring system. The Extra Tree, AdaBoost, and CatBoost classifiers excel in their performance metrics, and the XGBoost meta-classifier achieves the highest F1-score of 97.8%. The area under the receiver operating characteristic curve (ROC-AUC) is 0.976 for AdaBoost and 0.972 for Extra Tree and CatBoost, while the XGBoost meta-classifier attains 0.978. Shapley values shed light on the contribution of individual features. Our proposed approach offers a robust framework for reliable dengue detection, facilitating timely medical response and easing the burden on health systems.
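The pipeline summarized above (chained-equation imputation, three base learners, a stacked meta-classifier, 5-fold cross-validation) can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, scikit-learn's `IterativeImputer` is used as a MICE-style single imputer, and `GradientBoostingClassifier` is an assumed stand-in for both CatBoost and the XGBoost meta-learner so that only scikit-learn is required. All hyperparameters are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.ensemble import (
    AdaBoostClassifier,
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    StackingClassifier,
)
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the CBC table (assumption: 500 rows, 12 numeric features).
X, y = make_classification(
    n_samples=500, n_features=12, n_informative=6, random_state=0
)

# Blank out roughly 5% of the entries to mimic missing lab measurements.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Three base learners; the third is a stand-in for CatBoost (assumption).
base_models = [
    ("extra_trees", ExtraTreesClassifier(n_estimators=100, random_state=0)),
    ("adaboost", AdaBoostClassifier(n_estimators=50, random_state=0)),
    ("gboost", GradientBoostingClassifier(random_state=0)),
]

# The meta-learner is fitted on out-of-fold predictions of the base learners.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=GradientBoostingClassifier(random_state=0),  # stand-in for XGBoost
    cv=5,
)

# Imputation sits inside the pipeline so it is refitted on each training fold.
model = make_pipeline(IterativeImputer(max_iter=10, random_state=0), stack)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_validate(model, X, y, cv=cv, scoring=["f1", "roc_auc"])
mean_f1 = scores["test_f1"].mean()
mean_auc = scores["test_roc_auc"].mean()
print(f"F1: {mean_f1:.3f}  ROC-AUC: {mean_auc:.3f}")
```

Keeping the imputer inside the pipeline means the held-out fold never influences the imputation model, which avoids one common source of optimistic cross-validation scores. The numbers printed here come from synthetic data and say nothing about the results reported in the chapter.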
Code and Data Availability Statement
Code for data cleaning and analysis is provided as part of the replication package and is available at https://github.com/Sohan2087/A-Stacking-Ensemble-Approach-for-Robust-Dengue-Patient-Detection-from-Complete-Blood-Count-Data.
Conflicts of Interest
The authors have no conflicts of interest to declare.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Rahman, M.S. et al. (2024). A Stacking Ensemble Approach for Robust Dengue Patient Detection from Complete Blood Count Data. In: Chowdhury, M.E.H., Kiranyaz, S. (eds) Surveillance, Prevention, and Control of Infectious Diseases. Springer, Cham. https://doi.org/10.1007/978-3-031-59967-5_7
DOI: https://doi.org/10.1007/978-3-031-59967-5_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-59966-8
Online ISBN: 978-3-031-59967-5
eBook Packages: Computer Science (R0)