Abstract
Purpose: This paper uses data mining to help policy makers better understand the overall health of communities in the Chicago area through several public health indicators. Regression analysis is used to establish relationships between social, economic, and health variables.
Design/Methodology/Approach: The basic analysis aimed to identify variables of interest for further investigation by multiple regression. A correlation matrix computed in R was used to visualize associations between the independent variables and the dependent variables of interest. Backward variable selection was then applied, with the Akaike information criterion (AIC) computed for the candidate models. To group the Chicago community areas by similarity, the data were converted to a matrix, scaled, and clustered with k-means in R.
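The scaling and clustering step can be illustrated with a minimal sketch. It is written in plain Python rather than R, and it uses a simple Lloyd-style k-means, which is not necessarily the algorithm R's kmeans function applies by default. The function names (scale_columns, kmeans), the two indicator columns, and all data values are hypothetical and chosen only for illustration; they are not the study's data.

```python
import math
import random


def scale_columns(rows):
    """Standardize each column to mean 0 and sample standard deviation 1."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
        for c, m in zip(cols, means)
    ]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, sds)]
        for row in rows
    ]


def kmeans(points, k, seed=0, max_iter=100):
    """Lloyd-style k-means: assign each point to its nearest centroid,
    recompute centroids as cluster means, repeat until labels stop changing."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical community areas: (unemployment %, crowded housing %).
# These numbers are made up for illustration only.
data = [
    (5.0, 2.0), (6.0, 3.0), (7.0, 2.0),
    (20.0, 9.0), (22.0, 10.0), (21.0, 11.0),
]

scaled = scale_columns(data)
labels, centroids = kmeans(scaled, k=2)
print(labels)
```

Scaling before clustering keeps an indicator measured on a larger numeric range from dominating the Euclidean distances that k-means relies on.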
Findings: The study found that socio-economic factors such as unemployment and crowded housing contribute to higher teen birth rates in Chicago community areas, which suggests that financial problems due to unemployment could lead to teenage pregnancies. Assault, cancer, diabetes, and infant mortality all contribute to higher death rates in Chicago communities. In addition, unemployment and the lack of a high school diploma are associated with communities falling below the poverty line.
Research Limitations: Difficulty obtaining spatial data for the communities was a major limitation. Such data would offer a good way to cluster the communities and visualize their similarities.
Social Implications: Public health issues are central to sustainable industrialization within the community. This implies that health sector officials will need to focus more on the significant health issues and educate residents.
Originality: The use of a simple data mining technique to assess the public health of communities in the Chicago area distinguishes this work from other methods employed in the literature.
Ethics declarations
Conflict of Interest
Conflict of interest declaration: neither author declares a conflict of interest.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Akoto, D., Akoto, R.N.A. (2023). Public Health Predictive Analysis of Chicago Community Areas: A Data Mining Approach. In: Aigbavboa, C., et al. Sustainable Education and Development – Sustainable Industrialization and Innovation. ARCA 2022. Springer, Cham. https://doi.org/10.1007/978-3-031-25998-2_16
DOI: https://doi.org/10.1007/978-3-031-25998-2_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25997-5
Online ISBN: 978-3-031-25998-2
eBook Packages: Business and Management (R0)