A spatio-temporal model for binary data and its application in analyzing the direction of COVID-19 spread

  • Original Paper
  • AStA Advances in Statistical Analysis

Abstract

It is often of primary interest to analyze and forecast the levels of a continuous phenomenon as a categorical variable. In this paper, we propose a new spatio-temporal model to deal with this problem in a binary setting, with an application to the COVID-19 pandemic, a phenomenon that depends on both spatial proximity and temporal autocorrelation. Our model is defined through a hierarchical structure for the latent variable, which corresponds to the probit link function. The mean of the latent variable is designed to capture the trend and the seasonal pattern as well as the lagged effects of relevant regressors. The covariance structure of the model is defined as an additive combination of a zero-mean spatio-temporally correlated process and a white noise process. The parameters associated with the space-time process let us quantify how the proximity of two points in space or time influences the overall process. For estimation and prediction, we adopt a fully Bayesian framework with suitable prior specifications and use Gibbs sampling. Using county-level data from the state of New York, we show that the proposed methodology outperforms benchmark techniques. We also use our model to devise a novel mechanism for predictive clustering, which can be leveraged to develop localized policies.
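To make the structure described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a separable exponential space-time covariance and a unit-variance inverse-CDF draw for the latent variable; the names `space_time_cov`, `latent_cov`, `prob_one`, `draw_latent` and the parameters `sigma2`, `phi_s`, `phi_t`, `tau2` are hypothetical and chosen only for illustration. The paper's actual kernel, priors, and sampler may differ.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the probit link


def space_time_cov(s1, t1, s2, t2, sigma2=1.0, phi_s=1.0, phi_t=1.0):
    """Covariance of the space-time process between (s1, t1) and (s2, t2).

    Assumed form: separable exponential decay in spatial distance and
    in time lag. This is an illustrative choice, not the paper's kernel.
    """
    return sigma2 * math.exp(-phi_s * math.dist(s1, s2) - phi_t * abs(t1 - t2))


def latent_cov(points, sigma2=1.0, phi_s=1.0, phi_t=1.0, tau2=0.5):
    """Covariance matrix of the latent variable at the given points.

    Additive combination of the correlated space-time process and a
    white noise process with variance tau2 (added on the diagonal).
    `points` is a list of ((x, y), t) pairs.
    """
    n = len(points)
    cov = [[space_time_cov(*points[i], *points[j], sigma2, phi_s, phi_t)
            for j in range(n)] for i in range(n)]
    for i in range(n):
        cov[i][i] += tau2
    return cov


def prob_one(mean, sigma2=1.0, tau2=0.5):
    """Marginal P(Y = 1) = P(Z > 0) for latent Z ~ N(mean, sigma2 + tau2)."""
    return STD_NORMAL.cdf(mean / math.sqrt(sigma2 + tau2))


def draw_latent(mean, y, rng):
    """One data-augmentation draw of Z ~ N(mean, 1), truncated by y.

    Z is restricted to Z > 0 when y == 1 and to Z < 0 when y == 0,
    sampled by inverting the normal CDF on the allowed interval.
    Unit variance and independence are simplifying assumptions here.
    """
    cut = STD_NORMAL.cdf(-mean)  # P(Z <= 0)
    lo, hi = (cut, 1.0) if y == 1 else (0.0, cut)
    # Clamp away from 0 and 1 so inv_cdf stays defined.
    u = min(max(rng.uniform(lo, hi), 1e-12), 1.0 - 1e-12)
    return mean + STD_NORMAL.inv_cdf(u)


if __name__ == "__main__":
    # Two locations at time 0, and the first location again at time 2.
    pts = [((0.0, 0.0), 0), ((1.0, 0.0), 0), ((0.0, 0.0), 2)]
    for row in latent_cov(pts, phi_t=0.5):
        print([round(v, 3) for v in row])
    print(prob_one(0.8))
    print(draw_latent(0.8, 1, random.Random(1)))
```

In a full Gibbs sampler, a truncated draw of this kind would alternate with updates of the mean and covariance parameters given the latent values; the sketch shows only the latent-variable step and the shape of the covariance.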

Data availability

Data used in the main analysis are extracted from the COVID-19 GitHub repository maintained by the Center for Systems Science and Engineering at Johns Hopkins University (link: https://github.com/CSSEGISandData/COVID-19). The code for the spatio-temporal model for binary data discussed in this paper, along with the pre-processed and cleaned data, is available at the GitHub repository (link: https://github.com/anaghchattopadhyay/Spatio-temporal-model-for-binary-data), maintained by the first author.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Corresponding author

Correspondence to Soudeep Deb.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chattopadhyay, A., Deb, S. A spatio-temporal model for binary data and its application in analyzing the direction of COVID-19 spread. AStA Adv Stat Anal (2024). https://doi.org/10.1007/s10182-024-00507-0

  • DOI: https://doi.org/10.1007/s10182-024-00507-0
