1 Introduction

In this era of big data, analysts use data to assist governments in making decisions, as the availability of government-provided health, agricultural, energy, tourism, and economic data [1] has increased [2]. The Internet of Things revolves around real-time decision making: it centers on decision informatics and embraces advanced big-data technology [3]. Many methods are used to analyze, optimize, and forecast large data sets [4].

COVID-19 has created a global emergency that affects many aspects of life, including health, economics, and politics [5]. It is an infectious disease caused by a coronavirus. Most people infected with the COVID-19 virus experience mild to moderate respiratory illness and recover without requiring special treatment. Older people and people with underlying medical problems such as cardiovascular disease, diabetes, chronic respiratory disease, and cancer are more likely to develop serious illness. Although COVID-19 has a lower mortality rate than severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS), the virus has higher transmissibility [6]. The future trends of COVID-19 are unknown, and its end is uncertain. Where specific information and computing resources are available, mechanistic models can be sufficient to predict coronavirus infection patterns and to model more accurately the impact of different intervention strategies, informing decision-makers and health care workers [7].

Many studies have examined the future trends of the epidemic in various regions, such as Nigeria [8], Iran [9], and the USA [10]; see also [11]. These papers used traditional methods, such as ARIMA and regression models, to forecast the pandemic’s future behavior. Such models face the problem of overfitting, especially where covariates are present [12]. Time series analysis is regularly used for forecasting, which requires information on seasonality and trend in addition to regression components. With small data sets, estimating these components with adequate precision is difficult using traditional time series methods.

Bayesian structural time series (BSTS) models are a viable alternative because they have a number of attractive features [13]. These models can accommodate a large number of covariates and properly reflect stochastic behavior by allowing model parameters to vary over time [14, 15]. The Bayesian approach allows prior beliefs to be placed on the parameters, which is an advantage over the classical approach. The method is more transparent than ARIMA models and handles uncertainty in a more elegant manner. It is more transparent because its representation does not rely on differencing, lags, and moving averages. These models could be useful in setting public health priorities and in developing and implementing policies to address and prevent adverse health conditions [16]. They have already been used to predict the health consequences of alcohol consumption [17] and to predict the adverse effects of local alcohol licensing policies on health and crime rates [18]. These models also make it possible to select the relevant variables through spike-and-slab priors [19]. The BSTS models predicted future health consequences of alcohol consumption better than ARIMA: although the training dataset contained only eight data points, they produced a reasonably accurate 1-5 year forecast [17].

BSTS models, in a nutshell, are stochastic state-space models that can examine trend, seasonality, and regression components individually. Spike-and-slab priors are used in these models to select appropriate covariates, and the final projections are produced using Bayesian model averaging. The estimates from these models depend only weakly on particular assumptions. The model parameters are weighted according to their inclusion probabilities. A linear regression component is not required in these models, since they select the most informative parameters. These models improve the representation of forecast uncertainty and of change over time [17]. Computing the Bayesian posterior distribution analytically is, however, rather complex. As a result, the numerical computations are carried out using Markov Chain Monte Carlo (MCMC) methods such as Gibbs sampling [20]. The Bayesian structural time series (BSTS) models [21] are implemented with the bsts package in R.
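To make the idea of Gibbs sampling concrete, the following minimal Python sketch alternates draws from two conditional distributions and averages the retained draws. It is not the sampler used by the bsts package: the toy model (independent normal observations with unknown mean and variance), the conjugate priors, the data values, and the function name `gibbs_normal` are all assumptions made purely for illustration.

```python
import random


def gibbs_normal(y, n_draws=2000, burn_in=500, seed=0,
                 m0=0.0, s0_sq=1e6, a0=0.01, b0=0.01):
    """Gibbs sampler for y_i ~ N(mu, sigma2) with assumed priors
    mu ~ N(m0, s0_sq) and sigma2 ~ Inverse-Gamma(a0, b0)."""
    rng = random.Random(seed)
    n = len(y)
    y_bar = sum(y) / n
    mu, sigma2 = y_bar, 1.0  # starting values
    mu_draws, sigma2_draws = [], []
    for i in range(burn_in + n_draws):
        # mu | sigma2, y is normal (conjugate update)
        precision = 1.0 / s0_sq + n / sigma2
        mean = (m0 / s0_sq + n * y_bar / sigma2) / precision
        mu = rng.gauss(mean, (1.0 / precision) ** 0.5)
        # sigma2 | mu, y is inverse-gamma: draw a gamma variate and invert it
        shape = a0 + n / 2.0
        rate = b0 + 0.5 * sum((v - mu) ** 2 for v in y)
        sigma2 = 1.0 / rng.gammavariate(shape, 1.0 / rate)
        if i >= burn_in:  # discard the burn-in draws
            mu_draws.append(mu)
            sigma2_draws.append(sigma2)
    # posterior means are estimated by averaging the retained draws
    return sum(mu_draws) / n_draws, sum(sigma2_draws) / n_draws


# Made-up data with sample mean 5.0
data = [4.8, 5.1, 5.3, 4.9, 5.2, 5.0, 4.7, 5.0]
mu_hat, sigma2_hat = gibbs_normal(data)
```

With the vague priors assumed here, the averaged draws of `mu` should lie close to the sample mean of the data. A BSTS sampler follows the same alternating pattern but draws the latent states and the variance parameters in turn.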

In response to the severe COVID-19 epidemic, global vaccine development efforts have intensified. Even with limited protection against infection, vaccination can have a significant influence on preventing COVID-19 outbreaks in the United States [22]. Evaluating the effects of vaccination in the target countries is therefore of considerable interest. Intervention analysis can be used to investigate the causal impact of vaccination in these nations. Causal impact analysis based on Bayesian structural time series models is used to assess the effect of an intervention in the post-intervention period. In contrast to traditional ARIMA models, these models provide both absolute and relative effects, and they perform better than conventional models owing to their sequential approach and to additional benefits such as the use of prior information and complex covariate structures [23].

The first goal of this study is to develop BSTS models for examining COVID-19’s future trends and to compare their predictive power with that of the most commonly used ARIMA models. The study investigates the temporal dynamics of COVID-19 in five affected countries: the United States, the United Kingdom, the United Arab Emirates, Bahrain, and India. We also examined the causal effects of vaccination in these nations. To this end, we used BSTS models both for forecasting and for intervention analysis. Compared with ARIMA models, the BSTS models showed a higher level of accuracy. The suggested methods can be used to examine these trends in any other country.

2 Methodology

Data on the cumulative number of positive cases, the cumulative number of deaths, and the cumulative number of vaccine doses administered were compiled by Our World in Data and obtained from the Humanitarian Data Exchange website. These sources produce and update statistics for many nations on a regular schedule. The data have been released for analysis, so the study requires no further permission from authorities.

A time series intervention analysis can be conducted using BSTS models. These methods estimate the difference between the observed time series and the series that would have been observed had the intervention not occurred. The causal effect of vaccination is examined in three steps. First, the BSTS model is estimated using data up to the intervention date (the date on which vaccination started; here we take February 15, 2021 as this date for all five countries). Next, the estimated model is used to forecast the vaccination period under the assumption that the intervention did not take place (that is, without vaccination). Finally, the difference between the predicted and observed data during the vaccination period is evaluated to determine the causal effect of vaccination.
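The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a least-squares straight line fitted to the pre-intervention data stands in for the BSTS model, the series is synthetic, and the names `fit_line` and `counterfactual_effect` are hypothetical.

```python
def fit_line(y):
    """Least-squares line through (0, y[0]), (1, y[1]), ...; returns (intercept, slope)."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


def counterfactual_effect(series, intervention_index):
    """Fit on the pre-period, forecast the post-period, compare with the observations."""
    pre = series[:intervention_index]
    post = series[intervention_index:]
    # Step 1: estimate the model on pre-intervention data only
    # (a straight line here, as a stand-in for BSTS)
    intercept, slope = fit_line(pre)
    # Step 2: forecast the post-period as if no intervention had occurred
    predicted = [intercept + slope * t
                 for t in range(intervention_index, len(series))]
    # Step 3: compare observed and predicted values
    mean_observed = sum(post) / len(post)
    mean_predicted = sum(predicted) / len(predicted)
    absolute = mean_observed - mean_predicted  # average absolute effect
    relative = absolute / mean_predicted       # average relative effect
    return predicted, absolute, relative


# Synthetic series: grows by 10 per step before the intervention, by 8 after it
series = [100 + 10 * t for t in range(10)] + [190 + 8 * k for k in range(1, 6)]
predicted, absolute, relative = counterfactual_effect(series, 10)
```

A negative `absolute` value means the observed series lies below the no-intervention forecast. A full BSTS analysis would additionally report posterior intervals for both effects.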

The BSTS models were used to generate forecasts for the various parameters of the pandemic (cases, deaths, and vaccinations). These models incorporate the Bayesian approach: the likelihood function (the current data) is combined with prior information (such as expert opinion) to update existing knowledge and obtain the final Bayesian models, expressed as posterior distributions. The models employ Bayesian model averaging and Kalman filtering to generate more accurate forecasts [16]. Because of their complexity, closed-form estimators of the model parameters are not available [21]. We therefore estimated the model parameters numerically in R using the Markov Chain Monte Carlo (MCMC) approach, which uses conditional distributions to draw random samples of the model parameters and then averages the draws to obtain the final estimates. The Ljung-Box test was used to perform diagnostic checks on the models. Several forecast accuracy measures, namely root mean square error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE), were used to compare the forecast accuracy of the BSTS models with that of the most commonly used ARIMA models. Because the BSTS models showed better forecast accuracy, we report projections for the pandemic parameters using BSTS models only. To assess the effect of vaccination in the aforementioned countries, we treated vaccination as an intervention and conducted an intervention analysis using BSTS models. The numerical results for the causal impacts were obtained using the R package CausalImpact.
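For reference, the three accuracy measures and the Ljung-Box statistic can be written out as below. This is a plain-Python sketch of the standard textbook definitions, not the code used in the study. The function names and sample numbers are illustrative assumptions, and the Ljung-Box function returns only the Q statistic, not the chi-square p-value.

```python
import math


def rmse(actual, forecast):
    """Root mean square error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def ljung_box_q(residuals, lags):
    """Ljung-Box statistic Q = n (n + 2) * sum_{k=1..lags} rho_k^2 / (n - k)."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)
    total = 0.0
    for k in range(1, lags + 1):
        # sample autocorrelation of the residuals at lag k
        rho_k = sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


# Made-up observed values and forecasts
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 380.0]
scores = (rmse(actual, forecast), mae(actual, forecast), mape(actual, forecast))
```

Lower values of all three measures indicate better forecasts. A large Q statistic relative to the chi-square reference distribution suggests that the residuals are still autocorrelated.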

3 Bayesian Structural Time-series Models

In the basic structural model, the time series is decomposed into four components: a level, a local trend, seasonal effects, and an error term. A structural time series model can be defined by a pair of equations [24]:

$$\begin{aligned} y_{t}= & {} Z_{t}^{\mathrm {T}} \alpha _{t}+\varepsilon _{t} \end{aligned}$$
(1)
$$\begin{aligned} \alpha _{t+1}= & {} T_{t} \alpha _{t}+R_{t} \eta _{t}, \end{aligned}$$
(2)

Equation (1) is the observation equation, which ties the observed data \(y_{t}\) to the state vector \(\alpha _{t}\), and Eq. (2) is the state equation, which describes how the state vector evolves over time. Here \(\varepsilon _{t} \sim \mathcal {N}\left( 0, \sigma _{t}^{2}\right) \) and \(\eta _{t} \sim \mathcal {N}\left( 0, Q_{t}\right) \) are the observation error and the system error, respectively, and are independent of all other unknowns. The output vector, transition matrix, control matrix, and state-diffusion matrix are denoted by \(Z_t\), \(T_t\), \(R_t\), and \(Q_t\), respectively.

3.1 Local Level Model

The local level model is the simplest structural time series model. It assumes that the trend is a random walk:

$$\begin{aligned} y_{t}= & {} \alpha _{t}+\epsilon _{t} \quad \epsilon _{t} \sim N\left( 0, \sigma _{\epsilon }^{2}\right) \end{aligned}$$
(3)
$$\begin{aligned} \alpha _{t+1}= & {} \alpha _{t}+\eta _{t} \quad \eta _{t} \sim N\left( 0, \sigma _{\eta }^{2}\right) \end{aligned}$$
(4)

In the local level model, the matrices \(Z_{t}\), \(T_{t}\) and \(R_{t}\) in Eqs. (1) and (2) collapse to the scalar value 1. The parameters of the model are the variances of the error terms \(\left( \sigma _{\epsilon }^{2}, \sigma _{\eta }^{2}\right) \).
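As an illustration of Eqs. (3) and (4), the sketch below simulates a local level series and recovers the level with a scalar Kalman filter. It treats the two variances as known, whereas in the BSTS setting they are estimated. The function names, the near-diffuse initial variance `p0`, and the parameter values are assumptions chosen for the example.

```python
import random


def simulate_local_level(n, sigma_eps, sigma_eta, alpha0=0.0, seed=0):
    """Simulate y_t = alpha_t + eps_t, alpha_{t+1} = alpha_t + eta_t."""
    rng = random.Random(seed)
    alpha, ys = alpha0, []
    for _ in range(n):
        ys.append(alpha + rng.gauss(0.0, sigma_eps))  # observation, Eq. (3)
        alpha = alpha + rng.gauss(0.0, sigma_eta)     # state update, Eq. (4)
    return ys


def kalman_filter_local_level(ys, var_eps, var_eta, a0=0.0, p0=1e9):
    """Return the filtered estimates of alpha_t given y_1..y_t."""
    a, p = a0, p0  # predicted state mean and variance (p0 large: vague start)
    filtered = []
    for y in ys:
        f = p + var_eps      # variance of the one-step-ahead prediction error
        k = p / f            # Kalman gain
        a = a + k * (y - a)  # filtered state mean
        p = p * var_eps / f  # filtered state variance
        filtered.append(a)
        p = p + var_eta      # predict the next state; the mean is unchanged (T_t = 1)
    return filtered


ys = simulate_local_level(50, sigma_eps=1.0, sigma_eta=0.5)
levels = kalman_filter_local_level(ys, var_eps=1.0, var_eta=0.25)
```

When `var_eta` is zero the level is constant and the filter reduces to a running mean of the observations. Larger values of `var_eta` make the filtered level follow recent observations more closely.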

3.2 Local Linear Trend Model

The local linear trend model assumes that both the mean and the slope follow random walks. The equations for the observation and the mean are as follows:

$$\begin{aligned} y_{t}= & {} \mu _{t}+\epsilon _{t} \quad \epsilon _{t} \sim N\left( 0, \sigma _{\epsilon }^{2}\right) \end{aligned}$$
(5)
$$\begin{aligned} \mu _{t+1}= & {} \mu _{t}+\delta _{t}+\eta _{t} \quad \eta _{t} \sim N\left( 0, \sigma _{\eta }^{2}\right) \end{aligned}$$
(6)

and the equation of the slope is:

$$\begin{aligned} \delta _{t+1}=\delta _{t}+\zeta _{t} \quad \zeta _{t} \sim N\left( 0, \sigma _{\zeta }^{2}\right) \end{aligned}$$
(7)

Because it quickly adapts to local variability, the local linear trend model is a common choice for modeling trends. This is useful when making short-term forecasts. When making longer-term forecasts, this kind of flexibility may be undesirable, as such predictions frequently have implausibly large uncertainty intervals.
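To connect this model with the general form, Eqs. (5) to (7) can be cast into the state-space form of Eqs. (1) and (2). The following is a worked sketch under the assumption that the noise terms \(\epsilon _{t}\), \(\eta _{t}\) and \(\zeta _{t}\) are mutually independent. The state vector collects the mean and the slope, and the system error vector is \((\eta _{t}, \zeta _{t})^{\mathrm {T}}\):

$$\begin{aligned} \alpha _{t}=\begin{pmatrix} \mu _{t} \\ \delta _{t} \end{pmatrix}, \quad Z_{t}=\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad T_{t}=\begin{pmatrix} 1 &{} 1 \\ 0 &{} 1 \end{pmatrix}, \quad R_{t}=\begin{pmatrix} 1 &{} 0 \\ 0 &{} 1 \end{pmatrix}, \quad Q_{t}=\begin{pmatrix} \sigma _{\eta }^{2} &{} 0 \\ 0 &{} \sigma _{\zeta }^{2} \end{pmatrix} \end{aligned}$$

Iterating Eqs. (6) and (7) h steps ahead from known \(\mu _{t}\) and \(\delta _{t}\) gives

$$\begin{aligned} \mu _{t+h}= & {} \mu _{t}+h \delta _{t}+\sum _{j=0}^{h-1} \eta _{t+j}+\sum _{j=0}^{h-2}(h-1-j) \zeta _{t+j} \\ {\text {Var}}\left( \mu _{t+h} \mid \mu _{t}, \delta _{t}\right)= & {} h \sigma _{\eta }^{2}+\frac{(h-1) h(2 h-1)}{6} \sigma _{\zeta }^{2} \end{aligned}$$

The slope noise therefore contributes a term that grows with the cube of the horizon h, which is why long-horizon intervals from this model widen so quickly.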

4 ARIMA Models

Three parameters determine the \({\text {ARIMA}}(p, d, q)\) model. The parameter p of the \({\text {AR}}(p)\) component indicates that the current value depends on its own p previous values, and the parameter q of the \({\text {MA}}(q)\) component indicates that the current deviation from the mean depends on the q previous deviations. The parameter \(\mathrm {d}\) is the order of differencing. The \({\text {ARIMA}}(p, d, q)\) model has the form [25]:

$$\begin{aligned} (1-u)^{d} Y_{t}=\alpha +\phi _{1} Y_{t-1}+\phi _{2} Y_{t-2}+\cdots +\phi _{p} Y_{t-p}+\varepsilon _{t}+\theta _{1} \varepsilon _{t-1}+\theta _{2} \varepsilon _{t-2}+\cdots +\theta _{q} \varepsilon _{t-q} \end{aligned}$$

where u is the time-shift operator, \(u^{d} Y_{t}=Y_{t-d}\). Here \(Y_t\) is the cumulative number of reported COVID-19 cases, and the first difference \(\Delta Y_t=Y_t-Y_{t-1}\) represents the daily number of infections. The Akaike Information Criterion (AIC) was used to determine the final values of p, d, and q. All calculations were performed with the open-source software R.
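The differencing step and the AIC comparison can be illustrated with a short Python sketch. The case counts and the log-likelihood values below are made-up numbers, not results from the study. The parameter count p + q + 1 (AR and MA coefficients plus the innovation variance) is one common convention assumed here, and the names `difference` and `aic` are hypothetical.

```python
def difference(series, d=1):
    """Apply the first-difference operator d times."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood


# Made-up cumulative counts Y_t; the first difference gives daily counts
cumulative_cases = [100, 130, 170, 220, 280]
daily_cases = difference(cumulative_cases, d=1)

# Hypothetical candidate orders (p, d, q) with made-up maximized log-likelihoods
candidates = {(1, 1, 0): -120.4, (1, 1, 1): -118.9, (2, 1, 2): -118.5}
scores = {order: aic(ll, order[0] + order[2] + 1)
          for order, ll in candidates.items()}
best_order = min(scores, key=scores.get)
```

The order with the smallest AIC is retained. The penalty term 2k keeps a richer model from being chosen on the strength of a marginally higher likelihood alone.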

5 Results and Discussions

We carried out this research to examine COVID-19’s future behavior in five affected nations, as well as the causal effect of vaccination in these countries. As of June 30, 2021, the United States of America had 33,664,970 confirmed cases, 604,714 deaths, and 326,521,526 vaccinations. The UK had 4,817,298 confirmed cases, 128,404 deaths, and 77,592,212 vaccinations. The UAE had 632,907 confirmed cases, 1,811 deaths, and 15,362,342 vaccinations, and Bahrain had 265,827 confirmed cases, 1,352 deaths, and 1,057,888 vaccinations. With 30,411,634 confirmed cases, 399,459 deaths, and 271,410,023 vaccinations, India was facing a major outbreak.

With these data in hand, we compared the forecast accuracy of the proposed BSTS models with that of the more commonly used ARIMA models, using the forecast accuracy measures RMSE, MAE, and MAPE. The comparison is shown in Table 1. Table 2 and Figs. 1 to 15 provide the projections for the following sixty days, including the projected numbers of cases, deaths, and total vaccinations. Finally, the causal effects of vaccination are summarized in Table 3 and Figs. 16 to 25.

According to the results in Table 1, the BSTS models delivered more accurate forecasts than the ARIMA models. There are a few exceptions, which could be related to the data’s unpredictable behavior. For projecting COVID-19 trends in these nations, BSTS models have therefore proven to be a suitable alternative to ARIMA models, and only the BSTS forecasts are presented in full.

Table 1 Measures for forecast accuracy for BSTS and ARIMA models

Table 2 presents the two-month projections for the five countries. In the United States, the cumulative numbers of positive cases and deaths were predicted to grow by 1.17% and 2.7%, respectively, by August 30, 2021, and the share of the population that has received at least one dose of vaccine by 12.5 percentage points. As of June 30, 54% of the US population had received at least one dose of the COVID-19 vaccine, and 46.61% was fully vaccinated. Our projection is that nearly 66.5% of the population will have had at least one dose by August 30, and that the number of deaths and cases in the United States will be correspondingly lower. If people become fully vaccinated and take precautions, the coronavirus could be brought under control in the United States in the coming months. In the UK, the predicted rise in the cumulative numbers of positive cases and deaths is 19.4% and 3.5%, respectively, and the share of the population that has received at least one dose of vaccine is predicted to rise by 17 percentage points. As of June 30, 66% of the UK population had received at least one dose of the COVID-19 vaccine and 48% was fully vaccinated, and we expect approximately 83% to have had at least one dose by August 30. The increased vaccination rate helps the UK to control daily COVID-19 deaths, but owing to a lack of sufficient measures, the UK continues to have a substantial number of daily cases.

Similarly, the cumulative numbers of positive cases, deaths, and vaccinations in the UAE are predicted to grow by 15.5%, 15.8%, and 25.2%, respectively. As of June 30, 74% of the UAE population had received at least one dose of the COVID-19 vaccine, and 64% were fully vaccinated. This rapid vaccination keeps daily deaths in the UAE to a minimum and daily cases under control.

In Bahrain, the cumulative numbers of positive cases and deaths are expected to increase by 13.8% and 9.4%, respectively, and the share of the population that has received at least one dose of vaccine by 14 percentage points. As of June 30, 62.17% of Bahrain’s population had received at least one dose of the COVID-19 vaccine and 58% had been fully vaccinated, and we predict that by August 30 about 76% of the population will have had at least one dose. If Bahrain accelerates its immunization program, it will be able to reduce daily COVID-19-related deaths and cases.

In India, the cumulative numbers of positive cases and deaths are expected to rise by 8% and 14.8%, respectively, and the share of the population that has received at least one dose of vaccine by 15.3 percentage points. As of June 30, 15.48% of India’s population had received at least one dose of the COVID-19 vaccine, while 19.66% had been fully vaccinated, and we predict that by August 30 about 31% of the population will have had at least one dose.

The next stage was to examine the role of vaccination in the development of cumulative cases and deaths in the five countries. Vaccination began on December 8, 2020 in the United Kingdom, on December 14, 2020 in the United States and the UAE, on December 22, 2020 in Bahrain, and on January 16, 2021 in India. February 15, 2021 was used as the intervention date in the BSTS intervention analysis. We compared the observed figures with what would have been expected had these countries not vaccinated their citizens. The validity of the findings was assessed using the posterior probabilities and the probabilities of causal effects. The outcomes are presented in Table 3 and Figs. 16 to 25. As can be seen, the posterior probabilities that these effects arose by chance are very low, whereas the probabilities of causal effects are high. This indicates that the estimated causal effects of vaccination in the five countries are statistically meaningful. Vaccination reduced the number of cases by 9.7%, 10%, and 12.3% in the United States, the United Kingdom, and the United Arab Emirates, respectively. Likewise, deaths declined by 17.9%, 7.7%, and 3.8% in the United States, the United Kingdom, and the United Arab Emirates, respectively. These countries may therefore have benefited from a high immunization rate in their populations. In Bahrain, by contrast, vaccination had no evident impact on cases and deaths. In India, where the vaccination rate was low (only about 15% of the population had received the vaccine by June 30), vaccination likewise had no visible impact on cases and deaths: the observed cumulative numbers of cases and deaths are significantly higher than those predicted for the vaccination period. India must therefore increase the speed at which vaccines are administered so that people can return to their normal lives.

Table 2 Summary of projections and predicted resource requirements after two months (on August 30, 2021)
Table 3 Vaccination’s impact in five nations
Fig. 1 Case forecast USA

Fig. 2 Death forecast USA

Fig. 3 Vaccination forecast USA

Fig. 4 Case forecast UK

Fig. 5 Death forecast UK

Fig. 6 Vaccination forecast UK

Fig. 7 Case forecast UAE

Fig. 8 Death forecast UAE

Fig. 9 Vaccination forecast UAE

Fig. 10 Case forecast Bahrain

Fig. 11 Death forecast Bahrain

Fig. 12 Vaccination forecast Bahrain

Fig. 13 Case forecast India

Fig. 14 Death forecast India

Fig. 15 Vaccination forecast India

Fig. 16 Causal impact of number of cumulative cases in USA

Fig. 17 Causal impact of number of cumulative deaths in USA

Fig. 18 Causal impact of number of cumulative cases in UK

Fig. 19 Causal impact of number of cumulative deaths in UK

Fig. 20 Causal impact of number of cumulative cases in UAE

Fig. 21 Causal impact of number of cumulative deaths in UAE

Fig. 22 Causal impact of number of cumulative cases in Bahrain

Fig. 23 Causal impact of number of cumulative deaths in Bahrain

Fig. 24 Causal impact of number of cumulative cases in India

Fig. 25 Causal impact of number of cumulative deaths in India

6 Conclusion

According to our literature survey, no previous research has decomposed COVID-19 trends into separate components in order to study their changing behavior. The BSTS models disaggregate the COVID-19 trends into various components, which is an important aspect of this study. The proposed method also permits the coefficients to vary over time, allowing the data-generating process to be captured better. We showed that BSTS models could help with early preparation, prioritization, and distribution of healthcare resources to mitigate the effects of COVID-19 in the nations studied. Furthermore, the causal effects of vaccination have been studied. With a few exceptions, the study’s findings imply that the forecasting accuracy of the proposed models is superior to that of the commonly used ARIMA models. Among these countries, the percentage increase in the number of cases is predicted to be highest in the UK; the percentage rise in the total number of cases is ranked in the following order: UK > UAE > Bahrain > India > USA. The rate of increase in the number of deaths is predicted to be greatest in India; the percentage rise in the total number of deaths is ranked in the following order: India > UAE > Bahrain > UK > USA. Our research also suggests that the United States, the United Kingdom, and the United Arab Emirates have implemented successful vaccination plans that lowered the number of cases and deaths. India, on the other hand, is still struggling to control the number of deaths, owing to delayed immunization or its large population, and may need to rethink its immunization strategy. Through effective and rapid vaccination, the United States and the United Kingdom have reduced mortality. The situation in India, however, may become more problematic during the following sixty days. We believe these findings will assist these countries in efficiently prioritizing, devising, and implementing policies to prevent the pandemic’s expected consequences.

This study also has several limitations. We presume that the information obtained is correct; nevertheless, because not all patients are admitted to clinics and some are asymptomatic, cases may be underreported. No risk factors have been evaluated, owing to a lack of corresponding data. Although the BSTS models gave better projections than the ARIMA models, the precision of these forecasts may be limited by the data’s inherent uncertainty. However, the study’s goal is not to produce 100 percent accurate projections but rather to provide key signals to stakeholders so that they may organize their strategies accordingly.