In March 2020, the World Health Organization declared the outbreak of the new SARS-CoV-2 virus a pandemic; the virus has infected millions of people worldwide and caused hundreds of thousands of fatalities. Understanding why SARS-CoV-2 spreads selectively, i.e., why the virus strongly hits some parts of the world, with large numbers of infections and fatalities, while other regions are spared, with far fewer infections and fatalities, is of fundamental importance for implementing government-level strategies to counter and contain any possible outbreak. Recent studies (Liu et al. 2020; Lolli et al. 2020) have already highlighted how meteorological variables, e.g., temperature and humidity, can affect COVID-19 transmission. In this study, we assessed how atmospheric particulate matter and tropospheric ozone concentration affect COVID-19 transmission in four major metropolitan areas in Italy. We collected the main meteorological and air-pollution variables from 1 February 2020 to 31 May 2020 in Milan, Trento, Florence, and Rome. We tested the non-linear Kendall and Spearman correlations between those parameters and the residual number of patients hospitalized in the intensive care unit (ICU), as in Lolli et al. (2020). The number of ICU patients is a much stronger indicator of COVID-19 transmission because it is independent of the number of nasopharyngeal swabs performed. In agreement with Lolli et al. (2020), we consider a delay of 19 days between the infection and the development of the acute respiratory distress syndrome (ARDS) that requires hospitalization of the patient in the ICU in critical condition. For this reason, both the meteorological and the air-pollution data are shifted back in time by 19 days, as in Lolli et al. (2020).
This means that the daily number of ICU patients from 24 February 2020 to 14 June 2020 is based on infections that occurred from 5 February 2020 to 26 May 2020. The per-day ICU cases are modeled with a Gaussian mixture model (GMM), which appears to better represent COVID-19 behavior in terms of infections and ICU hospitalizations, as reported in Singhal et al. (2020). In this study, we adopted the Bi-Gaussian model. This choice is corroborated by tests showing that a single Gaussian is inadequate for modeling the epidemiological trend, while mixtures with more than three terms overfit the data, canceling all the valuable information. The number of ICU patients follows different trends over time: in the early phase it grows exponentially up to a plateau, which is followed by an exponential drop in the late phase. The symmetry of the curve depends strictly, among other variables, on the lockdown policies implemented at government level. For this reason, the correlation analysis would give very different results if applied to a different time period, i.e., the results of the Spearman and Kendall rank tests during the growth phase would be completely different from those during the drop phase. To make the analysis independent of those issues, we consider instead the per-day residual number of ICU patients with respect to the GMM, which is extrapolated from the data trend. The model should account for the natural tendency of the viral epidemic and for the effect of the lockdown on it. Thus, the residual analysis (i.e., the analysis of the differences between the GMM and the observed cases) should protect against spurious correlations between the above-mentioned effects and the parameters under analysis.
Indeed, the atmospheric parameters considered here change quickly (sometimes from day to day), so they act as a source of divergence (the residual) from the model and characterize the anomaly with respect to the classical behavior that the model describes.
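The residual construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the six Bi-Gaussian parameters and the synthetic ICU counts below are hypothetical, the function names are not from the study, and the fitting step itself (which would normally be a least-squares fit of the parameters to the observed counts) is omitted. The sign of the residual follows the Fig. 1 caption (model minus observed ICU patients), and the 19-day lag is the one stated in the text.

```python
import math
from datetime import date, timedelta

LAG_DAYS = 19  # infection-to-ICU delay stated in the text


def bi_gaussian(t, a1, mu1, sigma1, a2, mu2, sigma2):
    """Sum of two Gaussian terms evaluated at day index t."""
    g1 = a1 * math.exp(-0.5 * ((t - mu1) / sigma1) ** 2)
    g2 = a2 * math.exp(-0.5 * ((t - mu2) / sigma2) ** 2)
    return g1 + g2


def icu_residuals(observed, params):
    """Per-day residuals, taken as model minus observed ICU patients."""
    return [bi_gaussian(t, *params) - y for t, y in enumerate(observed)]


def infection_date(icu_date, lag_days=LAG_DAYS):
    """Date of the atmospheric data paired with a given ICU date."""
    return icu_date - timedelta(days=lag_days)


# Hypothetical parameters and synthetic counts, for illustration only.
params = (100.0, 30.0, 8.0, 60.0, 50.0, 15.0)
model = [bi_gaussian(t, *params) for t in range(100)]
observed = [m + (3.0 if t % 2 == 0 else -3.0) for t, m in enumerate(model)]
res = icu_residuals(observed, params)

first_day = infection_date(date(2020, 2, 24))
last_day = infection_date(date(2020, 6, 14))
```

With these definitions, `first_day` and `last_day` reproduce the infection window quoted in the text (5 February 2020 to 26 May 2020), and `res` holds the day-to-day departures from the smooth trend that are later correlated with the atmospheric variables.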

1 Results

In Fig. 1, we show the model, the per-day number of ICU hospitalized patients for Milan, Trento, Florence, and Rome, and the corresponding residuals.

Fig. 1

ICU-admitted patients fitted by a Bi-Gaussian function (red line) extrapolated from the observed data (black dots). The residuals (GMM minus ICU patients), shown in blue, are used to investigate the correlation with the meteorological and air-pollution variables.

The correlations between COVID-19 transmission and the meteorological and air-pollution variables were investigated using the non-parametric Spearman and Kendall rank correlation tests. The Spearman rank correlation coefficient rs is (Lolli et al. 2020):

$$ {r}_s=1-\frac{6\times {\sum}_i{d}_i^2}{n\left({n}^2-1\right)}, $$
(1)

where di is the difference between the ranks of the two parameters for the i-th observation, and n is the number of observations. Equation (2) gives the Kendall rank correlation coefficient τ:

$$ \tau =\frac{concor-discor}{0.5\times n\times \left(n-1\right)}, $$
(2)

where concor is the number of concordant pairs, discor is the number of discordant pairs, and n is the number of observations, so that 0.5 × n × (n − 1) is the total number of pairs. Values of rs and τ equal to +1 and −1 imply a perfect positive and a perfect negative correlation, respectively. For temperature, we analyzed the non-linear correlation of the ICU residuals with the daily maximum temperature (Tmax), the daily average temperature (Tavg), and the daily minimum temperature (Tmin). For humidity, the correlation was tested for the maximum, average, and minimum dew point (DP) temperature, denoted as DPmax, DPavg, and DPmin, respectively. Moreover, the water vapor concentration (WV, in g kg−1) and the absolute humidity (AH, in g m−3), derived through the Clausius-Clapeyron equation (Qi et al. 2020), are considered. These can be described through the following equations:

$$ WV=6.22\times RH\times \frac{6.112\times \exp \left(\frac{17.67\times T}{243.5+T}\right)}{P}, $$
(3)
$$ AH=2.1674\times RH\times \frac{6.112\times \exp \left(\frac{17.67\times T}{243.5+T}\right)}{273.15+T}, $$
(4)

where RH is the daily averaged relative humidity (in %), T is the daily averaged temperature (in °C), and P is the daily averaged atmospheric pressure (in hPa). As for the air-pollution parameters, we tested the correlations for the fine particulate matter (PM2.5) and the ozone (O3) concentrations. The meteorological data are publicly available at https://wunderground.com, while the air-pollution data, i.e., PM2.5 and O3 daily averaged concentrations, are freely available (or available on request for Milan and the Lombardy region) from the websites of the regional environmental protection agencies.
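As a quick numerical check of Eqs. (3) and (4), the two humidity quantities can be computed directly from daily averages. The sketch below assumes that RH is given in percent, T in °C, and P in hPa, which is what the constants in the equations imply; the function names are illustrative and not taken from the study.

```python
import math


def saturation_vapor_pressure(t_celsius):
    """Shared factor of Eqs. (3) and (4): 6.112 * exp(17.67 T / (243.5 + T))."""
    return 6.112 * math.exp(17.67 * t_celsius / (243.5 + t_celsius))


def water_vapor(rh_percent, t_celsius, p_hpa):
    """Eq. (3): water vapor concentration WV in g/kg."""
    return 6.22 * rh_percent * saturation_vapor_pressure(t_celsius) / p_hpa


def absolute_humidity(rh_percent, t_celsius):
    """Eq. (4): absolute humidity AH in g/m^3."""
    return 2.1674 * rh_percent * saturation_vapor_pressure(t_celsius) / (273.15 + t_celsius)


# Example inputs (assumed values): saturated air at 0 degrees C and 1000 hPa.
wv = water_vapor(100.0, 0.0, 1000.0)
ah = absolute_humidity(100.0, 0.0)
```

For these example inputs, `wv` is about 3.80 g kg−1 and `ah` is about 4.85 g m−3. Both quantities scale linearly with RH at fixed temperature.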

The results, reported in Table 1, show that the ozone concentration, the temperature, and the humidity (except for Florence) are strongly negatively correlated with COVID-19 transmission in the analyzed metropolitan areas. For the PM2.5 concentrations, instead, a positive correlation is found in all the analyzed areas. Table 2 reports the significance of the correlations through the p value. In other words, the correlations are gauged against a “null” hypothesis, i.e., by computing the probability that totally uncorrelated dynamics would generate ranks with a Spearman or Kendall correlation at least as high as the one computed from the actual residuals. In this work, we consider a correlation significant when the p value is less than 0.01.
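The two coefficients of Eqs. (1) and (2) and the null-hypothesis reasoning behind the p value can be illustrated with a short, self-contained sketch. It makes several assumptions that are not from the study: the data contain no tied values (as Eq. (1) requires), the p value is estimated with a two-sided permutation test, which is one possible way to realize the "totally uncorrelated" null and not necessarily the procedure used for Table 2, and the input series are synthetic.

```python
import random


def ranks(values):
    """Ranks from 1 to n; assumes no tied values, as Eq. (1) does."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rs(x, y):
    """Eq. (1): rs = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def kendall_tau(x, y):
    """Eq. (2): (concordant - discordant) / (0.5 * n * (n - 1))."""
    n = len(x)
    concor = discor = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concor += 1
            elif s < 0:
                discor += 1
    return (concor - discor) / (0.5 * n * (n - 1))


def permutation_p_value(stat, x, y, n_perm=2000, seed=0):
    """Share of random shufflings of y whose |stat| reaches the observed one."""
    rng = random.Random(seed)
    observed = abs(stat(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(stat(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Synthetic series, for illustration only.
x = list(range(1, 11))
y_up = [v ** 2 for v in x]    # monotonically increasing with x
y_down = [-v for v in y_up]   # monotonically decreasing with x
p_up = permutation_p_value(spearman_rs, x, y_up)
```

A monotonic but non-linear relation such as `y_up` still yields rs = τ = +1, which is why rank tests suit non-linear dependence, and its permutation p value falls below the 0.01 threshold used in this work.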

Table 1 Analysis of meteorological and air-pollution parameters. Temperature and ozone correlate significantly with the residual number of ICU patients for the metropolitan areas; ns, correlation not statistically significant
Table 2 Statistical significance of the correlation analysis. The correlation is gauged against a “null” hypothesis. We considered the correlations statistically significant when the p value is less than 0.01. Non-significant values are in italics

2 Discussion

To the best of our knowledge, no previous study has demonstrated a strong and clear negative correlation between ozone concentration and COVID-19 transmission. In contrast, positive correlations with atmospheric particulate matter have already been reported (Di Girolamo 2020). The ozone result can be regarded as a secondary factor that helps explain the differential virus transmission in different parts of the world. Dubuis et al. (2020) corroborate this speculation, as their findings suggest that low-concentration ozone is a powerful disinfectant for airborne viruses, as is higher air humidity. Admittedly, the ozone concentrations in their work are much higher than those considered here, but in the atmosphere the exposure to ozone lasts much longer. Moreover, O3 production in the troposphere is strongly linked to sunlight and to pollutants, i.e., precursors such as NO2. Conversely, the presence of black carbon in polluted metropolitan areas inhibits ozone production in the boundary layer (Li et al. 2005). All those factors could partially explain the differential transmission. The results highlight that the ozone concentration should be considered a co-factor in COVID-19 transmission, while the epidemiological aspects remain of paramount importance and clearly play the primary role.