Introduction

Increasingly frequent and severe wildfires associated with climate change release vast quantities of smoke into the atmosphere1, generating plumes that travel thousands of kilometers2 and expose millions of water bodies to smoke for weeks to months40, to quantify the spatial and temporal patterns of smoke cover in California from 2006 to 2022. This product provides daily smoke plume density polygons over North America at 4 km resolution by integrating near-real-time polar-orbiting and geostationary satellite imagery from the Geostationary Operational Environmental Satellite program (GOES), the Moderate Resolution Imaging Spectroradiometer (MODIS), and the Advanced Very High Resolution Radiometer (AVHRR). It classifies smoke plumes into three categories (low, medium, and high density) based on estimated smoke concentrations of 5, 16, and 27 μg m−3, respectively.

To quantify the spatial extent and duration of smoke cover in California for each year, we made an annual composite map of smoke cover by intersecting daily smoke plume polygons, with each resulting polygon recording the number of smoke days in a given year. All areas exposed to smoke for at least one day were then summed to quantify the annual spatial extent of smoke cover. This process was repeated for each month and for each smoke density to evaluate the seasonal and interannual patterns of smoke cover extent in California. In further analyses, we focused on medium- and high-density smoke cover (hereafter ‘med-high density’) rather than low-density smoke cover because we assumed denser smoke cover would be of greater ecological relevance (e.g., more likely to reduce SW radiation fluxes and deposit particulates into lakes).
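The compositing step can be illustrated with a minimal sketch. It assumes, purely for illustration, that the daily plume polygons have already been rasterized to 4 km grid cells; the cell ids, the `smoke_day_counts` and `smoke_extent` helpers, and the three-day example are hypothetical and not part of the original workflow.

```python
from collections import Counter
from datetime import date

def smoke_day_counts(daily_cells):
    """Count smoke days per grid cell.

    daily_cells maps a date to the set of grid-cell ids covered by smoke
    on that day (a rasterized stand-in for the daily plume polygons).
    """
    counts = Counter()
    for cells in daily_cells.values():
        counts.update(cells)  # a cell gains at most one smoke day per date
    return counts

def smoke_extent(counts, cell_area_km2=16.0):
    """Area with at least one smoke day (assumes 4 km x 4 km cells)."""
    return sum(1 for n in counts.values() if n >= 1) * cell_area_km2

# Hypothetical three-day example
daily = {
    date(2020, 8, 1): {"a", "b"},
    date(2020, 8, 2): {"b", "c"},
    date(2020, 8, 3): {"b"},
}
counts = smoke_day_counts(daily)
extent = smoke_extent(counts)
```

Restricting `daily` to the dates of a single month, or to plumes of a single density class, gives the monthly and per-density composites in the same way.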

We assessed time series of the maximum extent of med-high density smoke cover in the months June-October, as well as annual and seasonal means, for monotonic trends by computing Sen’s slopes and applying the Mann-Kendall test using the ‘wql’ package in R41.
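For readers unfamiliar with these trend statistics, the following sketch computes Sen's slope and the Mann-Kendall S statistic with a two-sided p-value. It is an illustration under simplifying assumptions (evenly spaced observations, normal approximation, no correction for ties) and is not the ‘wql’ implementation; the function names and the example series are hypothetical.

```python
import math
from itertools import combinations
from statistics import median

def sens_slope(y):
    """Median of all pairwise slopes (y[j] - y[i]) / (j - i), i < j."""
    return median((y[j] - y[i]) / (j - i)
                  for i, j in combinations(range(len(y)), 2))

def mann_kendall(y):
    """Return (S, two-sided p-value); normal approximation, ties ignored."""
    n = len(y)
    # S = number of increasing pairs minus number of decreasing pairs
    s = sum((y[j] > y[i]) - (y[j] < y[i])
            for i, j in combinations(range(n), 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)  # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, p

# Hypothetical annual series of maximum smoke extent (arbitrary units)
series = [1.0, 2.0, 4.0, 3.0, 6.0, 7.0]
slope = sens_slope(series)
s_stat, p_value = mann_kendall(series)
```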

In addition to quantifying smoke cover throughout California, we generated a daily smoke density sequence over each study lake from 2006 to 2022. First, we obtained lake shapefiles from the California Lake database maintained by the California Department of Fish and Wildlife (CDFW)42. For study sites that were not included in the California Lake database (e.g., small ponds in Sequoia National Park), we used a 100 m buffer around the central point of the lake as an approximation of the lake surface. We then assigned a daily smoke density value to each lake by comparing spatial relationships between smoke plume polygons and lake surfaces. If a smoke plume intersected a lake’s surface area on a given date, we assigned the corresponding smoke density to the lake for that date. If multiple smoke densities were assigned to the same lake on the same date, only the highest smoke density was retained.
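The rule for resolving multiple densities on the same lake-day can be sketched as follows. The sketch assumes the spatial intersection has already been done upstream and produced (lake, date, density) records; the record layout, the `DENSITY_RANK` mapping, and the example values are hypothetical.

```python
DENSITY_RANK = {"none": 0, "low": 1, "medium": 2, "high": 3}

def daily_lake_density(intersections):
    """Keep only the highest smoke density per (lake, date) pair."""
    best = {}
    for lake, day, density in intersections:
        key = (lake, day)
        if DENSITY_RANK[density] > DENSITY_RANK[best.get(key, "none")]:
            best[key] = density
    return best

# Hypothetical intersection records
records = [
    ("Castle Lake", "2021-08-05", "low"),
    ("Castle Lake", "2021-08-05", "high"),
    ("Castle Lake", "2021-08-05", "medium"),
    ("Clear Lake", "2021-08-05", "low"),
]
lake_density = daily_lake_density(records)
```

Lake-days with no intersecting plume do not appear in the result and would be treated as smoke density zero.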

Characterizing lake exposure to smoke during study period

We identified periods of smoke cover for each lake during the study years (2018, 2020, 2021) using a combination of the daily smoke density value (described in previous section), SW radiation measurements from local weather stations, PM2.5 concentrations, and visual inspection of Sentinel satellite images to confirm the presence of smoke plumes.

At each lake, we used both the remote sensing-derived smoke density values and local meteorological data to conservatively classify each day as ‘smoke’ or ‘non-smoke’. We modeled theoretical ‘clear-sky’ SW radiation (SWclear.sky) for each day using a statistical clear sky algorithm43. We then subtracted the measured daily mean SW (SWmeas) from SWclear.sky (SWdiff = SWclear.sky − SWmeas). We calculated the median value of SWdiff on days with smoke density of zero across all 9 meteorological datasets (median SWdiff = 20 W m−2). Days were then classified as smoke days if they met two conditions: (1) daily mean SW radiation was reduced by more than 20 W m−2, and (2) smoke density was medium or high.
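The two-condition classification can be written as a short sketch. The record layout (dictionaries with `sw_clear`, `sw_meas`, and `density` keys) and the example values are hypothetical; the 20 W m−2 default threshold is the median SWdiff reported above.

```python
from statistics import median

def sw_diff(record):
    """SWdiff = SWclear.sky - SWmeas, in W m-2."""
    return record["sw_clear"] - record["sw_meas"]

def baseline_threshold(records):
    """Median SWdiff over days with smoke density of zero."""
    return median(sw_diff(r) for r in records if r["density"] == "none")

def is_smoke_day(record, threshold=20.0):
    """Smoke day: SW reduced by more than the threshold AND
    smoke density is medium or high."""
    reduced = sw_diff(record) > threshold
    dense = record["density"] in ("medium", "high")
    return reduced and dense

# Hypothetical daily records for one lake
days = [
    {"sw_clear": 300.0, "sw_meas": 290.0, "density": "none"},
    {"sw_clear": 300.0, "sw_meas": 270.0, "density": "none"},
    {"sw_clear": 300.0, "sw_meas": 280.0, "density": "none"},
    {"sw_clear": 300.0, "sw_meas": 220.0, "density": "high"},
    {"sw_clear": 300.0, "sw_meas": 295.0, "density": "medium"},
    {"sw_clear": 300.0, "sw_meas": 200.0, "density": "low"},
]
threshold = baseline_threshold(days)
flags = [is_smoke_day(d, threshold) for d in days]
```

Requiring both conditions is what makes the classification conservative: a dense plume overhead without a measured SW reduction, or a cloudy day without a medium or high density plume, is not counted as a smoke day.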

For each lake-year combination, we characterized the following attributes of smoke exposure: (1) the total number of smoke days between July 1 and October 1; (2) the intermittence of smoke cover, defined as the mean, median, and maximum number of consecutive smoke days that occurred in each dataset; and (3) the cumulative reduction in SW radiation relative to clear sky values on smoke days (‘cumulative SW deficit’). We calculated cumulative SW deficit by summing SWdiff on all smoke days between July 1 and October 1. Attributes of smoke cover were only quantified between July 1 and October 1 because some datasets were incomplete outside this seasonal window.
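A minimal sketch of these exposure attributes follows. It assumes one boolean smoke flag and one SWdiff value per day within the July 1 to October 1 window. The conversion of summed daily mean SWdiff (W m−2) to energy (multiplying by 86,400 s per day) is an assumption made here to obtain J m−2 and is not stated in the text; function names and example values are hypothetical.

```python
from itertools import groupby
from statistics import mean, median

SECONDS_PER_DAY = 86_400

def smoke_runs(flags):
    """Lengths of each run of consecutive smoke days."""
    return [sum(1 for _ in grp) for is_smoke, grp in groupby(flags) if is_smoke]

def summarize_exposure(flags, sw_diffs):
    """Smoke days, run-length statistics, and cumulative SW deficit."""
    runs = smoke_runs(flags)
    deficit_w_m2 = sum(d for f, d in zip(flags, sw_diffs) if f)
    return {
        "smoke_days": sum(flags),
        "mean_run": mean(runs) if runs else 0,
        "median_run": median(runs) if runs else 0,
        "max_run": max(runs) if runs else 0,
        # assumed conversion: daily mean W m-2 -> MJ m-2
        "cumulative_sw_deficit_MJ_m2": deficit_w_m2 * SECONDS_PER_DAY / 1e6,
    }

# Hypothetical eight-day window
flags = [False, True, True, False, True, True, True, False]
sw_diffs = [5.0, 30.0, 40.0, 10.0, 25.0, 50.0, 60.0, 0.0]
summary = summarize_exposure(flags, sw_diffs)
```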

Estimating aquatic ecosystem metabolic rates

We modeled daily rates of gross primary production (GPP; mg DO L−1 d−1), ecosystem respiration (R), and net ecosystem production (NEP = GPP − R) in the surface mixed layer of our study sites from hourly DO (mg L−1), water temperature (°C), SW radiation (W m−2), and wind speed (m s−1) data using the Lake Metabolizer R package44. The metabolism models in Lake Metabolizer have been used in diverse lake types (e.g., arctic, alpine45, forested, agricultural46), and are described in detail in Winslow et al.44. A metabolism model was fitted to each DO time series using the following equation:

$$\mathrm{DO}_{\mathrm{t}+1}=\mathrm{DO}_{\mathrm{t}}+\mathrm{GPP}-\mathrm{R}+\mathrm{F}+\varepsilon$$
(1)

In Eq. 1, F is the flux of oxygen between the lake and atmosphere, and ε is the process error associated with vertical or horizontal mixing. We used the ‘metab’ function with the Bayesian model to estimate daily GPP and R as well as the uncertainty in each estimate (expressed as a standard deviation; reported in Supplementary Table 1). In the Bayesian model, PAR (μmol m−2 s−1) and water temperature are covariates used to model rates of GPP and R, respectively. In addition to hourly DO, water temperature, SW radiation, and wind speed, the following model inputs were used: depth of the surface mixed layer at each time step (Zmix; m), the attenuation coefficient for PAR (kd; m−1), and lake surface area (m²).

For pelagic sites in lakes that stratified seasonally or periodically (Emerald Lake, Topaz Lake, Castle Lake, Clear Lake) we estimated metabolic rates in the surface mixed layer. We calculated mixed layer depth (Zmix) using depth-distributed water temperature measurements from fixed depth sensors or vertical profiles using LakeAnalyzer in R47. For littoral sites within stratified lakes (Castle Lake, Dulzura Lake, Lake Tahoe), and in small, shallow water bodies that did not stratify (TOK 11 Pond, EML Pond 1, Topaz Pond), Zmix was set to lake depth at the location of the DO sensor. In the tidally-influenced Delta, Zmix was set to the mean depth of the channel within the range of the tidal excursion.

To estimate oxygen fluxes across the air-water interface (F), we used a wind-based gas exchange model that accounted for lake surface area48. We set gas exchange to zero during periods when the DO sensor was below the diel or seasonal thermocline. We estimated average PAR within the surface mixed layer by converting shortwave radiation measurements from weather stations to surface PAR and then using the attenuation coefficient for PAR (kd; m−1; Table 1; Supplementary Fig. 4) and Zmix to estimate mean water column PAR as in Staehr et al.49. Days with unrealistic metabolism estimates (negative GPP, positive R) were excluded from results. Additional details on datasets and metabolism models can be found in the Supplementary Methods.
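The mean water column PAR calculation can be illustrated with a short sketch. It assumes exponential (Beer-Lambert) attenuation of PAR with depth, averaged from the surface to Zmix, and an SW-to-PAR conversion coefficient of 2.114 μmol photons per J; both the coefficient and the function name are assumptions for illustration and may differ from the values used in the study.

```python
import math

SW_TO_PAR = 2.114  # assumed conversion, umol photons per J of shortwave

def mean_mixed_layer_par(sw, kd, zmix, sw_to_par=SW_TO_PAR):
    """Mean PAR (umol m-2 s-1) between the surface and zmix.

    sw   : shortwave radiation at the surface (W m-2)
    kd   : PAR attenuation coefficient (m-1)
    zmix : mixed layer depth (m)

    Averages PAR(z) = PAR0 * exp(-kd * z) over 0..zmix, which gives
    PAR0 * (1 - exp(-kd * zmix)) / (kd * zmix).
    """
    par0 = sw * sw_to_par
    x = kd * zmix
    return par0 * -math.expm1(-x) / x  # expm1 keeps precision for small x

# Hypothetical example: 100 W m-2, kd = 0.5 m-1, zmix = 4 m
par_mean = mean_mixed_layer_par(100.0, 0.5, 4.0)
```

Under this formulation, a deeper mixed layer or a larger kd lowers the mean PAR available to phytoplankton for the same surface radiation.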

Quantifying effects of smoke cover on ecosystem metabolic rates

We quantified ecosystem metabolic responses to smoke cover (i.e., compared GPP, R, and NEP between smoke and non-smoke days) by fitting generalized additive mixed models (GAMMs) to the daily metabolism estimates using the ‘mgcv’ R package50. We combined the datasets to fit a single GAMM each for GPP, R, and NEP. To facilitate comparisons across sites spanning from hyper-eutrophic (Clear Lake) to ultra-oligotrophic (Lake Tahoe), we standardized each metabolism time series by its mean and variance (z-score) before combining datasets. We modeled daily metabolic estimates as a function of smoke cover (categorical: smoke or non-smoke) and day of year (DOY; smooth term). We included an interaction between DOY and smoke (i.e., estimated separate seasonal smooth terms for non-smoke and smoke days) to visualize the effect of smoke cover on seasonal patterns in metabolism. We included a random effect for site in each model to account for the non-independence of repeated measurements in each lake. As an example using R pseudocode, the model formulation for GPP is: GPP ~ re(site) + smoke + s(DOY, by = smoke), where re() is a random effect (in mgcv syntax, a random effect of this kind is written s(site, bs = "re")), smoke is a parametric term, and s() indicates a smooth term with an interaction. We used default thin plate regression splines for the smooth terms. GAMMs were fitted using restricted maximum likelihood estimation.
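The standardization step that precedes model fitting can be sketched as follows. This illustrates only the per-site z-scoring, not the GAMM itself, and assumes the sample standard deviation is used; the site values are hypothetical.

```python
from statistics import mean, stdev

def zscore(values):
    """Standardize a series to mean 0 and (sample) standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def standardize_by_site(series_by_site):
    """Apply the z-score separately to each site's time series."""
    return {site: zscore(vals) for site, vals in series_by_site.items()}

# Hypothetical daily GPP for a productive and an unproductive site
gpp = {
    "eutrophic": [10.0, 20.0, 30.0],
    "oligotrophic": [0.1, 0.2, 0.3],
}
gpp_z = standardize_by_site(gpp)
```

After standardization both hypothetical sites share the same scale, so a pooled model responds to relative departures from each site's own mean rather than to differences in absolute productivity.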

To quantify how lake and smoke attributes mediated metabolic responses to smoke cover, we calculated the median difference in standardized metabolic rates between smoke and non-smoke days for each dataset (‘metabolic response’; ΔGPP, ΔR), and then fitted linear regressions between metabolic responses and lake or smoke variables. Lake variables included log-TDN, log-TDP, log-chla, and mean summer water temperature. Smoke variables included the total number of smoke days (duration), the mean SW reduction on smoke days (metric of smoke density; W m−2), and the cumulative SW reduction on smoke days (10^6 J m−2), a metric of smoke intensity that incorporates both duration and density.
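A minimal sketch of the response metric and the regression step is given below. It interprets the ‘median difference’ as the median on smoke days minus the median on non-smoke days, which is an assumption about the exact calculation; the ordinary least squares helper, function names, and example values are hypothetical.

```python
from statistics import mean, median

def metabolic_response(z_rates, smoke_flags):
    """Median standardized rate on smoke days minus that on non-smoke days."""
    smoke = [z for z, f in zip(z_rates, smoke_flags) if f]
    clear = [z for z, f in zip(z_rates, smoke_flags) if not f]
    return median(smoke) - median(clear)

def ols(x, y):
    """Slope and intercept of a simple least squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical standardized GPP and smoke flags for one dataset
z_gpp = [0.5, -1.0, 0.2, -0.6, 0.1]
smoke = [False, True, False, True, False]
delta_gpp = metabolic_response(z_gpp, smoke)

# Hypothetical responses regressed on smoke days across three datasets
slope, intercept = ols([10.0, 20.0, 30.0], [-0.2, -0.4, -0.6])
```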

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.