Abstract
Purpose
Fine sediment deposition is an important component of the catchment sediment budget and affects river morphology, biology, and contaminant transfer. However, the driving factors of fine sediment deposition remain poorly understood at the catchment scale, limiting our ability to model this process.
Methods
Fine sediment deposition and river reach characteristics were collected over the entire river network of three medium-sized (200–2200 km²) temperate catchments, corresponding to 11,302 river reaches. This unique database was analyzed and used to develop and evaluate a random forest model. The model was used to predict sediment deposition and analyze its driving factors.
Results
Fine sediment deposition displayed a high spatial variability and a weak but significant relationship with the Strahler order and river reach width (Pearson coefficient r = −0.4 and 0.4, respectively), indicating the likely nonlinear influence of river reach characteristics. The random forest model predicted fine sediment deposition intensity with an accuracy of 81%, depending on the availability of training data. Bed substrate granularity, flow condition, reach depth and width, and the proportion of cropland and forest were the six most influential variables on fine sediment deposition intensity, suggesting the importance of both hillslope and within-river channel processes in controlling fine sediment deposition.
Conclusion
This study presented and analyzed a unique dataset. It also demonstrated the potential of random forest approaches to predict fine sediment deposition at the catchment scale. The proposed approach is complementary to measurements and process-based models. It may help improve the understanding of sediment connectivity in catchments, design future measurement campaigns, and prioritize areas for implementing mitigation strategies.
1 Introduction
Hillslope erosion can include a significant proportion of fine sediment particles (< 2 mm diameter; Vericat and Batalla 2006). In rivers, fine particles can result in multiple environmental issues, such as increased river turbidity and streambed clogging (Brunke 1999), which are detrimental to the aquatic environment (Owens et al. 2005) and preclude the achievement of the Water Framework Directive objectives (Bilotta and Brazier 2008). Fine sediment deposition may reduce riverbed infiltration capacity, which in turn affects the hyporheic zone dynamics by modifying the water flow (Boano et al. 2014). Moreover, particle transfer can contribute to the transport of contaminants, such as heavy metals and polycyclic aromatic hydrocarbons, as they may have a high affinity for fine particles (Conaway et al. 2013). Understanding fine sediment deposition as part of river sediment dynamics (Waters et al. 1995) is therefore of great importance, as it may also affect sediment-associated chemical (Droppo et al. 2015), biogeochemical (Nogaro et al. 2007) and microbial (Nogaro et al. 2010) processes and dynamics, as well as aquatic life (Wood and Armitage 1997; Scheder et al. 2015), including fish (Kemp et al. 2011) and invertebrates (Kefford et al. 2010; Wagenhoff et al. 2011; Magbanua et al. 2016).
Fine sediment storage and erosion in river channels and floodplains have been found to be a significant part of the catchment sediment budget in various environments, including lowland (Owens et al. 1999; Walling et al. 1999; Collins and Walling 2007; Marttila and Kløve 2014) and mountainous catchments (Navratil et al. 2010; Misset et al. 2021). It underlines the need to study this process further to improve our understanding of sediment (dis)connectivity in catchments (Fryirs 2013). The deposition and accumulation of fine particles within river reaches may turn riverbeds into a significant supply of fine sediment to downstream environments (Fryirs and Brierley 2001), which may be mobilized during flood events (Droppo 2004), making sediment deposition an important process for understanding fine sediment dynamics at the catchment scale. However, several issues limit our ability to understand and model fine particle deposition. Multiple techniques have been proposed in the literature to estimate sediment deposition and subsequent riverbed clogging, such as visual estimations (Owens et al. 1999; Zweig and Rabeni 2001), embeddedness (Platts et al. 1983), resuspension method (Lambert and Walling 1988), sediment shear strength measures (Grabowski et al. 2010; Legout et al. 2017), sediment coring (Milan et al. 2000), wooden stakes (Marmonier et al. 2004), dynamic penetrometry (Landemaine et al. 2015), sediment trap measurements (Seydell et al. 2009), shuffle index (Clapcott et al. 2011), or infiltration capacity measurements (Datry et al. 2015). Each of these techniques has specific advantages (Descloux et al. 2010), but their respective results cannot be compared, as they target different proxies of sediment deposition (e.g., resistance to shear stress, oxygenated depth, and infiltration capacity). 
Measurements are also labor- and time-consuming, limiting large-scale evaluations, particularly considering that fine sediment deposition can exhibit high spatial variability at the catchment scale (Haddad et al. 2022). Such requirements and variations in measurement protocols may explain the lack of standardized, in-stream, monitoring programs (Wharton et al. 2017), limiting our ability to describe this complex process.
Once particles enter river channels, suspended particle concentrations and flow shear stress affect particle characteristics (Grangeon et al. 2014; Wendling et al. 2016), particularly particle size (Dyer 1989; Maggi 2005), turning soil aggregates into complex organo-mineral composites usually referred to as flocs (Droppo et al. 1997; Droppo 2001; Spencer et al. 2021) and changing their depositional dynamics (Droppo and Ongley 1994; Droppo et al. 2005). After particle deposition on the bed surface, Brunke (1999) described clogging occurrence through particle infiltration and retention in the bed, resulting in pore bridging and particle retention. This reduces bed permeability (Schälchi 1992) and porosity (Gayraud and Philippe 2003) and creates a clogging depth, ultimately leading to sediment-column clogging. Experimental studies have demonstrated that multiple factors may affect fine sediment deposition and riverbed clogging, including the suspended-to-matrix particle size and shape (Hutson 2014), suspended particle concentration (Pholkern et al. 2015), particle characteristics (Rehg et al. 2005), hydraulic gradient between river and groundwater (Schälchi 1992), and flow conditions (Fetzer et al. 2017). This may also depend upon the successive occurrence of flood events (Blaschke et al. 2003) and the depositional history (Lau et al. 2001). However, the upscaling of this research to the catchment scale remains limited, which hampers the conceptualization and representation of fine sediment deposition and potential clogging in catchment-scale models (e.g., Arnold et al. 1998; Bieger et al. 2017) that may be used to improve our understanding of the sediment cascade and design relevant mitigation strategies to limit the deleterious effects of fine sediment deposition.
Such models indeed face the issues of model parameterization, including the difficulty of adequately representing the nonlinear interactions of river and hillslope processes and the subsequent interactions between physical and biological factors in river channels (Shrivastava et al. 2020).
Field-based studies performed over multiple river reaches can help determine the controlling factors of particle deposition and parameterize models. Indeed, they implicitly consider multiple processes affecting sediment deposition, evaluated over numerous river reaches and conditions. Datry et al. (2015) analyzed clogging through hydraulic conductivity in more than 100 river reaches in France over 2 years, both in winter and summer, to prioritize the factors significantly affecting riverbed clogging. Interestingly, one of their main results was that clogging varied more over the different monitored reaches than over seasons. Naden et al. (2016) analyzed fine sediment deposition in 230 catchments in England and Wales and found that stream power and flow velocity exhibited a significant relationship with deposited sediment density in streams saturated with fine particles, suggesting that deposited sediment dynamics were controlled by in-stream dynamics. In their study, the sediment pressure, mostly from agriculture, was also significant, as also indicated by Sutherland et al. (2010) and Konrad and Gellis (2018). In another large-scale study, Stewardson et al. (2016) developed a model of clogging based on hydraulic conductivity data acquired from 153 river reaches. They showed that the river reach geometry, catchment characteristics, and stream power significantly controlled riverbed clogging.
The different controlling factors proposed in these studies also highlight the potentially highly variable dynamics of fine sediment deposition and clogging, resulting from both in-stream and hillslope processes that may vary both between and within catchments. The significant relationships as well as model predictions presented by Naden et al. (2016) included scattering between modeled and measured values over more than one order of magnitude. Similarly, the model by Stewardson et al. (2016) explained 30% of the variance in external cross-validation, suggesting that these relationships could provide interesting variation trends, although they cannot be used as reliable predictors, limiting extrapolation possibilities.
Machine-learning models provide alternative approaches that can extract knowledge and draw inferences from data. As data-driven methods, they are implicitly able to consider multiple and potentially nonlinear relationships between variables. They have been successfully used to predict water quality, such as dissolved species content (Khullar and Singh 2021) but also for sediment spatial distribution and temporal variation modeling (Taşar et al. 2017; Harmshaw et al. 2018; Hou et al. 2019; Ren et al. 2021). Recent studies such as those conducted by Baldan et al. (2020) and Baldan et al. (2021) demonstrated the potential of modeling cascades to analyze sediment dynamics at the catchment scale. These studies made use of physically based catchment and river-reach models to feed a machine-learning model, with promising results. However, such an approach requires extensive datasets as model inputs to make accurate predictions, which may limit their applicability to numerous river reaches and extrapolation possibilities. To the best of our knowledge, the combined use of field-based monitoring and machine-learning approaches for modeling fine sediment deposition and the associated model performance and extrapolation possibilities have not been addressed in previous research.
In this study, we analyzed the spatial variability in fine sediment deposition at the catchment scale and attempted to gain insights into the driving factors, considering both hillslope and in-channel processes. To this end, an extensive and unique dataset, including the entire river network of three temperate catchments, was collected and analyzed. It was then used to develop a random forest model. To evaluate the model potential for application in unmonitored catchments, the model extrapolation performance was tested. This original modeling approach successfully predicted fine sediment deposition over the three monitored catchments and may therefore contribute to the understanding and management of sediment (dis)connectivity in catchments (Poeppl et al. 2020).
2 Materials and methods
2.1 Catchment location and characteristics
In this study, we considered three different catchments located in different French regions (Fig. 1). We hypothesized that variations in land use would induce differences in fine sediment supply to river channels. Consequently, two nearby catchments with contrasting land uses were selected for this study: the Beuvron and Cisse catchments are located in central France and are mostly covered with forested (72% of the Beuvron catchment) and agricultural areas (61% of the Cisse catchment), respectively. The third catchment presented a mix of land uses and was located in a different geological context: the Loisance catchment is located in Brittany, in western France. The lithology mainly consists of Proterozoic magmatic rocks (83%) in the Loisance catchment, while the Beuvron and Cisse catchments were mainly covered by Cenozoic sedimentary layers: clay and sands (92%) and limestone (80%), respectively. Land use was determined using Corine Land Cover. National crop statistics indicated that the agricultural lands in the Beuvron and Cisse catchments mainly corresponded to cereals (51% and 82%, respectively), such as wheat, maize, and barley. Of note, 43% of the agricultural lands in the Loisance catchment consisted of grasslands, while grasslands covered 20% and 3% of the Beuvron and Cisse agricultural lands, respectively. Catchment morphological characteristics were calculated using a 25-m resolution digital elevation model (DEM). The main catchment characteristics are summarized in Table 1.
2.2 Fine sediment deposition measurements
In this study, we focused on fine sediments, defined as particles with a diameter smaller than 2 mm (Walling and Moorehead 1989; Walling et al. 2000). Evaluating fine sediment deposition over the entire river network of the three different catchments required a compromise between data accuracy and the duration of the monitoring campaigns. We therefore did not follow a systematic approach but rather considered a river reach as a homogeneous entity in the field, delineated using expert knowledge and observations, following the method proposed by Dupeux and Favreau (2017), adapted from Archambaud et al. (2005). Consequently, in this study, the measured reach lengths varied; the reach length quantiles of 10%, 50% (median) and 90% were 39 m, 121 m, and 332 m, respectively. It is therefore acknowledged that some reaches may exhibit local, although limited, variations in their characteristics.
Fine sediment deposition was visually estimated based on the sediment areal coverage and water turbidity following manual stirring (Dupeux and Favreau 2017), which is similar to the shuffle index proposed by Clapcott et al. (2011). It differs from classical methods such as the one from Lambert and Walling (1988) or Navratil et al. (2010) in that it allows for a quick estimate of sediment deposition intensity, although only providing semi-quantitative results as suspended sediment concentration or turbidity is not measured. The sediment deposition intensity was divided into four classes: 0%, 25–50%, 50–75%, and 75–100% (Fig. 2). Although visual assessment methods were questioned in regard to their use in the quantitative assessment of sediment deposition intensity (Sennatt et al. 2006), recent studies have demonstrated that they may provide an appropriate method for deposit estimates (Conroy et al. 2016), as they were shown to correlate well with quantitative estimates of sediment stocks (McKenzie et al. 2022) while providing quick estimates. Finally, surface stirring ensured that the extreme classes were reliably estimated and did not correspond to a thin layer of surficial clogging. However, it is acknowledged that the 25–50% and 50–75% classes may be difficult to discriminate in the field. Despite its drawbacks, this method might be among the few that may be applied to studies that include the entire river network with multiple catchments.
2.3 Dataset collection
In addition to fine sediment deposition intensity, multiple measurements were performed for the river reaches, including river geometry and flow conditions during the field campaign. We based variable selection on previous literature results and a national protocol (Gob et al. 2014). It is expected that the corresponding dataset will grow over time, which would provide interesting opportunities for model extrapolation. The monitoring protocol related to in-channel variables is similar to the protocol proposed in Raven et al. (2003) and Clapcott et al. (2011), and is summarized in McKenzie et al. (2022).
It is hypothesized that stream power, depending upon channel geometry and channel slope, may be a variable controlling sediment deposition and erosion variability (Naden et al. 2016). Consequently, the main morphological characteristics of river reaches controlling the local hydraulics (Van Rijn 1993) were measured in the field through transects (reach width, depth, length) or calculated (reach slope, sinuosity). As Fetzer et al. (2017) suggested that flow condition is an important controlling variable, it was also visually estimated using previously established classes (Clapcott et al. 2011). It was assumed that habitat diversity might be a synthetic indicator of degraded river reaches, which may therefore be used as a proxy of river dynamics. Habitat diversity was estimated as a relative measure on each river reach. During the field survey, the relative proportion of wood debris (length > 30 mm), cobbles (width > 128 mm), plant roots, aquatic vegetation, and organic litter density was estimated for each river reach. Habitat diversity was then classified into 4 different classes, ranging from “none,” in the absence of habitat on the river reach, to “high” when both the density and variety of habitats were high (Le Bihan 2020). Finally, the visual assessment of the underlying bed substrate was found to correlate well with sediment deposited mass (Naden et al. 2016). It was therefore evaluated by walking upstream along the river reaches and estimating the dominant substrate type, in the same way as described in McKenzie et al. (2022).
In addition to within-channel dynamics, hillslope dynamics should be considered as potential sediment sources (Sutherland et al. 2010; Wagenhoff et al. 2011; Davis et al. 2021). In particular, agricultural areas may supply important fine sediment quantities to river reaches and should therefore be considered (Konrad and Gellis 2018). Therefore, the lower end of each river reach was considered the outlet of a subcatchment. For each of these subcatchments, the corresponding upstream surface area and the land-use proportions in the drainage area were calculated.
Finally, physical barriers such as weirs were observed on some of the river reaches (4.7% of the total river reaches). Such obstacles were assumed to affect the flow and the within-reach sediment dynamics. They were therefore included in the dataset.
Fifteen variables were therefore compiled from 11,302 river reaches. The data acquired, analyzed, and used for modeling are summarized in Table 2. A correlation matrix was calculated using Pearson’s r and Spearman’s ρ coefficients to analyze the variables that correlated with deposition intensity and to draw hypotheses on the factors affecting fine sediment deposition. The correlation matrix is provided as supplementary material (Table S1).
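As an illustration of the two coefficients used in the correlation matrix, the following minimal sketch computes Pearson's r and Spearman's ρ in plain Python. It is not the code used in the study: the function names, the ordinal coding of the deposition classes, and the five-reach example values are assumptions made for illustration only.

```python
import math

def pearson(x, y):
    """Pearson's r between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho, computed as Pearson's r on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical values for five reaches (not data from the study).
width_m = [1.0, 2.0, 3.0, 4.0, 10.0]
deposition_class = [0, 1, 1, 2, 2]  # assumed ordinal coding of intensity

r = pearson(width_m, deposition_class)
rho = spearman(width_m, deposition_class)
```

Because Spearman's ρ works on ranks, it is less sensitive than Pearson's r to outlying reaches such as the 10-m-wide reach in this example, which is one reason for reporting both coefficients.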
Important temporal variations in fine sediment deposition may occur in river reaches depending on the hydrological regime, particularly the occurrence of flood events (Genereux et al. 2008). The monitoring was performed mainly in spring and autumn, when limited water height variations were expected while maintaining a sufficient water level for analysis (McKenzie et al. 2022). However, acquiring this extensive dataset required intensive fieldwork: one day was required to measure approximately 10 km. A single measurement campaign was therefore available for analysis in this study (i.e., no temporal variations could be studied). As temporal variations may not always be the most important factor driving fine sediment deposition variability (Datry et al. 2015), it was assumed that the dataset acquired in this study may provide reliable information to analyze fine sediment deposition.
2.4 Prediction of sediment deposition intensity classes
In this study, fine sediment deposition was modeled using a machine-learning-based approach. Machine-learning models are able to autonomously build the relationships between the input and output variables, which is an important advantage when modeling complex processes such as sediment deposition. Among these algorithms, random forest was used because of its ability to address high-dimensional datasets, including a mixture of numeric and categorical variables. It is able to address nonlinear relationships, as well as correlated features, while maintaining good predictive performance (Skurichina and Duin 2002; Darst et al. 2018). Moreover, random forest has fewer hyperparameters (i.e., parameters controlling the learning process) than alternative machine-learning techniques. Data processing and random forest training were performed using the “caret” (Kuhn 2020) and “ranger” (Wright and Ziegler 2017) packages of R software v4.0.0 (R Core Team 2020).
Comparison exercises such as Hastie et al. (2009) and Probst et al. (2019) have shown the importance of tuning the number of variables randomly chosen in the learning procedure among all considered predictors in each split (denoted mtry) and the node size. The small number of parameters required by the algorithm reduces potential overlearning issues, computational costs, and optimization requirements. This is also an important benefit relative to physically based models, which usually face the issue of important parameter requirements, with associated equifinality and uncertainty issues (Beven and Freer 2001). Random forests also have the advantage of being able to estimate the relative importance of the variables.
However, two main challenges are worth pointing out. First, our dataset is relatively imbalanced (i.e., some classes are more frequent than others), which can pose some difficulties for classification models. This means that other machine- and/or deep-learning techniques (e.g., neural networks) may have provided better prediction performance or may have needed a smaller dataset to achieve similar prediction performance. However, these alternatives would not necessarily have provided the variable importance measures that were analyzed here to improve the interpretability of the machine-learning-based results, which is a current research question (Molnar et al. 2020). Second, although of great practical importance, the question of variable importance is still a very active area of research (e.g., Iooss et al. 2022) because, depending on the context of the study (e.g., size of the training dataset, number of predictors, and dependence among them), there is currently no consensus on which approach to use. To address this problem, we applied multiple methods (Sect. 2.4.3) and retained in the analysis only the results that were consistent across methods.
From a more technical point of view, as a tree-based ensemble algorithm, random forest combines a large number of individual, unpruned decision trees, each acting as a weak classifier (Breiman 2001). Each decision tree was grown using a subset of training data built by random sampling among samples and variables. A random selection of samples by bootstrap aggregating favors the stability of classifiers with good performances. The selection of random subsets of variables ensured correlation reduction and increased the robustness of the results. Then, branching point selections were built considering the best split among those variables at each node (Rokach 2010). This process was repeated until each branch end contained fewer than a prespecified number of observations. After tree partitioning was completed, the classification of a new sample was performed considering all decision trees using a majority vote. In this setting, overfitting was avoided by growing many trees during the learning process.
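The mechanics described above (bootstrap sampling, random variable subsets, best split per node, and majority vote) can be sketched in a few lines of plain Python. This is a deliberately simplified illustration and not the "ranger" implementation used in the study: each tree is reduced to a single split (a decision stump) rather than an unpruned tree, and all function names, parameter defaults, and toy data are assumptions.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    """Most frequent label (first encountered wins in case of a tie)."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels, features):
    """Best single split (feature, threshold) among the candidate features."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[f] <= t]
            right = [lab for row, lab in zip(rows, labels) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t, majority(left), majority(right))
    if best is None:
        # No valid split in this bootstrap sample: always predict its majority class.
        m = majority(labels)
        return (0, float("inf"), m, m)
    return best[1:]

def fit_forest(rows, labels, n_trees=25, mtry=1, seed=0):
    """Grow n_trees stumps, each on a bootstrap sample and mtry random features."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # sampling with replacement
        feats = rng.sample(range(n_features), mtry)  # random subset of variables
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Classify one sample by majority vote over all stumps."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return majority(votes)

# Toy, hypothetical reaches described by (width in m, depth in m).
rows = [(0.5, 1.0), (0.6, 1.2), (0.7, 0.9), (3.0, 5.0), (3.2, 5.5), (2.9, 4.8)]
labels = ["none", "none", "none", "high", "high", "high"]
forest = fit_forest(rows, labels)
```

Since a stump contains a single node, drawing the variable subset once per tree is equivalent here to drawing it at each split; a full implementation would redraw the subset at every node of a deeper tree.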
2.4.1 Data preprocessing
Deposition intensity was initially divided into four distinct classes: no sediment deposition (0%) and weak (25–50%), moderate (50–75%), and high (75–100%) sediment deposition intensity. A preliminary modeling attempt revealed that the model had difficulties discriminating between the 25–50% and the 50–75% classes. This result was consistent with the inherent difficulty in estimating the medium deposition intensity classes in the field (Sect. 2.2). Consequently, these classes were merged into a single class of intermediary deposition intensity to improve prediction performance. Three classes were therefore considered for prediction purposes: 0%, 25–75%, and 75–100%, referred to as “no sediments,” “medium deposition intensity,” and “maximal deposition intensity,” respectively. When variables had missing values (3% to 14%, only numerical values), they were replaced by the weighted average of nonmissing observations using the imputation method implemented within the “randomForest” R package (Liaw and Wiener 2002).
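The class merging described above amounts to a simple lookup. The sketch below is illustrative only: the label strings (written with ASCII hyphens) and the function name are assumptions, and the imputation step performed with the "randomForest" package is not reproduced.

```python
from collections import Counter

# Assumed label strings for the four field classes and the three model classes.
MERGED_CLASS = {
    "0%": "no sediments",
    "25-50%": "medium deposition intensity",
    "50-75%": "medium deposition intensity",
    "75-100%": "maximal deposition intensity",
}

def merge_classes(field_classes):
    """Map the four field classes onto the three classes used for prediction."""
    return [MERGED_CLASS[c] for c in field_classes]

# Hypothetical field observations for five reaches.
field = ["0%", "25-50%", "50-75%", "50-75%", "75-100%"]
merged = merge_classes(field)
counts = Counter(merged)
```

Merging the two intermediate classes necessarily increases the share of the medium class, which is worth keeping in mind when interpreting class imbalance.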
2.4.2 Model tuning and evaluation
The dataset was randomly divided into a training set (70% of the initial dataset) and a test set (30% of the initial dataset), and each set had the same proportion of the three deposition intensity classes. Hyperparameter estimation was based on a stratified, nested k-fold cross-validation procedure (with k = 5) to limit overfitting (Krstajic et al. 2014). Following Breiman (2001), three values were tested for mtry (2, 4, and 8) and three values for node size (1, 3, and 10). The final random forest model was trained by setting mtry to 8 and node size to 1 and using default parametrization for all other parameters.
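A minimal sketch of the data partitioning may help clarify the procedure. It shows a stratified 70/30 split, a stratified assignment of the training reaches to five folds, and the 3 × 3 grid of candidate hyperparameters. It does not reproduce the "caret" workflow or the nested structure of the cross-validation; the function names, the seed, and the synthetic class counts are assumptions.

```python
import itertools
import random
from collections import defaultdict

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices so that each class keeps the same train/test share."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, lab in enumerate(labels):
        by_class[lab].append(i)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train += indices[:cut]
        test += indices[cut:]
    return sorted(train), sorted(test)

def stratified_folds(labels, indices, k=5, seed=0):
    """Distribute the given indices over k folds, class by class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    folds = [[] for _ in range(k)]
    for members in by_class.values():
        rng.shuffle(members)
        for pos, i in enumerate(members):
            folds[pos % k].append(i)
    return folds

# Synthetic labels with an imbalance similar in spirit to the study (hypothetical counts).
labels = ["none"] * 30 + ["medium"] * 120 + ["max"] * 40
train, test = stratified_split(labels)
folds = stratified_folds(labels, train)

# Candidate (mtry, node size) pairs reported in the text.
grid = list(itertools.product([2, 4, 8], [1, 3, 10]))
```

Each of the nine grid entries would then be scored by training on four folds and validating on the fifth, in turn, before the selected setting is evaluated once on the held-out test set.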
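Model performance was assessed considering several indicators computed on the test set: accuracy, precision, and recall. They were defined as follows:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$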
where TP and TN denote true positives and true negatives, respectively, and FP and FN denote false positives and false negatives, respectively. These values were presented in a confusion matrix (Table S2) and illustrated the performances of the model for each class of deposition intensity. The values of these three indicators vary between 0 and 1. The closer the value is to 1, the better the model performance. Accuracy is a global measurement of the algorithm’s ability to predict deposition intensity. In this study, a slight imbalance was observed among the three classes of deposition intensity, with a ratio of 3:12:4 for no sediments, medium deposition intensity, and maximal deposition intensity. This unequal class distribution may bias the interpretation of accuracy. Consequently, the model performance was also evaluated using precision and recall. Precision corresponded to the proportion of reaches for which the predicted deposition intensity was that observed in the field among all reaches associated with this deposition intensity by the model. Recall indicated the proportion of reaches for which the predicted deposition intensity was truly that observed in the field among all reaches that truly had this deposition intensity. Precision can thus be understood as a measurement of quality, while recall is a measure of completeness. These two metrics were calculated both for the test set as a whole and separately for each catchment.
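To make the indicators concrete, the sketch below derives them from a three-class confusion matrix. The matrix values are hypothetical and are not those of Table S2; the function name and the matrix orientation (rows observed, columns predicted) are assumptions, and global accuracy is taken here as the share of correctly classified reaches.

```python
def class_metrics(confusion, classes):
    """confusion[i][j] = reaches observed in class i and predicted as class j."""
    n_classes = len(classes)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))
    metrics = {"accuracy": correct / total}
    for k, name in enumerate(classes):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n_classes)) - tp  # predicted k, observed other
        fn = sum(confusion[k]) - tp                               # observed k, predicted other
        metrics[name] = {
            "precision": tp / (tp + fp) if tp + fp else float("nan"),
            "recall": tp / (tp + fn) if tp + fn else float("nan"),
        }
    return metrics

# Hypothetical confusion matrix for 40 test reaches.
classes = ["no sediments", "medium", "maximal"]
confusion = [
    [8, 2, 0],   # observed "no sediments"
    [1, 17, 2],  # observed "medium"
    [1, 1, 8],   # observed "maximal"
]
metrics = class_metrics(confusion, classes)
```

Reading precision down a column and recall along a row makes it easy to see how a class can combine high precision with poor recall: few reaches are wrongly assigned to it, but many of its true members are assigned elsewhere.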
2.4.3 Variable importance
Variable importance in the random forest model is computed to determine the relative influence of each predictor in the final prediction. Several studies have indicated that variable importance measures may exhibit different flaws in the case of correlated features in a dataset (Hooker 2007; Hooker et al. 2021). We thus used several measures of variable importance based on different computing procedures. First, we considered the mean decrease in impurity (MDI), which calculates each feature importance as the total decrease in node impurity, measured by the Gini index, averaged over all trees of the forest. As this measure is known to be biased toward covariates with many possible partitions, such as categorical variables (Strobl et al. 2007; Wright et al. 2017), we considered its unbiased counterpart, actual impurity reduction (AIR) (Nembrini et al. 2018). We then applied the testing procedure of Altmann et al. (2010) to produce a p-value for each variable importance measure. Both were implemented in the R package “ranger.” An alternative to Gini importance is the use of permutation importance approaches based on dataset row or column permutation. They do not suffer from the bias of Gini importance (Szymczak et al. 2016) but may overestimate the importance of correlated variables (Hooker and Mentch 2021). We thus considered the methodology of Kursa and Rudnicki (2010) based on the addition of pseudo variables (called shadow attributes) in the model, implemented in the R package “Boruta”. This procedure also includes statistical testing procedures for variable importance measures, which identify the variables whose importance is statistically significant.
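The quantity underlying MDI is the decrease in Gini impurity produced by a single split. A minimal sketch of that calculation is given below, assuming the usual size-weighted form of the child impurities; the function names and the four-reach example are illustrative and do not reproduce the "ranger" implementation or the AIR correction.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted_children = (len(left) * gini(left) + len(right) * gini(right)) / n
    return gini(parent) - weighted_children

# Hypothetical node with four reaches, split perfectly by some variable.
parent = ["none", "none", "high", "high"]
perfect = impurity_decrease(parent, ["none", "none"], ["high", "high"])

# The same node split by an uninformative variable.
useless = impurity_decrease(parent, ["none", "high"], ["none", "high"])
```

Summing such decreases over every split that uses a given variable, and averaging over the trees, yields that variable's MDI; a variable offering many candidate thresholds has more chances to achieve a large decrease by luck, which is the bias that motivates the AIR correction.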
Given that no standard method has been defined in the literature to compute variable importance (Iooss et al. 2022) and given the uncertainties associated with the AIR and Boruta approaches, only the results that were consistent between both methodologies are discussed.
3 Results
3.1 Field measurements and statistical analysis
An overview of the measured deposition intensity is provided in Table 3. Due to differences in catchment area, the proportion of measured river reaches was higher in the Beuvron catchment (59.5%), followed by the Loisance (24.8%) and Cisse (15.7%) catchments. Most monitored river reaches had weak deposition intensity (37%), followed by moderate (27%) and high deposition intensity (21%) and no sediment deposition (15%). The measured fine sediment deposition exhibited contrasting behavior between and within catchments, associated with significant spatial variability (Fig. 3).
Most Cisse catchment (Fig. 3a) upstream reaches were connected to agricultural fields, while some were located close to forested areas in the central part of the catchment. In this catchment, moderate to high fine sediment deposition intensity was measured in 70% of the reach lengths. Conversely, the other 30% mostly corresponded to upstream river reaches located in the forested part of the catchment.
In the Loisance catchment (Fig. 3b), a smaller proportion of river sections with deposition intensity higher than or equal to moderate deposition intensity was measured. Indeed, 51% of the river reaches displayed no sediments or weak fine sediment deposition. Important local variations were measured with, e.g., alternations of high deposits and no sediments on successive river reaches.
In the Beuvron catchment (Fig. 3c), no clear deposition tendencies were observed in relation to land use. Indeed, weak and high deposition intensities were observed even in the most agricultural part of this catchment. However, in this catchment, the largest analyzed in this study (Table 1), fine sediment deposition exhibited a pattern of increased deposition from upstream to downstream areas, while many first-order reaches exhibited fine sediment deposition intensity higher than or equal to moderate intensity. Indeed, 56% of the total reach length exhibited moderate to high deposition intensity on low Strahler-order reaches (lower than 3), suggesting that first- and second-order streams were more susceptible to fine sediment deposition. In the higher Strahler-order reaches (higher than 3), a progressive increase from weak sediment deposition intensity in the upper reaches to moderate and high deposition intensity in the lower reaches was observed.
This observation was in line with the correlation matrix results, indicating that the variables that correlated best with deposition intensity were the channel width (r = 0.4, ρ = 0.3) and Strahler order (r = −0.4, ρ = −0.4) (Fig. 4). In our study, the correlation between the Strahler order and channel width was low (r = −0.2, ρ = 0.1), which can be explained by the inclusion of three different catchments in the dataset.
It was not surprising, given the observed variations in fine sediment deposition within the three studied catchments, that only weak relationships were found between deposition and the local variables. It is, however, interesting to note that some variables correlated with fine sediment deposition intensity, particularly when considering each catchment separately (Fig. 5). Indeed, there was, for instance, no relationship between fine sediment deposition intensity and slope (r and ρ were < 0.1) using the entire dataset (Fig. 5a) or the data for the Beuvron and Cisse catchments (Fig. 5b). However, the relationship between deposition intensity and slope was significant at the 5% level of probability for the Loisance catchment (r = −0.3, ρ = −0.3), suggesting that the local river reach characteristic variations changed local fine sediment deposition dynamics in this catchment.
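The distinction between the Pearson coefficient r (linear association) and the Spearman coefficient ρ (monotonic association on ranks), computed on the pooled dataset and then per catchment, can be illustrated with a minimal sketch. The reach values, the catchment labels "A" and "B", and the function names below are hypothetical and are not taken from the study dataset.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def correlations(rows):
    slope = [r[1] for r in rows]
    deposition = [r[2] for r in rows]
    return pearson(slope, deposition), spearman(slope, deposition)


# Hypothetical reaches: (catchment, slope in %, deposition class 0 to 3).
reaches = [
    ("A", 0.5, 3), ("A", 1.0, 3), ("A", 2.0, 2), ("A", 4.0, 1), ("A", 8.0, 0),
    ("B", 0.5, 1), ("B", 1.0, 3), ("B", 2.0, 0), ("B", 4.0, 2), ("B", 8.0, 1),
]

r_all, rho_all = correlations(reaches)
by_catchment = {
    name: correlations([r for r in reaches if r[0] == name])
    for name in sorted({r[0] for r in reaches})
}
```

In this toy example, the relationship is strong in catchment "A" and nearly absent in catchment "B", so pooling both weakens the overall coefficient, which is the kind of masking effect discussed above.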
3.2 Modeling fine sediment deposition
The global accuracy of the random forest model was 81% on the test set, demonstrating the strong capability of machine-learning models to predict deposition intensity. The model performed well at predicting the different deposition intensity levels, as the precision was higher than 80% for all deposition intensity classes (Table 4), with one exception: the 46% precision for prediction of the “no sediment” class of the Cisse catchment. However, this class only represented 120 river reaches (6.7% of the catchment river network). Consequently, the data-driven model presented in this study had difficulties predicting river reach conditions represented by a small number of observations. Moreover, the confusion matrix (Table S2) indicated that the model tended to predict the medium deposition intensity class (25–75%) more often than observed, which may be related to the higher representation of this class in the initial data (Table 3). The global recall (70%) was also reasonable, but with significant between-class differences. For example, the medium deposition intensity class had a high precision and recall (81% and 93%, respectively), showing good model predictive ability. However, the “no sediment” class had high precision but poor recall (80% and 48%, respectively), indicating that the model was usually correct when it predicted no sediments on a river reach, but it was not able to retrieve more than half of the reaches with no sediment deposition. This result suggested that the no sediment class was less well-described by the dataset. The between-catchment differences in model performance may be partly related to their unbalanced representation in the whole dataset, as 59.5% of the data were from Beuvron, 15.7% were from Cisse, and 24.8% were from Loisance (Table 3).
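The accuracy, precision, and recall discussed above all derive from a confusion matrix. The following minimal sketch shows the computation; the counts are hypothetical (they are not the values of Table S2) and were chosen only so that the "no sediment" class has high precision and low recall, as described in the text. The function name and class labels are also assumptions.

```python
def classification_metrics(confusion, labels):
    """Overall accuracy plus per-class precision and recall.

    confusion[i][j] is the number of reaches observed in class i
    and predicted in class j.
    """
    n = len(labels)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    metrics = {"accuracy": correct / total}
    for k, label in enumerate(labels):
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        observed_k = sum(confusion[k])                        # row sum
        true_positives = confusion[k][k]
        metrics[label] = {
            # Share of reaches predicted in class k that really are in class k.
            "precision": true_positives / predicted_k if predicted_k else float("nan"),
            # Share of reaches observed in class k that the model retrieved.
            "recall": true_positives / observed_k if observed_k else float("nan"),
        }
    return metrics


# Hypothetical counts; rows are observed classes, columns are predicted classes.
labels = ["no sediment", "weak", "medium", "high"]
confusion = [
    [48, 12, 40, 0],
    [6, 70, 24, 0],
    [4, 3, 186, 7],
    [2, 0, 18, 80],
]
metrics = classification_metrics(confusion, labels)
```

With these counts, 48 of the 60 reaches predicted as "no sediment" are correct (precision of 80%), but only 48 of the 100 reaches observed without sediment are retrieved (recall of 48%), because many of them are assigned to the over-predicted medium class.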
The global model performance was promising. It was therefore assessed whether the model was able to reproduce the spatial fine sediment deposition patterns. The predicted values matched the measured pattern of fine sediment deposition in the three studied catchments reasonably well (Fig. 6). Interestingly, the model was able to capture the progressive deposition increase occurring in downstream directions along reaches (e.g., from east to west in the Beuvron catchment), as well as the high deposition intensity measured in first-order reaches. These results demonstrated that the proposed model was also able to predict the observed fine sediment deposition spatial organization. It may therefore provide an interesting tool to assess the dynamics of this highly heterogeneous process, provided sufficient relevant data are available in the training set.
Indeed, extensive databases such as the one acquired in this study may not always be available to apply data-driven models such as the one proposed here. We therefore assessed whether the proposed approach could be applied in other contexts with fewer measurements. Several models were built considering various sizes of training data, from 1 to 70%, using 7% increments, from the training set (70% of initial data) (Fig. 7). The performances of these models were then computed using the test set (30% of initial data), allowing model comparison. The model accuracy increased from 59 to 81% with an increasing number of river reaches used for modeling. As the model is data-driven, this increase in model performance with increasing dataset size was not surprising. However, it is worth noting that using only 8% of the training set resulted in a 70% accuracy, which may be considered an acceptable result. The model had a better performance when only considering the Beuvron catchment than when considering the three catchments together, which was probably related to the higher number of river reaches analyzed in this catchment. Similarly, lower values of precision and recall for the Cisse catchment were probably related to the lower number of river reaches observed in this catchment. Accuracy was lower for the Loisance catchment (76%), which may be consistent with the observed highly spatially heterogeneous data measured in this catchment (Fig. 3b). Overall, these results indicated that most dataset variability was captured using a dataset including 8% to 15% of the training set, corresponding to approximately 630 to 1190 river reaches measured across the three catchments, with a good model performance.
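The training-size experiment described above can be sketched as follows: models are fitted on growing random subsets of the training set and all are scored on the same fixed test set. This is an illustration under stated assumptions, not the study's implementation: the data are synthetic, the fractions are arbitrary, and a 1-nearest-neighbour classifier (`fit_1nn`, a hypothetical helper) stands in for the random forest so that the example has no external dependency.

```python
import random


def fit_1nn(X, y):
    """Stand-in classifier (1-nearest neighbour); NOT a random forest."""
    def predict(row):
        best = min(range(len(X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], row)))
        return y[best]
    return predict


def accuracy(predict, X, y):
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def learning_curve(X_train, y_train, X_test, y_test, fractions, fit, seed=0):
    """Fit on growing subsets of the training set; score on a fixed test set."""
    shuffler = random.Random(seed)
    order = list(range(len(X_train)))
    shuffler.shuffle(order)  # nested subsets: each one extends the previous
    curve = []
    for fraction in fractions:
        n = max(1, round(fraction * len(order)))
        idx = order[:n]
        model = fit([X_train[i] for i in idx], [y_train[i] for i in idx])
        curve.append((fraction, n, accuracy(model, X_test, y_test)))
    return curve


# Synthetic reaches: the class depends on a noisy threshold on the first feature.
generator = random.Random(42)


def make_row():
    x = [generator.random(), generator.random()]
    label = int(x[0] + generator.gauss(0, 0.1) > 0.5)
    return x, label


data = [make_row() for _ in range(400)]
split = int(0.7 * len(data))  # 70% training set, 30% test set
X_train, y_train = zip(*data[:split])
X_test, y_test = zip(*data[split:])

fractions = [0.01, 0.08, 0.15, 0.50, 1.00]
curve = learning_curve(X_train, y_train, X_test, y_test, fractions, fit_1nn)
```

Each entry of `curve` holds the fraction, the number of training reaches, and the test accuracy. Because `fit` is passed as an argument, any classifier exposing the same interface, including a random forest, could replace the stand-in.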
4 Discussion
4.1 Driving factors of fine sediment deposition
Given that the model performance can be considered acceptable for various deposition classes, locations, and contrasted river reaches, it was used to determine the hierarchy of the factors controlling fine sediment deposition through the computation of their relative importance.
Bed substrate granularity, flow condition, reach depth and width, cropland cover proportion, and forest and grassland cover proportions were the six most influential variables in regard to deposition intensity regardless of the considered methodology (Figs. S1 and S2). Interestingly, the presence of barriers on the river reaches was not a significant explanatory variable. This result suggested either that the barriers did not affect a sufficient number of reaches or that the corresponding reaches were characterized by configurations that would have resulted in fine sediment deposition even in the absence of barriers (e.g., limited slopes and high particle contributions from hillslopes).
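One common, model-agnostic way to obtain such a ranking is permutation importance: the values of one variable are shuffled across reaches and the resulting drop in accuracy is recorded. The sketch below illustrates the principle only. Whether it matches the exact procedures behind Figs. S1 and S2 is an assumption, and the variable names, the synthetic data, and the hand-written `predict` rule (a stand-in for a trained model) are all hypothetical.

```python
import random


def accuracy(predict, X, y):
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when a single column is shuffled across rows."""
    shuffler = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            shuffler.shuffle(shuffled)
            # Rebuild the table with only this column permuted.
            X_perm = [row[:col] + [value] + row[col + 1:]
                      for row, value in zip(X, shuffled)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical variables and synthetic reaches (all values are illustrative).
names = ["substrate_size", "reach_width", "barrier_present"]
generator = random.Random(1)
X = [[generator.random(), generator.random(), float(generator.random() < 0.2)]
     for _ in range(300)]
y = [int(row[0] < 0.5 or row[1] > 0.9) for row in X]


def predict(row):
    # Stand-in for a trained model: it reproduces the rule used to build y
    # and never looks at the barrier column.
    return int(row[0] < 0.5 or row[1] > 0.9)


scores = permutation_importance(predict, X, y)
ranking = sorted(zip(names, scores), key=lambda item: item[1], reverse=True)
```

A variable that the model never uses, such as the barrier indicator in this toy rule, obtains an importance of exactly zero. Permutation importance can be misleading when predictors are strongly correlated, so rankings of this kind are best compared across several methods.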
The importance of bed substrate granularity, previously noted by Sutherland et al. (2010), may reflect at least two different processes (Shrivastava et al. 2020). The suspended particle size, compared to the size of particles deposited on the riverbed, may increase or limit their accumulation on the riverbed because of pore bridging (Brunke 1999) and associated decreased bed permeability, further consolidating the bed. It will therefore contribute to the persistence or absence of fine sediment deposition (Fetzer et al. 2017). The underlying bed substrate may also reflect the underlying material and therefore its erodibility, as the texture is usually correlated with soil erodibility in the literature (Torri et al. 1997). This hypothesis would be consistent with the model variable ranking results, indicating that land-use indicators are meaningful variables to explain the deposit intensity variability, particularly the proportion of cropland and forest cover in the reach subcatchments. Moreover, it was observed in the Cisse catchment that most upstream reaches, displaying high sediment deposition intensity, were connected to agricultural fields (Fig. 3a), which produce higher suspended-sediment concentrations than forests or grasslands during erosive rainfall events (e.g., Cerdan et al. 2010). They may therefore supply high fine particle loads to river channels. The combination of erodible fields and a high field-to-stream connectivity (Fryirs 2017) resulted in medium to high deposition intensity in 70% of the reach lengths, as also found by Sutherland et al. (2010), Naden et al. (2016), and Konrad and Gellis (2018). This would also suggest that in this lowland, mostly agricultural catchment, transport capacity-limited conditions controlled most fine sediment deposition dynamics, as supported by the limited slopes in this catchment (mean: 3%).
The flow conditions are representative of the local stream morphology and integrate various variables that may be particularly important in controlling fine sediment deposition (McKenzie et al. 2022), particularly successive deposition–erosion cycles. Similarly, and in agreement with the correlation matrix, river reach width and depth, which directly influence water velocity and therefore sediment settling, were important variables, as previously suggested in the literature (Datry et al. 2015). This importance may be explained by an increasing channel reach width resulting in lower fluid velocity and, consequently, increased sediment settling. This result is consistent with those of Stewardson et al. (2016). Similarly, a lower Strahler order corresponded to channels with a direct connection between river reaches and sediment sources, which may indicate why this variable correlated with sediment deposition intensity, as also reported by Relya et al. (2012). This result would support the first evidence provided by the statistical analysis (Sect. 3.1). However, the high dispersion in the relationships between deposition intensity and channel geometry indicated that the relationship is probably complex and highly nonlinear (maximum correlation coefficient was 0.4; Table S1). The model performance indicated that it successfully captured this nonlinear behavior. This result may explain the alternation of maximal deposits and no sediments on successive river reaches, as observed in the Loisance catchment, as well as the correlation between sediment deposition and the local slopes, as previously suggested by Naden et al. (2016), both of which suggest the significance of local in-stream processes in this catchment.
Consequently, the results from this study indicate the importance of considering sediment sources as well as in-stream processes when analyzing fine sediment deposition. The Beuvron catchment provided an example of the linkage between hillslopes and river processes, with various deposition intensities measured in first-order reaches and a progressive increase in sediment deposition in higher reaches. This result suggested that first- and second-order streams were more susceptible to fine sediment deposition, probably because of increased connectivity between hillslope sources and streams and their lower transport capacity (Relya et al. 2012). The model performance was slightly lower on the first Strahler-order reaches compared to those with a higher Strahler order: modeled values matched measurements 73% and 79% of the time, respectively. This may be explained by the relatively general descriptions of hillslope processes, as the current study only considered land use and the area upstream of river reaches (Table 2). This result suggested the strong contribution of sediment sources upstream with possible transport capacity-limited conditions, as supported by the mean slope (1%), while the downstream reach deposition intensity may be controlled by fine sediment availability in this lowland forested catchment.
This study provides the first general illustration of the potential of this methodology, combining hillslope and river process representations. It may therefore be a relevant contribution to the study of sediment (dis)connectivity in catchments (Fryirs 2013), providing an additional tool to study catchment sediment sinks. Despite providing an extensive dataset, some processes or specific combinations of river reach characteristics were overlooked in the present study and should be further refined. Indeed, it should be noted that some residual variability remained regardless of the number of measurements used to train the model (Sect. 3.2) and was not captured by the variables included in the dataset. Further developments may rely on the use of numerical modeling to improve the representation of hillslope and river reach processes in such methodology (Baldan et al. 2021). Moreover, due to time constraints in field monitoring, only one measurement campaign was used in this study, while the progressive deposition and flushing effects of flood events should influence fine sediment erosion and deposition patterns. Future studies may therefore use multiple measurement campaigns and evaluate the model’s ability to take into account temporally varying variables. Indeed, although the spatial variability of clogging may dominate over its temporal variability (Datry et al. 2015), the latter should not be neglected, particularly considering the importance of variables related to flow, such as stream power, as indicated in the literature (Naden et al. 2016). Such variables may exhibit high temporal variability from the flood event to the annual (i.e., low-flow and high-flow periods) scale, with possible consequences on the sediment erosion–deposition cycle and therefore riverbed clogging (Genereux et al. 2008).
4.2 First attempt at catchment extrapolation
The model showed good predictive performance when the river reaches of the three sampled catchments were included in the training and testing sets. This result indicated that the training sample was representative of the entire dataset, ensuring generalization capacities (Barbiero et al. 2020; Kernbach and Staartjes 2022) when data were available for each catchment in the training set. We then tested the model extrapolation ability across a catchment with no samples included in the initial training set. To this end, training configurations excluded one or two catchments from the training phase. Model performances were then tested on the same test set, containing samples from the three catchments (Table 5). As expected for this data-driven model, removing one of the catchments led to a drastic decrease in model accuracy. For instance, removing observations from the Cisse catchment, which had the lowest number of observations, led to the highest decrease in performance (accuracy = 24%). However, this decrease was similar for each of the three catchments, suggesting that the unbalanced distribution of observations among the three catchments did not significantly influence model performance. Similarly, training the model on the Beuvron catchment only, which had the highest number of measured river reaches but the smallest agricultural area proportion, resulted in poor model performance (accuracy = 45%). When considering only the Cisse catchment, the decrease in performance was substantial but remained smaller (accuracy = 58%). As the Cisse catchment was mainly covered with cropland areas, this result underlined the importance of considering land use to predict fine sediment deposition intensity at the catchment scale, as already suggested by the analysis of variable importance and literature (e.g., Konrad and Gellis 2018; Davis et al. 2021).
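The experimental design described above, in which one or two catchments are excluded from training while every model is scored on the same test set covering all catchments, can be sketched as follows. The toy data, the catchment labels "A", "B", and "C", and the 1-nearest-neighbour stand-in for the random forest are assumptions made only for illustration.

```python
from itertools import combinations


def fit_1nn(X, y):
    """Stand-in classifier (1-nearest neighbour); NOT a random forest."""
    def predict(row):
        best = min(range(len(X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], row)))
        return y[best]
    return predict


def accuracy(predict, rows):
    return sum(predict(r["x"]) == r["y"] for r in rows) / len(rows)


def exclusion_experiment(train, test, fit):
    """Train on every non-empty subset of catchments and score each model
    on the same test set, which contains reaches from all catchments."""
    catchments = sorted({r["catchment"] for r in train})
    results = {}
    for size in range(1, len(catchments) + 1):
        for kept in combinations(catchments, size):
            subset = [r for r in train if r["catchment"] in kept]
            model = fit([r["x"] for r in subset], [r["y"] for r in subset])
            results[kept] = accuracy(model, test)
    return results


def make(catchment, centre, label, n):
    """Toy reaches: one feature clustered around `centre`, a single class."""
    return [{"catchment": catchment, "x": [centre + 0.1 * i], "y": label}
            for i in range(n)]


# Hypothetical catchments occupying distinct regions of the feature space.
train = make("A", 0.0, 0, 6) + make("B", 10.0, 1, 3) + make("C", 20.0, 2, 3)
test = make("A", 0.05, 0, 4) + make("B", 10.05, 1, 2) + make("C", 20.05, 2, 2)

results = exclusion_experiment(train, test, fit_1nn)
```

In this deliberately extreme toy case, a model trained without a catchment misclassifies all of its test reaches, because the conditions found there are absent from the training data. Real catchments share part of their characteristics, so the loss in accuracy is expected to be partial rather than total.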
Overall, these results illustrated that the incorporation of a limited number of observations from different catchments may lead to significant improvement in the robustness of model predictions. As it may not always be feasible to monitor the entire catchment river network, the proposed model may therefore be helpful to upscale an analysis conducted on a selection of well-monitored river reaches to the entire catchment scale with satisfactory accuracy. Preliminary work based on the framework proposed in this study would help design such measurement campaigns by selecting the most influential variables and representative river reach characteristics that should be monitored. It would help target the most relevant measurements that should be performed and reduce extensive monitoring efforts, such as the one performed in this study. Additional time may therefore be dedicated to specific measurements (e.g., quantitative clogging assessment; Descloux et al. 2010) or to multiple measurements over time of some representative river reaches to study deposit temporal dynamics in relationship with, e.g., variations in bed erodibility and sediment sources, which are known to exhibit significant temporal variability (Droppo et al. 2001; Grabowski et al. 2012; Haddad et al. 2022). Future work should benefit from such measurements and should study the complementarity between the model and existing national databases to prioritize mitigation strategies, to be designed in collaboration with stakeholders, considering both sediment supply from hillslopes and in-stream sediment dynamics (Wharton et al. 2017).
5 Conclusions
An extensive database, including fine sediment deposition measurements obtained from three temperate catchments, was acquired and analyzed. A statistical analysis suggested that fine sediment deposition is a nonlinear and multifactorial process. Measurements collected during this study were used to develop and evaluate a machine-learning model using a random forest approach. The results demonstrated that this modeling approach, which has very rarely been applied to the study of sediment dynamics, may be relevant to evaluate fine sediment deposition at the catchment scale.
Variable ranking was proposed to quantify the relative importance of the different factors affecting fine sediment deposition. In line with the results from the statistical analysis, the importance of the river channel characteristics as well as that of sediment source proximity, particularly in agricultural areas, was suggested. It indicated the importance of considering both hillslope and in-channel processes when analyzing fine sediment deposition in rivers.
Future work should include other monitoring campaigns conducted in catchments located in more contrasting environments to evaluate the model representativeness. Our central hypothesis was that fine sediment deposition can be studied using a single measurement campaign. However, we recognize that temporal variations in hillslope and river reach processes, including barrier maneuvers, may influence sediment deposition. Measurements performed over various periods (e.g., low-flow versus high-flow periods and the effects of flood events) should therefore be considered. Improving the representation of hillslope processes in this modeling approach may improve our understanding of fine sediment transfers along the hillslope-to-river continuum.
The proposed approach may help improve our understanding and our capacity to predict fine sediment deposition in streams, which is key to understanding sediment dynamics at the catchment scale, particularly sediment (dis)connectivity. It may also be used for prioritizing the implementation of mitigation measures and/or the design of future monitoring campaigns, which should ultimately help decision-makers improve water quality to meet the objectives of environmental legislation, such as the Water Framework Directive in the European Union.
References
Altmann A, Toloşi L, Sander O, Lengauer T (2010) Permutation importance: a corrected feature importance measure. Bioinform 26(10):1340–1347
Archambaud G, Giordano L, Dumont B (2005) Description du substrat minéral et du colmatage. Technical Note. Cemagref Aix-En-Provence, UR Hydrobiologie 7
Arnold JG, Srinivasan R, Muttiah RS, Williams JR (1998) Large area hydrologic modeling and assessment—part 1: model development. J Am Water Resour Assoc 34(1):73–89
Baldan D, Piniewski M, Funk A, Gumpinger C, Födl P, Hëfer S, Hauer C, Hein T (2020) A multi-scale, integrative modelling framework for setting conservation priorities at the catchment scale for the Freshwater Pearl Mussel Margaritifera margaritifera. Sci Total Environ 718:137369
Baldan D, Mehdi B, Feldbacher E, Piniewski M, Hauer C, Hein T (2021) Assessing multi-scale effects of natural water retention measures on in-stream fine bed material deposits with a modeling cascade. J Hydrol 594:125702
Barbiero P, Squillero G, Tonda A (2020) Modeling generalization in machine learning: a methodological and computational study. arXiv preprint arXiv:2006.15680h
Beven K, Freer J (2001) Equifinality, data assimilation, and uncertainty estimation in mechanistic modelling of complex environmental systems using the GLUE methodology. J Hydrol 249(1–4):11–29
Bieger K, Arnold JG, Rathjens H, White MJ, Bosch DD, Allen PM, Volk M, Srinivasan R (2017) Introduction to SWAT+, a completely restructured version of the Soil and Water Assessment Tool. J Am Water Resour Assoc 53(1):115–130
Bilotta GS, Brazier RE (2008) Understanding the influence of suspended solids on water quality and aquatic biota. Water Res 42:2849–2861
Blaschke P, Steiner KH, Schmalfuss R, Gutknecht D, Sengschmitt D (2003) Clogging processes in hyporheic interstices of an impounded river, the Danube at Vienna, Austria. Int Rev Hydrobiol 88(3–4):397–413
Boano F, Harvey JW, Marion A, Packman AI, Revelli R, Ridolfi L, Wörman A (2014) Hyporheic flow and transport processes: mechanisms, models and biogeochemical implications. Rev Geophys 52:603–679
Breiman L (2001) Random forests. Mach Learn 45(1):5–32
Brunke M (1999) Colmation and depth filtration within streambeds: retention of particles in hyporheic interstices. Int Rev Hydrobiol 84(2):99–117
Cerdan O, Govers G, Le Bissonnais Y, Van Oost K, Poesen J, Saby N, Gobin A, Vacca A, Quinton J, Auerswald K, Klik A, Kwaad FJPM, Raclot D, Ionita I, Rejman J, Rousseva S, Muxart T, Roxo MJ, Dostal T (2010) Rates and spatial variations of soil erosion in Europe: a study based on erosion plot data. Geomorphol 122:167–177
Clapcott JE, Young RG, Harding JS, Matthaei CD, Quinn JM, Death RG (2011) Sediment assessment methods: protocols and guidelines for assessing the effects of deposited fine sediment on in-stream values. Cawthron Institute, Nelson, New Zealand. 108 pp. Available from https://www.envirolink.govt.nz/assets/R4-1-Sediment-Assessment-Methods-Protocol-and-guidelines.pdf. Last Accessed 10 Jan 2023
Collins AL, Walling DE (2007) The storage and provenance of fine sediment on the channel bed of two contrasting lowland permeable catchments, UK. River Res Applic 23:429–450
Conaway CH, Draut AE, Echols KR, Storlazzi CD, Ritchie A (2013) Episodic suspended sediment transport and elevated polycyclic aromatic hydrocarbon concentrations in a small, mountainous river in coastal California. River Res Applic 29(7):919–932
Conroy E, Turner JN, Rymszewicz A, Bruen M, O’Sullivan JJ, Kelly-Quinn M (2016) An evaluation of visual and measurement-based methods for estimating deposited fine sediment. Int J Sediment Res 31:368–375
Darst BF, Malecki KC, Engelman CD (2018) Using recursive feature elimination in random forest to account for correlated variables in high dimensional data. BMC Genet 19(1):65. https://doi.org/10.1186/s12863-018-0633-8
Datry T, Lamouroux N, Thivin G, Descloux S, Baudoin JM (2015) Estimation of sediment hydraulic conductivity in river reaches and its potential use to evaluate streambed clogging. River Res Applic 31:880–891
Davis NG, Hodson R, Matthaei CD (2021) Long-term variability in deposited fine sediment and macroinvertebrate communities across different land-use intensities in a regional set of New Zealand rivers. New Zealand J Mar Freshw Res. https://doi.org/10.1080/00288330.2021.1884097
Descloux S, Datry T, Philippe M, Marmonier P (2010) Comparison of different techniques to assess surface and subsurface streambed colmation with fine sediments. Int Rev Hydrobiol 95(6):520–540
Droppo IG, Ongley ED (1994) Flocculation of suspended sediment in rivers of southeastern Canada. Water Res 28(8):1799–1809
Droppo IG, Leppard GG, Flannigan DT, Liss SN (1997) The freshwater floc: a functional relationship of water and organic and inorganic floc constituents affecting suspended sediment properties. Water Air Soil Pollut 99:43–53
Droppo IG (2001) Rethinking what constitutes suspended sediment. Hydrol Process 15(9):1551–1564
Droppo IG, Lau YL, Mitchell C (2001) The effect of depositional history on contaminated bed sediment stability. Sci Total Environ 226:7–13
Droppo IG (2004) Structural controls on floc strength and transport. Can J Civ Eng 31:569–578
Droppo IG, Nackaerts K, Walling DE, Williams N (2005) Can flocs and water stable soil aggregates be differentiated within fluvial systems? Catena 60(1):1–18
Droppo IG, D’Andrea A, Krishnappan BG, Jaskot C, Trapp B, Basuvaraj LSN (2015) Fine-sediment dynamics: towards an improved understanding of sediment erosion and transport. J Soils Sediments 15:467–479
Dupeux G, Favreau Y (2017) Wide scale diagnosis of clogging of rivers. La Houille Blanche 6:25–26. https://doi.org/10.1051/lhb/2017053
Dyer KR (1989) Sediment processes in estuaries: future research requirements. J Geophys Res 94:14327–14339
Fetzer J, Holzner M, Plötze M, Furrer G (2017) Clogging of an Alpine streambed by silt-sized particles—insights from laboratory and field experiments. Water Res 126:60–69
Fryirs K, Brierley GJ (2001) Variability in sediment delivery storage along river courses in Bega catchment, NSW, Australia: implications for geomorphic river recovery. Geomorphol 38:237–265
Fryirs K (2013) (Dis)Connectivity in catchment sediment cascades: a fresh look at the sediment delivery problem. Earth Surf Proc Landforms 38(1):30–46
Fryirs KA (2017) River sensitivity: a lost foundation concept in fluvial geomorphology. Earth Surf Proc Landforms 42(1):55–70
Gayraud S, Philippe M (2003) Influence of bed sediment features on the interstitial habitat available for macroinvertebrate in 15 French streams. Int Rev Hydrobiol 88:77–93
Genereux DP, Leahy S, Mitasova H, Kennedy CD, Corbett DR (2008) Spatial and temporal variability of streambed hydraulic conductivity in West Bear Creek, North Carolina, USA. J Hydrol 358:332–353
Gob F, Bilodeau C, Thommeret N, Belliard J, Albert MB, Tamisier V, Baudoin JM, Kreutzenberger K (2014) Un outil de caractérisation hydromorphologique des cours d’eau pour l’application de la DCE en France (CARHYCE). Geomorphologie 20(1):57–72
Grabowski RC, Droppo IG, Wharton G (2010) Estimation of critical shear stress from cohesive strength meter-derived erosion thresholds. Limnol Oceanogr Met 8:678–685
Grabowski RC, Wharton G, Davies GR, Droppo IG (2012) Spatial and temporal variations in the erosion threshold of fine riverbed sediments. J Soils Sediments 12:1174–1188
Grangeon T, Droppo IG, Legout C, Esteves M (2014) From soil aggregates to riverine flocs: a laboratory experiment assessing the respective effects of soil type and flow shear stress on particles characteristics. Hydrol Process 28(13):4141–4155
Haddad H, Jodeau M, Legout C, Antoine G, Droppo IG (2022) Spatial variability of the erodibility of fine sediments deposited in two alpine gravel-bed rivers: The Isère and Galabre. Catena 212:106084
Hamshaw SD, Dewoolkar MM, Schroth AW, Wemple BC, Rizzo DM (2018) A new machine-learning approach for classifying hysteresis in suspended-sediment discharge relationships using high-frequency monitoring data. Water Resour Res 54(6):4040–4058
Hastie T, Tibshirani R, Friedman J (2009) Random forests. In Elem Stat Learn 587–604. Springer, New York, NY
Hooker G (2007) Generalized functional Anova diagnostics for high-dimensional functions of dependent variables. J Comput Graph Stat 38(4):66
Hooker G, Mentch L (2021) Bridging Breiman’s Brook: from algorithmic modeling to statistical learning. Obs Stud 7(1):107–125
Hooker G, Mentch L, Zhou S (2021) Unrestricted permutation forces extrapolation: variable importance requires at least one more model, or there is no free variable importance. Stat Comput 31(6):1–16
Hou Z, Scheibe TD, Murray CJ, Perkins WA, Arntzen EV, Ren H, Mackley RD, Richmond MC (2019) Identification and mapping of riverbed sediment facies in the Columbia River through integration of field observations and numerical simulations. Hydrol Process 33(8):1245–1259
Hutson D (2014) Clogging of fine sediment within gravel substrates: macro-analysis and momentum-impulse model. Thesis and Dissertations – Civil Engineering, University of Kentucky
Iooss B, Chabridon V, Thouvenot V (2022) Variance-based importance measures for machine learning model interpretability. https://hal.archives-ouvertes.fr/hal-03741384/document
Kefford BJ, Zalizniak L, Dunlop JE, Nugegoda D, Choy SC (2010) How are macroinvertebrates of slow flowing lotic systems directly affected by suspended and deposited sediments? Environ Pollut 158:543–550
Kemp P, Sear D, Collins A, Naden P, Jones I (2011) The impacts of fine sediment on riverine fish. Hydrol Process 25:1800–1821
Kernbach JM, Staartjes VE (2022) Foundations of machine learning-based clinical prediction modeling: part II—generalization and overfitting. Machine Learning in Clinical Neuroscience 134:15–21
Khullar S, Singh N (2021) Machine learning techniques in river water quality modelling: a research travelogue. Water Supply 21(1):1–13
Konrad C, Gellis A (2018) Factors influencing fine sediment on stream beds in the Midwestern United States. J Environ Qual 47:1214–1222
Krstajic D, Buturovic LJ, Leahy DE, Thomas S (2014) Cross-validation pitfalls when selecting and assessing regression and classification models. J Cheminformatics 6(10). https://doi.org/10.1186/1758-2946-6-10
Kuhn M (2020) Caret: classification and regression training. R package version 6.0-86. https://cran.r-project.org/package=caret
Kursa MB, Rudnicki WR (2010) Feature selection with the Boruta package. J Stat Softw 36:1–13
Lambert CP, Walling DE (1988) Measurements of channel storage of suspended sediment in a gravel-bed river. Catena 15:65–80
Landemaine V, Gay A, Cerdan O, Salvador-Blanes S, Rodrigues S (2015) Morphological evolution of a rural headwater stream after channelization. Geomorphol 230:125–137
Lau YL, Droppo IG, Krishnappan BG (2001) Sequential erosion/deposition experiments— demonstrating the effects of depositional history on sediment erosion. Water Res 35(11):2767–2773
Le Bihan M (2020) Methodologie d’évaluation de l’hydromorphologie des cours d’eaux en tête de bassin versant à l’échelle linéaire. Guide de l’Office Français de la Biodiversité, Direction Interrégionale Bretagne 36. http://atbvb.fr/sites/default/files/media/20200310_note_technique_tbv_v2.3_2.pdf. Last Accessed 13 Mar 2023
Legout C, Droppo IG, Coutaz J, Bel C, Jodeau M (2017) Assessment of erosion and settling properties of fine sediments stored in cobble bed rivers: the Arc and Isère alpine rivers before and after reservoir flushing. Earth Surf Proc Landforms 43(6):1295–1309
Liaw A, Wiener M (2002) Classification and regression by random forest. R News 2(3):18–22
Magbanua FS, Townsend CR, Hageman J, Piggott JJ, Matthaei CD (2016) Individual and combined effects of fine sediment and glyphosate herbicide on invertebrate drift and insect emergence: a stream mesocosm experiment. Freshw Sci 35(1):139–151
Maggi F (2005) Flocculation dynamics of cohesive sediment. PhD Thesis, Delft University of Technology, 154 pp
Marmonier P, Delettre Y, Lefebvre S, Guyon J, Boulton AJ (2004) A simple technique using wooden stakes to estimate vertical patterns of interstitial oxygenation in the bed of rivers. Arch Hydrobiol 160:133–143
Martilla H, Kløve B (2014) Storage, properties and seasonal variations in fine-grained bed sediment within the main channel and headwaters of the River Sanginjoki, Finland. Hydrol Process 28:4756–4765
McKenzie M, England J, Foster IDL, Wilkes MA (2022) Abiotic predictors of fine sediment accumulation in lowland rivers. Int J Sediment Res 37:128–137
Milan DJ, Petts GE, Sambrook H (2000) Regional variations in the sediment structure of trout stream in southern England: benchmark data for siltation assessment and restoration. Aquat Conserv Mar Freshw Ecosyst 10:407–420
Misset C, Recking A, Legout C, Viana-Bandiera B, Poirel A (2021) Assessment of fine sediment river bed stocks in seven Alpine catchments. Catena 196:104916
Molnar C, Casalicchio G, Bischl B (2020) Interpretable machine learning – a brief history, state-of-the-art and challenges. In Koprinska I. et al. (eds) ECML PKDD 2020 Workshops. ECML PKDD 2020. Commun Comp Inform Sci 1323. Springer, Cham. https://doi.org/10.1007/978-3-030-65965-3_28
Naden PS, Murphy JF, Old GH, Newman J, Scarlett P, Harman CP, Duerdoth CP, Hawczak A, Pretty JL, Arnold A, Laizé C, Hornby DD, Collins AL, Sear DA, Jones JI (2016) Understanding the controls on deposited fine sediment in the streams of agricultural catchments. Sci Total Environ 547:366–381
Navratil O, Legout C, Gateuille D, Esteves M, Liebault F (2010) Assessment of intermediate fine sediment storage in a braided river reach (southern French Prealps). Hydrol Process 24:1318–1332
Nembrini S, König IR, Wright MN (2018) The revival of the Gini importance? Bioinformatics 34(21):3711–3718
Nogaro G, Mermillod-Blondin F, Montuelle B, Boisson JC, Bedell JP, Ohannessian A, Volat B, Gibert J (2007) Influence of a stormwater sediment deposit on microbial and biogeochemical processes in infiltration porous media. Sci Total Environ 377:334–348
Nogaro G, Datry T, Mermillod-Blondin F, Descloux S, Montuelle B (2010) Influence of streambed sediment clogging on microbial processes in the hyporheic zone. Freshw Biol 55:1288–1302
Owens PN, Walling DE, Leeks GJL (1999) Deposition and storage of fine-grained sediment within the main channel of the River Tweed, Scotland. Earth Surf Process Landforms 24(12):1061–1076
Owens PN, Batalla RJ, Collins AJ, Gomez B, Hicks DM, Horowitz AJ, Kondolf GM, Marden M, Page MJ, Peacock DH, Petticrew EL, Salomons W, Trustrum NA (2005) Fine-grained sediment in river systems: environmental significance and management issues. River Res Applic 21:693–717
Pholkern K, Srisuk K, Gridhel T, Soares M, Schäfer S, Archwichai L, Saraphirom P, Pavelic P, Wirojanagud W (2015) Riverbed clogging experiments at potential river bank filtration sites along the Ping River, Chiang Mai, Thailand. Environ Earth Sci 73:7699–7709
Platts WS, Megahan WF, Minshall GW (1983) Methods for evaluating stream, riparian, and biotic conditions. U.S. Department of Agriculture, Forest Service, Intermountain Forest and Range Experiment Station
Poeppl RE, Fryirs KA, Tunnicliffe J, Brierley GJ (2020) Managing sediment (dis)connectivity in fluvial systems. Sci Total Environ 736:139627
Probst P, Wright MN, Boulesteix AL (2019) Hyperparameters and tuning strategies for random forest. Wiley Interdiscip Rev Data Min Knowl Discov 9(3):e1301
R Core Team (2020) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.r-project.org/
Raven PJ, Holmes NTH, Dawson FH, Fox PJA, Everard M, Fozzard IR, Rouen KJ (2003) River habitat survey in Britain and Ireland: field survey guidance manual, version 2003. Environment Agency, Bristol, UK
Rehg KJ, Packman AI, Ren J (2005) Effects of suspended sediment characteristics and bed sediment transport on streambed clogging. Hydrol Process 19:413–427
Relyea CD, Minshall GW, Danehy RJ (2012) Development and validation of an aquatic fine sediment biotic index. Environ Manag 49:242–252
Ren H, Song X, Fang Y, Hou ZJ, Scheibe TD (2021) Machine learning analysis of hydrologic exchange flows and transit time distributions in a large regulated river. Front Artif Intell 4:648071
Rokach L (2010) Ensemble-based classifiers. Artif Intell Rev 33(1):1–39
Schälchli U (1992) The clogging of coarse gravel river beds by fine sediment. Hydrobiologia 235–236:189–197. https://doi.org/10.1007/BF00026211
Scheder C, Lerchegger B, Flödl P, Csar D, Gumpinger C, Hauer C (2015) River bed stability versus clogged interstitial: depth-dependent accumulation of substances in freshwater pearl mussel (Margaritifera margaritifera L.) habitats in Austrian streams as a function of hydromorphological parameters. Limnologica 50:29–39
Sennatt KM, Salant NL, Renshaw CE, Magilligan FJ (2006) Assessment of methods for measuring embeddedness: application to sedimentation in flow regulated streams. J Am Water Resour Assoc 42:1671–1682
Seydell I, Ibisch R, Zanke U (2009) Intrusion of suspended sediments into gravel riverbeds: influence of bed topography studied by means of field and laboratory experiments. Adv Limnol 61:67–85
Shrivastava S, Stewardson MJ, Arora M (2020) Distribution of clay-sized sediments in streambeds and influence of fine sediment clogging on hyporheic exchange. Hydrol Process 34:5674–5685
Skurichina M, Duin RP (2002) Bagging, boosting and the random subspace method for linear classifiers. Pattern Analy Applic 5(2):121–135
Spencer KL, Wheatland JAT, Bushby AJ, Carr SJ, Droppo IG, Manning AJ (2021) A structure-function based approach to floc hierarchy and evidence for the non-fractal nature of natural sediment flocs. Sci Rep 11(1):14012
Stewardson MJ, Datry T, Lamouroux N, Perlla H, Thommeret N, Valette L, Grant SB (2016) Variations in reach-scale hydraulic conductivity of streambeds. Geomorphol 259:70–80
Strobl C, Boulesteix AL, Zeileis A, Hothorn T (2007) Bias in random forest variable importance measures: illustrations, sources and a solution. BMC Bioinformatics 8(1):1–21
Sutherland AB, Culp JM, Benoy GA (2010) Characterizing deposited sediment for stream habitat assessment. Limnol Oceanogr Methods 8:30–44
Szymczak S, Holzinger E, Dasgupta A, Malley JD, Molloy AM, Mills JL, Bailey-Wilson JE (2016) r2VIM: A new variable selection method for random forests in genome-wide association studies. BioData Min 9(1):1–15
Taşar B, Kaya YZ, Varçin H, Üneş F, Demirci M (2017) Forecasting of suspended sediment in rivers using artificial neural networks approach. Int J Adv Eng Res Sci 4(12):237333
Torri D, Poesen J, Borselli L (1997) Predictability and uncertainty of the soil erodibility factor using a global dataset. Catena 31(1–2):1–22
Van Rijn LC (1993) Principles of sediment transport in rivers, estuaries and coastal seas. Aqua Publications, Amsterdam
Vericat D, Batalla RJ (2006) Sediment transport in a large impounded river: the lower Ebro, NE Iberian Peninsula. Geomorphol 79(1–2):72–92
Wagenhoff A, Townsend CR, Philipps N, Matthaei CD (2011) Subsidy-stress and multiple-stressor effects along gradients of deposited fine sediment and dissolved nutrients in a regional set of streams and rivers. Freshw Biol 56:1916–1936
Walling DE, Moorehead PW (1989) The particle size characteristics of fluvial suspended sediment: an overview. Hydrobiol 176–177:125–149
Walling DE, Owens PN, Leeks GJL (1999) Rates of contemporary overbank sedimentation and sediment storage on the floodplains of the main channel systems of the Yorkshire Ouse and River Tweed, UK. Hydrol Process 13:993–1009
Walling DE, Owens PN, Waterfall BD, Leeks GJL, Wass PD (2000) The particle size characteristics of fluvial suspended sediment in the Humber and Tweed catchments, UK. Sci Total Environ 251–252:205–222
Waters TF (1995) Sediment in streams: sources, biological effects and control. Am Fish Soc, Bethesda, pp 1–251
Wendling V, Legout C, Gratiot N, Michallet H, Grangeon T (2016) Dynamics of soil aggregate size in turbulent flow: respective effect of soil type and suspended concentration. Catena 141:66–72
Wharton G, Mohajeri SH, Righetti M (2017) The pernicious problem of streambed colmation: a multi-disciplinary reflection on the mechanisms, causes, impacts and management challenges. WIREs Water 4:e1231
Wood PJ, Armitage PD (1997) Biological effects of fine sediment in the lotic environment. Environ Manag 21(2):203–217
Wright MN, Ziegler A (2017) ranger: A fast implementation of random forests for high dimensional data in C++ and R. J Stat Softw 77:1–17
Wright MN, Dankowski T, Ziegler A (2017) Unbiased split variable selection for random survival forests using maximally selected rank statistics. Stat Med 36(8):1272–1284
Zweig LD, Rabeni CF (2001) Biomonitoring for deposited sediment using benthic invertebrates: a test on 4 Missouri streams. J N Am Benthol Soc 20:643–657
Acknowledgements
The editor and reviewers provided very constructive and detailed comments that helped to improve the manuscript quality. This study was funded by the Loire-Brittany Water Agency, in the framework of the METEOR project, under the supervision of Xavier Bourrain, Jean-Noël Gautier, and Anne Colmar. TG would like to thank Sébastien Gourdier (BRGM) for providing internal funding that helped finalize and revise the manuscript.
Author information
Contributions
TG, CG, RV, and OC conceptualized the study. YF and GD performed the field measurements and created and harmonized the database. TG and RV performed the GIS calculations and data analysis and preprocessed the database for modeling. CG developed the random forest model, with support from JR. TG and CG analyzed the model results and wrote the initial draft. All co-authors commented on the initial draft and the revised manuscript. OC, RV, OE, and SSB secured funding and were involved in project management.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Responsible editor: Geraldene Wharton
Highlights
• A total of 11,302 river reaches were assessed for their morphology and fine sediment deposition intensity across three contrasting temperate catchments.
• The dataset was used to develop and evaluate a random forest model.
• The model predicted fine sediment deposition intensity with an accuracy of 81%.
• In-stream variables as well as land use significantly influenced fine sediment deposition.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Grangeon, T., Gracianne, C., Favreau, Y. et al. Catchment-scale variability and driving factors of fine sediment deposition: insights from a coupled experimental and machine-learning-based modeling study. J Soils Sediments 23, 3620–3637 (2023). https://doi.org/10.1007/s11368-023-03496-w
DOI: https://doi.org/10.1007/s11368-023-03496-w