1 Introduction

Hillslope erosion can include a significant proportion of fine sediment particles (< 2 mm diameter; Vericat and Batalla 2006). In rivers, fine particles can result in multiple environmental issues, such as increased river turbidity and streambed clogging (Brunke 1999), which are detrimental to the aquatic environment (Owens et al. 2005) and preclude the achievement of the Water Framework Directive objectives (Bilotta and Brazier 2008). Fine sediment deposition may reduce riverbed infiltration capacity, which in turn affects the hyporheic zone dynamics by modifying the water flow (Boano et al. 2014). Moreover, particle transfer can contribute to the transport of contaminants, such as heavy metals and polycyclic aromatic hydrocarbons, as they may have a high affinity for fine particles (Conaway et al. 2013). Understanding fine sediment deposition as part of river sediment dynamics (Waters et al. 1995) is therefore of great importance, as it may also affect sediment-associated chemical (Droppo et al. 2015), biogeochemical (Nogaro et al. 2007) and microbial (Nogaro et al. 2010) processes and dynamics, as well as aquatic life (Wood and Armitage 1997; Scheder et al. 2015), including fish (Kemp et al. 2011) and invertebrates (Kefford et al. 2010; Wagenhoff et al. 2011; Magbanua et al. 2016).

Fine sediment storage and erosion in river channels and floodplains have been found to be a significant part of the catchment sediment budget in various environments, including lowland (Owens et al. 1999; Walling et al. 1999; Collins and Walling 2007; Marttila and Kløve 2014) and mountainous catchments (Navratil et al. 2010; Misset et al. 2021). This underlines the need to study this process further to improve our understanding of sediment (dis)connectivity in catchments (Fryirs 2013). The deposition and accumulation of fine particles within river reaches may turn riverbeds into a significant supply of fine sediment to downstream environments (Fryirs and Brierley 2001), which may be mobilized during flood events (Droppo 2004), making sediment deposition an important process for understanding fine sediment dynamics at the catchment scale. However, several issues limit our ability to understand and model fine particle deposition. Multiple techniques have been proposed in the literature to estimate sediment deposition and subsequent riverbed clogging, such as visual estimations (Owens et al. 1999; Zweig and Rabeni 2001), embeddedness (Platts et al. 1983), the resuspension method (Lambert and Walling 1988), sediment shear strength measures (Grabowski et al. 2010; Legout et al. 2017), sediment coring (Milan et al. 2000), wooden stakes (Marmonier et al. 2004), dynamic penetrometry (Landemaine et al. 2015), sediment trap measurements (Seydell et al. 2009), the shuffle index (Clapcott et al. 2011), or infiltration capacity measurements (Datry et al. 2015). Each of these techniques has specific advantages (Descloux et al. 2010), but their respective results cannot be directly compared, as they target different proxies of sediment deposition (e.g., resistance to shear stress, oxygenated depth, and infiltration capacity). Measurements are also labor- and time-consuming, limiting large-scale evaluations, particularly considering that fine sediment deposition can exhibit high spatial variability at the catchment scale (Haddad et al. 2022). Such requirements and variations in measurement protocols may explain the lack of standardized in-stream monitoring programs (Wharton et al. 2017), limiting our ability to describe this complex process.

Once particles enter river channels, suspended particle concentrations and flow shear stress affect particle characteristics (Grangeon et al. 2014; Wendling et al. 2016), particularly particle size (Dyer 1989; Maggi 2005), turning soil aggregates into complex organo-mineral composites usually referred to as flocs (Droppo et al. 1997; Droppo 2001; Spencer et al. 2021) and changing their depositional dynamics (Droppo and Ongley 1994; Droppo et al. 2005). After particle deposition on the bed surface, Brunke (1999) described clogging occurrence through particle infiltration and retention in the bed, resulting in pore bridging. This reduces bed permeability (Schälchi 1992) and porosity (Gayraud and Philippe 2003) and creates a clogging depth, ultimately leading to sediment-column clogging. Experimental studies have demonstrated that multiple factors may affect fine sediment deposition and riverbed clogging, including the suspended-to-matrix particle size and shape (Hutson 2014), suspended particle concentration (Pholkern et al. 2015), particle characteristics (Rehg et al. 2005), hydraulic gradient between river and groundwater (Schälchi 1992), and flow conditions (Fetzer et al. 2017). Clogging may also depend upon the successive occurrence of flood events (Blaschke et al. 2003) and the depositional history (Lau et al. 2001). However, the upscaling of this research to the catchment scale remains limited, which hampers the conceptualization and representation of fine sediment deposition and potential clogging in catchment-scale models (e.g., Arnold et al. 1998; Bieger et al. 2017) that may be used to improve our understanding of the sediment cascade and design relevant mitigation strategies to limit the deleterious effects of fine sediment deposition. Such models indeed face the issues of model parameterization, including the difficulty of adequately representing the nonlinear interactions of river and hillslope processes and the subsequent interactions between physical and biological factors in river channels (Shrivastava et al. 2020).

Field-based studies performed over multiple river reaches can help determine the controlling factors of particle deposition and parameterize models. Indeed, they implicitly consider multiple processes affecting sediment deposition, evaluated over numerous river reaches and conditions. Datry et al. (2015) analyzed clogging through hydraulic conductivity in more than 100 river reaches in France over 2 years, both in winter and summer, to prioritize the factors significantly affecting riverbed clogging. Interestingly, one of their main results was that clogging varied more between the monitored reaches than between seasons. Naden et al. (2016) analyzed fine sediment deposition in 230 catchments in England and Wales and found that stream power and flow velocity exhibited a significant relationship with deposited sediment density in streams saturated with fine particles, suggesting that deposited sediment dynamics were controlled by in-stream dynamics. In their study, the sediment pressure, mostly from agriculture, was also significant, as also indicated by Sutherland et al. (2010) and Konrad and Gellis (2018). In another large-scale study, Stewardson et al. (2016) developed a model of clogging based on hydraulic conductivity data acquired from 153 river reaches. They showed that the river reach geometry, catchment characteristics, and stream power significantly controlled riverbed clogging.

The different controlling factors proposed in these studies also highlight the potentially highly variable dynamics of fine sediment deposition and clogging, resulting from both in-stream and hillslope processes that may vary both between and within catchments. The significant relationships as well as model predictions presented by Naden et al. (2016) included scattering between modeled and measured values over more than one order of magnitude. Similarly, the model by Stewardson et al. (2016) explained 30% of the variance in external cross-validation, suggesting that these relationships can reveal informative trends but cannot be used as reliable predictors, which limits extrapolation possibilities.

Machine-learning models provide alternative approaches that can extract knowledge and draw inferences from data. As data-driven methods, they are implicitly able to consider multiple and potentially nonlinear relationships between variables. They have been successfully used to predict water quality, such as dissolved species content (Khullar and Singh 2021) but also for sediment spatial distribution and temporal variation modeling (Taşar et al. 2017; Harmshaw et al. 2018; Hou et al. 2019; Ren et al. 2021). Recent studies such as those conducted by Baldan et al. (2020) and Baldan et al. (2021) demonstrated the potential of modeling cascades to analyze sediment dynamics at the catchment scale. These studies made use of physically based catchment and river-reach models to feed a machine-learning model, with promising results. However, such an approach requires extensive datasets as model inputs to make accurate predictions, which may limit their applicability to numerous river reaches and extrapolation possibilities. To the best of our knowledge, the combined use of field-based monitoring and machine-learning approaches for modeling fine sediment deposition and the associated model performance and extrapolation possibilities have not been addressed in previous research.

In this study, we analyzed the spatial variability in fine sediment deposition at the catchment scale and attempted to gain insights into the driving factors, considering both hillslope and in-channel processes. To this end, an extensive and unique dataset, covering the entire river network of three temperate catchments, was collected and analyzed. It was then used to develop a random forest model. To evaluate the model potential for application in unmonitored catchments, the model extrapolation performance was tested. This original modeling approach successfully predicted fine sediment deposition over the three monitored catchments and may therefore contribute to the understanding and management of sediment (dis)connectivity in catchments (Poeppl et al. 2020).

2 Materials and methods

2.1 Catchment location and characteristics

In this study, we considered three different catchments located in different French regions (Fig. 1). We hypothesized that variations in land use would induce differences in fine sediment supply to river channels. Consequently, two nearby catchments with contrasting land uses were selected for this study: the Beuvron and Cisse catchments are located in central France and are mostly covered with forested (72% of the Beuvron catchment) and agricultural areas (61% of the Cisse catchment), respectively. The third catchment presented a mix of land uses and was located in a different geological context: the Loisance catchment is located in Brittany, in western France. The lithology mainly consists of Proterozoic magmatic rocks (83%) in the Loisance catchment, while the Beuvron and Cisse catchments were mainly covered by Cenozoic sedimentary layers: clay and sands (92%) and limestone (80%), respectively. Land use was determined using Corine Land Cover. National crop statistics indicated that the agricultural lands in the Beuvron and Cisse catchments mainly corresponded to cereals (51% and 82%, respectively), such as wheat, maize, and barley. Of note, 43% of the agricultural lands in the Loisance catchment consisted of grasslands, while grasslands covered 20% and 3% of the Beuvron and Cisse agricultural lands, respectively. Catchment morphological characteristics were calculated using a 25-m resolution digital elevation model (DEM). The main catchment characteristics are summarized in Table 1.

Fig. 1 Location and elevation of the studied catchments within the Loire Brittany River basin, France: (a) Beuvron, (b) Cisse, and (c) Loisance catchments. The dots in the right panel correspond to the catchment centroids

Table 1 Main characteristics of the studied catchments

2.2 Fine sediment deposition measurements

In this study, we focused on fine sediments, defined as particles with a diameter smaller than 2 mm (Walling and Moorehead 1989; Walling et al. 2000). Evaluating fine sediment deposition over the entire river network of the three different catchments required a compromise between data accuracy and the duration of the monitoring campaigns. We therefore did not follow a systematic approach but rather considered a river reach as a homogeneous entity in the field, delineated from expert knowledge and observations according to the method proposed by Dupeux and Favreau (2017), adapted from Archambaud et al. (2005). Consequently, in this study, the measured reach lengths varied; the 10%, 50% (median), and 90% quantiles of reach length were 39 m, 121 m, and 332 m, respectively. It is therefore acknowledged that some reaches may exhibit local, although limited, variations in their characteristics.

Fine sediment deposition was visually estimated based on the sediment areal coverage and water turbidity following manual stirring (Dupeux and Favreau 2017), which is similar to the shuffle index proposed by Clapcott et al. (2011). It differs from classical methods such as those of Lambert and Walling (1988) or Navratil et al. (2010) in that it allows for a quick estimate of sediment deposition intensity, although it only provides semi-quantitative results, as suspended sediment concentration or turbidity is not measured. The sediment deposition intensity was divided into four classes: 0%, 25–50%, 50–75%, and 75–100% (Fig. 2). Although the use of visual assessment methods for the quantitative assessment of sediment deposition intensity has been questioned (Sennatt et al. 2006), recent studies have demonstrated that they may provide an appropriate method for deposit estimates (Conroy et al. 2016), as they were shown to correlate well with quantitative estimates of sediment stocks (McKenzie et al. 2022) while providing quick estimates. Finally, surface stirring ensured that the extreme classes were reliably estimated and did not correspond to a thin layer of surficial clogging. However, it is acknowledged that the 25–50% and 50–75% classes may be difficult to discriminate in the field. Despite its drawbacks, this method might be among the few that can be applied to studies covering the entire river network of multiple catchments.

Fig. 2 Illustrations of different sediment deposition intensity classes, in increasing order, as estimated in the field: (a) no fine sediment (0%), (b) weak intensity (25–50%), (c) moderate intensity (50–75%), and (d) high intensity (75–100%)

2.3 Dataset collection

In addition to fine sediment deposition intensity, multiple measurements were performed for the river reaches, including river geometry and flow conditions during the field campaign. We based variable selection on previous literature results and a national protocol (Gob et al. 2014). It is expected that the corresponding dataset will grow over time, which would provide interesting opportunities for model extrapolation. The monitoring protocol related to in-channel variables is similar to the protocol proposed in Raven et al. (2003) and Clapcott et al. (2011), and is summarized in McKenzie et al. (2022).

It is hypothesized that stream power, which depends upon channel geometry and channel slope, may be a variable controlling sediment deposition and erosion variability (Naden et al. 2016). Consequently, the main morphological characteristics of river reaches controlling the local hydraulics (Van Rijn 1993) were measured in the field through transects (reach width, depth, length) or calculated (reach slope, sinuosity). As Fetzer et al. (2017) suggested that flow condition is an important controlling variable, it was also visually estimated using previously established classes (Clapcott et al. 2011). It was assumed that habitat diversity might be a synthetic indicator of degraded river reaches, which may therefore be used as a proxy of river dynamics. Habitat diversity was estimated as a relative measure on each river reach. During the field survey, the relative proportion of wood debris (length > 30 mm), cobbles (width > 128 mm), plant roots, aquatic vegetation, and organic litter density was estimated for each river reach. Habitat diversity was then classified into four classes, ranging from “none,” in the absence of habitat on the river reach, to “high” when both the density and variety of habitats were high (Le Bihan 2020). Finally, the visual assessment of the underlying bed substrate was found to correlate well with sediment deposited mass (Naden et al. 2016). It was therefore evaluated by walking upstream along the river reaches and estimating the dominant substrate type, in the same way as described in McKenzie et al. (2022).

In addition to within-channel dynamics, hillslope dynamics should be considered as potential sediment sources (Sutherland et al. 2010; Wagenhoff et al. 2011; Davis et al. 2021). In particular, agricultural areas may supply important fine sediment quantities to river reaches and should therefore be considered (Konrad and Gellis 2018). Therefore, the lower end of each river reach was considered the outlet of a subcatchment. For each of these subcatchments, the corresponding upstream surface area and the land-use proportions in the drainage area were calculated.

Finally, physical barriers such as weirs were observed on some of the river reaches (4.7% of the total river reaches). Such obstacles were assumed to affect the flow and the within-reach sediment dynamics. They were therefore included in the dataset.

Fifteen variables were therefore compiled from 11,302 river reaches. The data acquired, analyzed, and used for modeling are summarized in Table 2. A correlation matrix was calculated using Pearson’s r and Spearman’s ρ coefficients to analyze the variables that correlated with deposition intensity and to draw hypotheses on the factors affecting fine sediment deposition. The correlation matrix is provided as supplementary material (Table S1).
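As an illustration of the two coefficients used for the correlation matrix, the following minimal Python sketch computes Pearson’s r and Spearman’s ρ for one pair of variables. It is not the code used in this study (which relied on R); the variable names and values are hypothetical. Spearman’s ρ is obtained as Pearson’s r applied to ranks, with tied values receiving their average rank.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's r: linear association between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical values: channel width (m) and deposition class coded 0 to 3
width = [1.2, 2.5, 3.1, 4.8, 6.0, 9.5]
deposition = [0, 1, 1, 2, 3, 3]
r = pearson(width, deposition)
rho = spearman(width, deposition)
```

Because deposition intensity is an ordinal class, the rank-based coefficient is less sensitive to the arbitrary numeric coding of the classes than Pearson’s r, which is one reason to report both.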

Table 2 Estimated, measured, or calculated variables used in the model. As subcatchments may contain different land uses, each land use was considered a different variable

Important temporal variations in fine sediment deposition may occur in river reaches depending on the hydrological regime, particularly the occurrence of flood events (Genereux et al. 2008). The monitoring was performed mainly in spring and autumn, when limited water height variations were expected while maintaining a sufficient water level for analysis (McKenzie et al. 2022). However, acquiring this extensive dataset required intensive fieldwork: one day was required to measure approximately 10 km. A single measurement campaign was therefore available for analysis in this study (i.e., no temporal variations could be studied). As temporal variations may not always be the most important factor driving fine sediment deposition variability (Datry et al. 2015), it was assumed that the dataset acquired in this study may provide reliable information to analyze fine sediment deposition.

2.4 Prediction of sediment deposition intensity classes

In this study, fine sediment deposition was modeled using a machine-learning-based approach. Machine-learning models are able to autonomously build the relationships between the input and output variables, which is an important advantage when modeling complex processes such as sediment deposition. Among these algorithms, random forest was used because of its ability to address high-dimensional datasets, including a mixture of numeric and categorical variables. It is able to address nonlinear relationships, as well as correlated features, while maintaining good predictive performance (Skurichina and Duin 2002; Darst et al. 2018). Moreover, random forest has fewer hyperparameters (i.e., parameters controlling the learning process) than alternative machine-learning techniques. Data processing and random forest training were performed using the “caret” (Kuhn 2020) and “ranger” (Wright and Ziegler 2017) packages of R software v4.0.0 (R Core Team 2020).

Comparison exercises such as Hastie et al. (2009) and Probst et al. (2019) have shown the importance of tuning the node size and the number of variables randomly chosen among all considered predictors at each split (denoted mtry). The small number of parameters required by the algorithm reduces potential overlearning issues, computational costs, and optimization requirements. This is also an important benefit relative to physically based models, which usually require many parameters, with associated equifinality and uncertainty issues (Beven and Freer 2001). Random forests also have the advantage of being able to estimate the relative importance of the variables.

However, two main challenges are worth pointing out. First, our dataset is relatively imbalanced (i.e., some classes are more frequent than others), which can pose some difficulties for classification models. This means that other machine- and/or deep-learning techniques (e.g., neural networks) may have provided better prediction performance or may have needed a smaller dataset to achieve similar prediction performance. However, these alternatives may not necessarily have provided variable importance measures, which were analyzed here to improve the interpretability of the machine-learning-based results, a topic of current research (Molnar et al. 2020). Second, although of great practical importance, the question of variable importance is still a very active area of research (e.g., Iooss et al. 2022) because, depending on the context of the study (e.g., size of the training dataset, number of predictors, and dependence among them), there is currently no consensus on which approach to use. As an attempt to address this problem, we propose in Sect. 2.4.3 to apply multiple methods and to retain in the analysis only the results that were consistent between the different methods.

From a more technical point of view, random forest is a tree-based ensemble algorithm that combines a large number of individual, unpruned decision trees, each acting as a weak classifier (Breiman 2001). Each decision tree is grown using a subset of the training data built by random sampling among samples and variables. The random selection of samples by bootstrap aggregating favors classifier stability and good performance. The selection of random subsets of variables reduces the correlation between trees and increases the robustness of the results. Branching points are then selected as the best split among those variables at each node (Rokach 2010). This process is repeated until each branch end contains fewer than a prespecified number of observations. After tree partitioning is completed, a new sample is classified by a majority vote over all decision trees. In this setting, overfitting is limited by growing many trees during the learning process.
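The three ingredients described above (bootstrap sampling of reaches, random selection of variables, and majority voting) can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the “ranger” implementation used in this study: each tree is reduced to a single split (a decision stump), the random variable subset is drawn once per tree rather than at each node, and the data are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def fit_stump(X, y, features):
    """Best single split (feature, threshold, left class, right class)."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue  # not a real split
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    if best is None:  # no valid split: always predict the majority class
        majority = Counter(y).most_common(1)[0][0]
        return (0, float("inf"), majority, majority)
    return best[1:]

def fit_forest(X, y, n_trees=50, mtry=1, seed=0):
    """Grow n_trees stumps, each on a bootstrap sample and mtry random variables."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample of rows
        feats = rng.sample(range(p), mtry)          # random subset of variables
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Majority vote over all trees."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical reaches described by two variables, with two deposition classes
X = [[1, 10], [2, 12], [3, 11], [8, 30], [9, 28], [10, 33]]
y = ["none", "none", "none", "high", "high", "high"]
forest = fit_forest(X, y, n_trees=50, mtry=1, seed=0)
```

In a full random forest, each tree would be grown recursively until the node-size criterion is met, and the variable subset would be redrawn at every node.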

2.4.1 Data preprocessing

Deposition intensity was initially divided into four distinct classes: no sediment deposition (0%) and weak (25–50%), moderate (50–75%), and high (75–100%) sediment deposition intensity. A preliminary modeling attempt revealed that the model had difficulties discriminating between the 25–50% and the 50–75% classes. This result was consistent with the inherent difficulty in estimating the medium deposition intensity classes in the field (Sect. 2.2). Consequently, these classes were merged into a single class of intermediary deposition intensity to improve prediction performance. Three classes were therefore considered for prediction purposes: 0%, 25–75%, and 75–100%, referred to as “no sediments,” “medium deposition intensity,” and “maximal deposition intensity,” respectively. When variables had missing values (3% to 14%, only numerical values), they were replaced by the weighted average of nonmissing observations using the imputation method implemented within the “randomForest” R package (Liaw and Wiener 2002).
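The class merging described above amounts to a simple lookup, sketched here in Python with hypothetical label strings. The imputation function is a simplified stand-in that uses the unweighted mean of the observed values; the weighting applied by the R package is not reproduced here.

```python
# Mapping from the four field classes to the three modeled classes.
# The label strings are hypothetical; only the grouping follows the text.
MERGE = {
    "0%": "no sediments",
    "25-50%": "medium deposition intensity",
    "50-75%": "medium deposition intensity",
    "75-100%": "maximal deposition intensity",
}

def merge_classes(field_classes):
    """Translate field classes into the three classes used for prediction."""
    return [MERGE[c] for c in field_classes]

def impute_mean(values):
    """Replace None by the unweighted mean of the non-missing values.

    Simplified stand-in for illustration, not the weighted scheme of the
    R package cited in the text.
    """
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

merged = merge_classes(["0%", "25-50%", "50-75%", "75-100%"])
widths = impute_mean([1.0, None, 3.0])  # hypothetical channel widths (m)
```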

2.4.2 Model tuning and evaluation

The dataset was randomly divided into a training set (70% of the initial dataset) and a test set (30% of the initial dataset), with each set having the same proportion of the three deposition intensity classes. Hyperparameter estimation was based on a stratified, nested k-fold cross-validation procedure (with k = 5) to limit overfitting (Krstajic et al. 2014). Following Breiman (2001), three values were tested for mtry (2, 4, and 8) and three values for node size (1, 3, and 10). The final random forest model was trained by setting mtry to 8 and node size to 1 and using default parametrization for all other parameters.
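A stratified 70/30 split and the corresponding hyperparameter grid can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the “caret” procedure used in this study: the function name, the random seed, and the class counts (chosen only to follow a 3:12:4 ratio) are hypothetical, and the nested cross-validation loop itself is omitted.

```python
import random
from collections import defaultdict
from itertools import product

def stratified_split(labels, train_frac=0.7, seed=0):
    """Return (train_idx, test_idx) with per-class proportions preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, lab in enumerate(labels):
        by_class[lab].append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)                        # random order within the class
        cut = round(train_frac * len(idx))      # same fraction for every class
        train.extend(idx[:cut])
        test.extend(idx[cut:])
    return sorted(train), sorted(test)

# Candidate values reported in the text: (mtry, node size) pairs
grid = list(product([2, 4, 8], [1, 3, 10]))

# Hypothetical labels with a 3:12:4 class ratio
labels = ["none"] * 30 + ["medium"] * 120 + ["max"] * 40
train_idx, test_idx = stratified_split(labels)
```

Each of the nine grid combinations would then be scored by cross-validation on the training indices only, so that the test set remains untouched until the final evaluation.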

Model performance was assessed considering several indicators computed on the test set: accuracy, precision, and recall. They were defined as follows:

$$\mathrm{Accuracy}=\frac{TP+TN}{TP+FN+TN+FP} \tag{1}$$
$$\mathrm{Precision}=\frac{TP}{TP+FP} \tag{2}$$
$$\mathrm{Recall}=\frac{TP}{TP+FN} \tag{3}$$

where TP and TN denote true positives and true negatives, respectively, and FP and FN denote false positives and false negatives, respectively. These values were presented in a confusion matrix (Table S2) and illustrated the performance of the model for each class of deposition intensity. The values of these three indicators vary between 0 and 1. The closer the value is to 1, the better the model performance. Accuracy is a global measurement of the algorithm’s ability to predict deposition intensity. In this study, a slight imbalance was observed among the three classes of deposition intensity, with a ratio of 3:12:4 for no sediments, medium deposition intensity, and maximal deposition intensity, respectively. This unequal class distribution may bias the interpretation of accuracy. Consequently, the model performance was also evaluated using precision and recall. Precision corresponded to the proportion of reaches for which the predicted deposition intensity was that observed in the field among all reaches associated with this deposition intensity by the model. Recall indicated the proportion of reaches for which the predicted deposition intensity was that observed in the field among all reaches that truly had this deposition intensity. Precision can thus be understood as a measure of quality, while recall is a measure of completeness. These two metrics were calculated for the test set as a whole and then separately for each catchment.
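For a three-class problem, Eqs. (2) and (3) are applied to each class in turn, treating that class as “positive” and the two others as “negative.” The Python sketch below illustrates this with a confusion matrix whose counts are hypothetical and chosen for illustration only (they are not the values of Table S2). Global accuracy is computed here as the share of correctly classified reaches, which is an assumption about how the multiclass value was obtained.

```python
def per_class_metrics(confusion, classes):
    """confusion[i][j] = number of reaches of true class i predicted as class j."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(classes)))
    metrics = {"accuracy": correct / total}
    for k, name in enumerate(classes):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                 # truly k, predicted otherwise
        fp = sum(row[k] for row in confusion) - tp  # predicted k, truly otherwise
        metrics[name] = {
            "precision": tp / (tp + fp) if tp + fp else float("nan"),
            "recall": tp / (tp + fn) if tp + fn else float("nan"),
        }
    return metrics

classes = ["no sediments", "medium", "maximal"]
# Hypothetical counts for illustration only
confusion = [
    [8, 2, 0],
    [1, 17, 2],
    [0, 3, 7],
]
m = per_class_metrics(confusion, classes)
```

In this toy example, the “medium” class has a precision of 17/22 and a recall of 17/20: a class can be retrieved well (high recall) while still attracting misclassified reaches from the other classes (lower precision).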

2.4.3 Variable importance

Variable importance in the random forest model is computed to determine the relative influence of each predictor in the final prediction. Several studies have indicated that variable importance measures may exhibit different flaws in the case of correlated features in a dataset (Hooker 2007; Hooker et al. 2021). We thus used several measures of variable importance based on different computing procedures. First, we considered the mean decrease in impurity (MDI), which calculates the importance of each feature as the total decrease in node impurity, measured by the Gini index, averaged over all trees of the forest. As this measure is known to be biased toward covariates with many possible partitions, such as categorical variables (Strobl et al. 2007; Wright et al. 2017), we considered its unbiased counterpart, actual impurity reduction (AIR) (Nembrini et al. 2018). We then applied the testing procedure of Altmann et al. (2010) to produce a p-value for each variable importance measure. Both were implemented in the R package “ranger.” An alternative to Gini importance is the use of permutation importance approaches based on dataset row or column permutation. They do not suffer from the bias of Gini importance (Szymczak et al. 2016) but may overestimate the importance of correlated variables (Hooker and Mentch 2021). We thus considered the methodology of Kursa and Rudnicki (2010), based on the addition of pseudo variables (called shadow attributes) to the model and implemented in the R package “Boruta.” This procedure also includes statistical tests for the variable importance measures, which identify the significantly important variables.

Given that no standard method has been defined in the literature to compute variable importance (Iooss et al. 2022), and given the uncertainties associated with the AIR and Boruta approaches, only the results that were consistent between both methodologies are discussed.
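The general idea of permutation importance, and of retaining only the variables selected by every method, can be sketched as follows in Python. This is a generic illustration and implements neither the AIR nor the Boruta procedure; the toy model, the data, and the variable names passed to the consistency filter are hypothetical.

```python
import random

def accuracy(predict, X, y):
    """Share of rows for which the prediction matches the label."""
    return sum(predict(row) == lab for row, lab in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when one column is shuffled, for each column."""
    rng = random.Random(seed)
    base = accuracy(predict, X, y)
    importances = []
    for f in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [row[f] for row in X]
            rng.shuffle(col)  # break the link between column f and the labels
            X_perm = [row[:f] + [v] + row[f + 1:] for row, v in zip(X, col)]
            drops.append(base - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

def consistent(selected_by_method):
    """Variables retained by every method (set intersection)."""
    return set.intersection(*(set(s) for s in selected_by_method.values()))

# Hypothetical data: column 0 drives the class, column 1 is unrelated
X = [[1, 7], [2, 3], [3, 9], [4, 1], [5, 5],
     [6, 2], [7, 8], [8, 4], [9, 6], [10, 0]]
y = ["none"] * 5 + ["high"] * 5

def toy_model(row):
    """Stand-in for a fitted classifier that only uses column 0."""
    return "high" if row[0] > 5 else "none"

importances = permutation_importance(toy_model, X, y)
kept = consistent({
    "AIR": ["width", "strahler", "slope"],        # hypothetical selections
    "Boruta": ["strahler", "width", "land_use"],
})
```

Shuffling a column the model ignores leaves its predictions unchanged, so its importance is zero, whereas shuffling the informative column degrades accuracy.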

3 Results

3.1 Field measurements and statistical analysis

An overview of the measured deposition intensity is provided in Table 3. Due to differences in catchment area, the proportion of measured river reaches was highest in the Beuvron catchment (59.5%), followed by the Loisance (24.8%) and Cisse (15.7%) catchments. Most monitored river reaches had weak deposition intensity (37%), followed by moderate (27%) and high deposition intensity (21%) and no sediment deposition (15%). The measured fine sediment deposition exhibited contrasting behavior between and within catchments, associated with significant spatial variability (Fig. 3).

Table 3 Distribution of measured river reaches among catchments and sediment deposition intensity classes within the whole dataset

Fig. 3 Overview of the measured deposition intensity in the studied catchments: (a) Cisse, (b) Loisance, and (c) Beuvron

Most Cisse catchment (Fig. 3a) upstream reaches were connected to agricultural fields, while some were located close to forested areas in the central part of the catchment. In this catchment, moderate to high fine sediment deposition intensity was measured in 70% of the reach lengths. Conversely, the other 30% mostly corresponded to upstream river reaches located in the forested part of the catchment.

In the Loisance catchment (Fig. 3b), a smaller proportion of river sections with moderate or higher deposition intensity was measured. Indeed, 51% of the river reaches displayed no sediments or weak fine sediment deposition. Important local variations were measured, with, e.g., alternations of high deposits and no sediments on successive river reaches.

In the Beuvron catchment (Fig. 3c), no clear deposition tendencies were observed in relation to land use. Indeed, weak and high deposition intensities were observed even in the most agricultural part of this catchment. However, in this catchment, the largest analyzed in this study (Table 1), fine sediment deposition exhibited a pattern of increased deposition from upstream to downstream areas, while a high number of first-order reaches exhibited moderate or higher fine sediment deposition intensity. Indeed, 56% of the total reach length exhibited moderate to high deposition intensity on low Strahler-order reaches (lower than 3), suggesting that first- and second-order streams were more susceptible to fine sediment deposition. In the higher Strahler-order reaches (higher than 3), a progressive increase from weak sediment deposition intensity in the upper reaches to moderate and high deposition intensity in the lower reaches was observed.

This observation was in line with the correlation matrix results, indicating that the variables that correlated best with deposition intensity were the channel width (r = 0.4, ρ = 0.3) and Strahler order (r = −0.4, ρ = −0.4) (Fig. 4). In our study, the correlation between the Strahler order and channel width was low (r = −0.2, ρ = 0.1), which can be explained by the inclusion of three different catchments in the dataset.

Fig. 4 Relationship between sediment deposition intensity and (a) channel width and (b) Strahler order

It was not surprising, given the observed variations in fine sediment deposition within the three studied catchments, that only weak relationships were found between deposition and the local variables. It is, however, interesting to note that some variables correlated with fine sediment deposition intensity, particularly when considering each catchment separately (Fig. 5). For instance, there was no relationship between fine sediment deposition intensity and slope (r and ρ were < 0.1) when using the entire dataset (Fig. 5a) or the data for the Beuvron and Cisse catchments (Fig. 5b). However, the relationship between deposition intensity and slope was significant at the 5% level of probability for the Loisance catchment (r = −0.3, ρ = −0.3), suggesting that variations in local river reach characteristics changed local fine sediment deposition dynamics in this catchment.

Fig. 5 Relationship between sediment deposition intensity and river reach slope using (a) the entire dataset and (b) each catchment separately

3.2 Modeling fine sediment deposition

The global accuracy of the random forest model was 81% on the test set, demonstrating the strong capability of machine-learning models to predict deposition intensity. The model performed well across the different deposition intensity levels, as the precision was higher than 80% for all deposition intensity classes (Table 4). The exception was the 46% precision for the “no sediment” class of the Cisse catchment. However, this class only represented 120 river reaches (6.7% of the catchment river network), which suggests that the data-driven model had difficulties predicting conditions represented by a small number of river reaches. Moreover, the confusion matrix (Table S2) indicated that the model tended to predict the medium deposition intensity class (25–75%) more often than observed, which may be related to the higher representation of this class in the initial data (Table 3). The global recall (70%) was also reasonable, but with significant between-class differences. For example, the medium deposition intensity class had a high precision and recall (81% and 93%, respectively), showing good model predictive ability. However, the “no sediment” class had high precision but poor recall (80% and 48%, respectively), indicating that when the model predicted no sediment on a river reach it was usually correct, but that it failed to retrieve more than half of the reaches with no sediment deposition. This result suggested that the “no sediment” class was less well described by the dataset. The between-catchment differences in model performance may be partly related to their unbalanced representation in the whole dataset, as 59.5% of the data were from Beuvron, 15.7% were from Cisse, and 24.8% were from Loisance (Table 3).
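To make the distinction between precision and recall concrete, the sketch below derives both from a confusion matrix. The function name, the three class labels, and the counts are assumptions for illustration; the counts are invented and are not the values of Table S2. They were chosen only so that the “no sediment” row shows a high-precision, low-recall pattern similar to the one described above.

```python
def per_class_metrics(confusion, labels):
    """Return (accuracy, {label: (precision, recall)}).

    confusion[i][j] is the number of reaches observed in class i
    that the model assigned to class j.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(labels)))
    metrics = {}
    for k, label in enumerate(labels):
        true_positive = confusion[k][k]
        predicted_as_k = sum(row[k] for row in confusion)  # column sum
        observed_as_k = sum(confusion[k])                   # row sum
        precision = true_positive / predicted_as_k if predicted_as_k else float("nan")
        recall = true_positive / observed_as_k if observed_as_k else float("nan")
        metrics[label] = (precision, recall)
    return correct / total, metrics

# Invented counts (rows: observed class, columns: predicted class)
labels = ["no sediment", "medium", "high"]
confusion = [
    [12, 13, 0],   # observed "no sediment": 13 of 25 reaches predicted as medium
    [3, 93, 4],    # observed "medium"
    [0, 9, 41],    # observed "high"
]
overall_accuracy, metrics = per_class_metrics(confusion, labels)
for label in labels:
    precision, recall = metrics[label]
    print(f"{label}: precision={precision:.2f}, recall={recall:.2f}")
print(f"accuracy={overall_accuracy:.2f}")
```

In this toy matrix, most predictions of “no sediment” are correct (high precision), yet more than half of the reaches truly without sediment are assigned to the medium class (low recall), which is the behavior expected when a dominant class absorbs an under-represented one.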

Table 4 Global and per class performances of the random forest model

The global model performance was promising. It was therefore assessed whether the model was able to reproduce the spatial fine sediment deposition patterns. The predicted values matched the measured pattern of fine sediment deposition in the three studied catchments reasonably well (Fig. 6). Interestingly, the model was able to capture the progressive deposition increase occurring in downstream directions along reaches (e.g., from east to west in the Beuvron catchment), as well as the high deposition intensity measured in first-order reaches. These results demonstrated that the proposed model was also able to predict the observed fine sediment deposition spatial organization. It may therefore provide an interesting tool to assess the dynamics of this highly heterogeneous process, provided sufficient relevant data are available in the training set.

Fig. 6 Modeling results on the three studied catchments for the test set. River reaches in green correspond to sediment deposition intensity that was adequately predicted by the model. River reaches in red indicate an error in model prediction. Missing reaches were those included in the training set

However, extensive databases such as the one acquired in this study may not always be available to apply data-driven models such as the one proposed here. We therefore assessed whether the proposed approach could be applied in other contexts with fewer measurements. Several models were built considering various sizes of training data, from 1 to 70%, using 7% increments, from the training set (70% of initial data) (Fig. 7). The performances of these models were then computed using the test set (30% of initial data), allowing model comparison. The model accuracy increased from 59 to 81% with an increasing number of river reaches used for modeling. As the model is data-driven, this increase in performance with increasing dataset size was not surprising. However, it is worth noting that using only 8% of the training set resulted in a 70% accuracy, which may be considered an acceptable result. The model performed better when only considering the Beuvron catchment than when considering the three catchments together, which was probably related to the higher number of river reaches analyzed in this catchment. Similarly, lower values of precision and recall for the Cisse catchment were probably related to the lower number of river reaches observed in this catchment. Accuracy was lower for the Loisance catchment (76%), which may be consistent with the highly spatially heterogeneous data measured in this catchment (Fig. 3b). Overall, these results indicated that most dataset variability was captured, with good model performance, using a dataset including 8% to 15% of the catchment river reaches, corresponding to approximately 630 to 1190 river reaches measured across the three catchments.
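The experiment described above (training on nested subsets of increasing size and scoring each model on one fixed test set) can be sketched as follows. This is not the random forest used in the study: a one-nearest-neighbor classifier is swapped in as a stand-in so that the example stays within the standard library. The synthetic reaches, the labeling rule, and the fractions are all assumptions.

```python
import random

def accuracy(predict, samples):
    """Share of (features, label) samples for which predict(features) == label."""
    hits = sum(1 for features, label in samples if predict(features) == label)
    return hits / len(samples)

def fit_nearest_neighbour(samples):
    """Stand-in model (NOT a random forest): label of the closest training sample."""
    def predict(features):
        nearest = min(
            samples,
            key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], features)),
        )
        return nearest[1]
    return predict

def learning_curve(train, test, fractions, fit, seed=0):
    """Fit on nested random subsets of `train`; always score on the same `test`."""
    shuffled = train[:]
    random.Random(seed).shuffle(shuffled)
    curve = []
    for fraction in fractions:
        n = max(1, int(fraction * len(shuffled)))
        model = fit(shuffled[:n])  # prefixes of one shuffle give nested subsets
        curve.append((fraction, n, accuracy(model, test)))
    return curve

# Synthetic reaches: (width in m, slope in %) with an invented labeling rule
rng = random.Random(42)

def synthetic_reach():
    width, slope = rng.uniform(1.0, 20.0), rng.uniform(0.0, 5.0)
    label = "high" if width > 10.0 and slope < 2.5 else "low"
    return ((width, slope), label)

reaches = [synthetic_reach() for _ in range(500)]
train, test = reaches[:350], reaches[350:]   # 70% / 30% split
fractions = [0.01, 0.08, 0.15, 0.50, 1.00]   # illustrative fractions of the training set
curve = learning_curve(train, test, fractions, fit_nearest_neighbour)
for fraction, n, score in curve:
    print(f"{fraction:.0%} of training set ({n} reaches): accuracy={score:.2f}")
```

Keeping the test set fixed across all training sizes is what makes the resulting scores comparable with one another.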

Fig. 7 Performance of the random forest model trained with increasing amounts of data for (a) accuracy and (b) precision and recall (dashed and dotted lines, respectively). The black line (“Overall”) corresponds to scores obtained considering all catchments

4 Discussion

4.1 Driving factors of fine sediment deposition

Given that the model performance can be considered acceptable for various deposition classes, locations, and contrasted river reaches, it was used to determine the hierarchy of the factors controlling fine sediment deposition through the computation of their relative importance.

Bed substrate granularity, flow condition, reach depth and width, cropland cover proportion, and forest and grassland cover proportions were the six most influential variables in regard to deposition intensity regardless of the considered methodology (Figs. S1 and S2). Interestingly, the presence of barriers on the river reaches was not a significant explanatory variable. This result suggested either that the barriers did not affect a sufficient number of reaches or that the corresponding reaches were characterized by configurations that would have resulted in fine sediment deposition even in the absence of barriers (e.g., limited slopes and high particle contributions from hillslopes).
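One common way to rank variables for a fitted classifier is permutation importance: the drop in accuracy observed when the values of a single variable are shuffled across reaches. Whether this is one of the methodologies behind Figs. S1 and S2 is an assumption; the sketch below only illustrates the general idea with a hypothetical rule-based predictor and synthetic data.

```python
import random

def accuracy_of(predict, rows, labels):
    """Share of rows for which the predicted label equals the observed label."""
    return sum(predict(row) == label for row, label in zip(rows, labels)) / len(rows)

def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Mean accuracy drop per column when that column alone is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy_of(predict, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[col] for row in rows]
            rng.shuffle(column)
            # Rebuild each row with only the shuffled column replaced
            permuted = [row[:col] + (value,) + row[col + 1:]
                        for row, value in zip(rows, column)]
            drops.append(baseline - accuracy_of(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances

# Synthetic reaches: column 0 = channel width (m), column 1 = unrelated noise
rng = random.Random(1)
rows = [(rng.uniform(1.0, 20.0), rng.random()) for _ in range(200)]

def predict(row):
    """Hypothetical fitted model that relies on channel width only."""
    return "high" if row[0] > 10.0 else "low"

labels = [predict(row) for row in rows]   # labels follow the same invented rule
importances = permutation_importance(predict, rows, labels)
print({"width": round(importances[0], 2), "noise": round(importances[1], 2)})
```

A variable that the model never uses, such as the noise column here, receives an importance of zero, which is the kind of outcome reported above for the presence of barriers.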

The importance of bed substrate granularity, previously noted by Sutherland et al. (2010), may reflect at least two different processes (Shrivastava et al. 2020). First, the size of suspended particles relative to that of particles deposited on the riverbed may increase or limit their accumulation because of pore bridging (Brunke 1999) and the associated decrease in bed permeability, which further consolidates the bed. This process may therefore contribute to the persistence or absence of fine sediment deposition (Fetzer et al. 2017). Second, the bed substrate may reflect the underlying material and therefore its erodibility, as texture is usually correlated with soil erodibility in the literature (Torri et al. 1997). This hypothesis would be consistent with the model variable ranking results, indicating that land-use indicators are meaningful variables to explain the deposition intensity variability, particularly the proportion of cropland and forest cover in the reach subcatchments. Moreover, it was observed in the Cisse catchment that most upstream reaches, displaying high sediment deposition intensity, were connected to agricultural fields (Fig. 3a), which produce higher suspended-sediment concentrations than forests or grasslands during erosive rainfall events (e.g., Cerdan et al. 2010). These fields may therefore supply high fine particle loads to river channels. The combination of erodible fields and a high field-to-stream connectivity (Fryirs 2017) resulted in medium to high deposition intensity in 70% of the reach lengths, as also found by Sutherland et al. (2010), Naden et al. (2016), and Konrad and Gellis (2018). This would also suggest that in this lowland, mostly agricultural catchment, transport capacity-limited conditions controlled most fine sediment deposition dynamics, as supported by the limited slopes in this catchment (mean: 3%).

The flow conditions are representative of the local stream morphology and integrate various variables that may be particularly important in controlling fine sediment deposition (McKenzie et al. 2022), particularly successive deposition–erosion cycles. Similarly, and in agreement with the correlation matrix, river reach width and depth, which directly influence water velocity and therefore sediment settling, were important variables, as previously suggested in the literature (Datry et al. 2015). This importance may be explained by an increasing channel width resulting in lower flow velocity and, consequently, increased sediment settling. This result is consistent with those of Stewardson et al. (2016). Similarly, a lower Strahler order corresponded to channels with a direct connection between river reaches and sediment sources, which may explain why this variable correlated with sediment deposition intensity, as also reported by Relya et al. (2012). This result would support the first evidence provided by the statistical analysis (Sect. 3.1). However, the high dispersion in the relationships between deposition intensity and channel geometry indicated that the relationship is probably complex and highly nonlinear (maximum correlation coefficient was 0.4; Table S1). The model performance indicated that it successfully captured this nonlinear behavior. This result may explain the alternation of maximal deposits and no sediments on successive river reaches, as observed in the Loisance catchment, as well as the correlation between sediment deposition and the local slopes, as previously suggested by Naden et al. (2016), both of which suggest the significance of local in-stream processes in this catchment.

Consequently, the results from this study indicate the importance of considering sediment sources as well as in-stream processes when analyzing fine sediment deposition. The Beuvron catchment provided an example of the linkage between hillslopes and river processes, with various deposition intensities measured in first-order reaches and a progressive increase in sediment deposition in higher-order reaches. This result suggested that first- and second-order streams were more susceptible to fine sediment deposition, probably because of increased connectivity between hillslope sources and streams and their lower transport capacity (Relya et al. 2012). The model performance was slightly lower on the first Strahler-order reaches than on those with a higher Strahler order: modeled values matched measurements 73% and 79% of the time, respectively. This may be explained by the relatively general description of hillslope processes, as the current study only considered land use and the area upstream of river reaches (Table 2). This result suggested the strong contribution of sediment sources upstream with possible transport capacity-limited conditions, as supported by the mean slope (1%), while the downstream reach deposition intensity may be controlled by fine sediment availability in this lowland forested catchment.

This study provides the first general illustration of the potential of this methodology, combining hillslope and river process representations. It may therefore be a relevant contribution to the study of sediment (dis)connectivity in catchments (Fryirs 2013), providing an additional tool to study catchment sediment sinks. Although the dataset was extensive, some processes or specific combinations of river reach characteristics were overlooked in the present study, and their representation should be further refined. Indeed, it should be noted that some residual variability remained regardless of the number of measurements used to train the model (Sect. 3.2) and was not captured by the variables included in the dataset. Further developments may rely on the use of numerical modeling to improve the representation of hillslope and river reach processes in such methodology (Baldan et al. 2021). Moreover, due to time constraints in field monitoring, only one measurement campaign was used in this study, while the progressive deposition and flushing effects of flood events should influence fine sediment erosion and deposition patterns. Future studies may therefore use multiple measurement campaigns and evaluate the model’s ability to take into account temporally varying variables. Indeed, although the spatial variability of clogging may dominate over its temporal variability (Datry et al. 2015), the latter should not be neglected, particularly considering the importance of variables related to flow, such as stream power, as indicated in the literature (Naden et al. 2016). Such variables may exhibit high temporal variability from the flood event to the annual (i.e., low-flow and high-flow periods) scale, with possible consequences on the sediment erosion–deposition cycle and therefore riverbed clogging (Genereux et al. 2008).

4.2 First attempt at catchment extrapolation

The model showed good predictive performance when the river reaches of the three sampled catchments were included in the training and testing sets. This result indicated that the training sample was representative of the entire dataset, ensuring generalization capacities (Barbiero et al. 2020; Kernbach and Staartjes 2022) when data were available for each catchment in the training set. We then tested the ability of the model to extrapolate to a catchment with no samples included in the initial training set. To this end, training configurations excluded one or two catchments from the training phase. Model performances were then tested on the same test set, containing samples from the three catchments (Table 5). As expected for this data-driven model, removing one of the catchments led to a drastic decrease in model accuracy. For instance, removing observations from the Cisse catchment, which had the lowest number of observations, led to the highest decrease in performance (accuracy = 24%). However, this decrease was similar for each of the three catchments, suggesting that the unbalanced distribution of observations among the three catchments did not significantly influence model performance. Similarly, training the model with only the Beuvron catchment, which had the highest number of measured river reaches but the smallest proportion of agricultural area, resulted in poor model performance (accuracy = 45%). When considering only the Cisse catchment, the decrease in performance was substantial but remained smaller (accuracy = 58%). As the Cisse catchment was mainly covered with cropland areas, this result underlined the importance of considering land use to predict fine sediment deposition intensity at the catchment scale, as already suggested by the analysis of variable importance and the literature (e.g., Konrad and Gellis 2018; Davis et al. 2021).
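The training configurations described above (one or two catchments withheld from training, with the test set left unchanged) can be enumerated with a short sketch. The function names, the record layout with a `catchment` key, and the toy counts are assumptions for illustration; only the three catchment names come from the study.

```python
from itertools import combinations

def holdout_configurations(catchments):
    """Yield (kept, removed) for every way of withholding one or two catchments."""
    names = sorted(catchments)
    for n_removed in (1, 2):
        for removed in combinations(names, n_removed):
            kept = [name for name in names if name not in removed]
            yield kept, list(removed)

def restrict_training_set(train, kept):
    """Keep only training reaches from the retained catchments.

    The test set is deliberately left untouched so that every configuration
    is scored on the same reaches from all three catchments.
    """
    return [reach for reach in train if reach["catchment"] in kept]

# Toy training set with invented sizes (hypothetical record layout)
train = (
    [{"catchment": "Beuvron"}] * 6
    + [{"catchment": "Cisse"}] * 2
    + [{"catchment": "Loisance"}] * 3
)
configurations = list(holdout_configurations(["Beuvron", "Cisse", "Loisance"]))
for kept, removed in configurations:
    subset = restrict_training_set(train, kept)
    print(f"train on {kept}, withhold {removed}: {len(subset)} reaches")
```

With three catchments this yields six configurations: three that withhold a single catchment and three that train on a single catchment.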

Table 5 Model performances considering different configurations of the training set

Overall, these results illustrated that the incorporation of a limited number of observations from different catchments may lead to significant improvement in the robustness of model predictions. As it may not always be feasible to monitor the entire catchment river network, the proposed model may therefore be helpful to upscale an analysis conducted on a selection of well-monitored river reaches to the entire catchment scale with satisfactory accuracy. Preliminary work based on the framework proposed in this study would help design such measurement campaigns by selecting the most influential variables and representative river reach characteristics that should be monitored. It would help target the most relevant measurements and reduce the need for extensive monitoring efforts, such as the one performed in this study. Additional time may therefore be dedicated to specific measurements (e.g., quantitative clogging assessment; Descloux et al. 2010) or to repeated measurements of some representative river reaches to study the temporal dynamics of deposits in relation to, e.g., variations in bed erodibility and sediment sources, which are known to exhibit significant temporal variability (Droppo et al. 2001; Grabowski et al. 2012; Haddad et al. 2022). Future work should benefit from such measurements and should study the complementarity between the model and existing national databases to prioritize mitigation strategies, to be designed in collaboration with stakeholders, considering both sediment supply from hillslopes and in-stream sediment dynamics (Wharton et al. 2017).

5 Conclusions

An extensive database, including fine sediment deposition measurements obtained from three temperate catchments, was acquired and analyzed. A statistical analysis suggested that fine sediment deposition is a nonlinear and multifactorial process. Measurements collected during this study were used to develop and evaluate a machine-learning model using a random forest approach. The results demonstrated that this modeling approach, which has very rarely been applied to the study of sediment dynamics, may be relevant to evaluate fine sediment deposition at the catchment scale.

Variable ranking was proposed to quantify the relative importance of the different factors affecting fine sediment deposition. In line with the results from the statistical analysis, the importance of river channel characteristics as well as that of sediment source proximity, particularly in agricultural areas, was suggested. This indicated the importance of considering both hillslope and in-channel processes when analyzing fine sediment deposition in rivers.

Future work should include other monitoring campaigns conducted in catchments located in more contrasting environments to evaluate the model representativeness. Our central hypothesis was that fine sediment deposition can be studied using a single measurement campaign. However, we recognize that temporal variations in hillslope and river reach processes, including barrier maneuvers, may influence sediment deposition. Measurements performed over various periods (e.g., low-flow versus high-flow periods and the effects of flood events) should therefore be considered. Improving the representation of hillslope processes in this modeling approach may improve our understanding of fine sediment transfers along the hillslope-to-river continuum.

The proposed approach may help improve our understanding and our capacity to predict fine sediment deposition in streams, which is key to understanding sediment dynamics at the catchment scale, particularly sediment (dis)connectivity. It may also be used for prioritizing the implementation of mitigation measures and/or the design of future monitoring campaigns, which should ultimately help decision-makers improve water quality to meet the objectives of environmental legislation, such as the Water Framework Directive in the European Union.