Introduction

Cities are increasingly recognized as a crucial space for the conservation of biodiversity; they can host a wide range of plant and animal species, including endangered species (Garrard et al. 2018). In particular, urban freshwater ecosystems form dense networks within a city, connecting with suburb ecosystems and providing a heterogeneous habitat (Ranta et al. 2021). The development of cities is the main driver of changes in land use, flow paths, riparian areas, and stream channels (Roy et al. 2016). The conversion of natural habitats into anthropocentric land uses influences the biodiversity of streams and rivers (Tóth et al. 2019; Saldanha Barbosa et al. 2020). Urban and agricultural land use is known to degrade water quality and affect channel morphology (Paul and Meyer 2001; Allan 2004). Land use conversion promotes the loss of existing species and reduces functional species diversity, because established species cannot persist under the new environmental conditions (Edge et al. 2017; Leitão et al. 2018).

Fish community structures have been used to evaluate physicochemical changes in the surrounding environment (Plafkin 1989; Barbour et al. 1999). Fish can be useful biological indicators due to their long life histories in water bodies, position at the top of the aquatic food web, varying trophic and tolerance levels, and ease of identification (Karr 1981). In particular, functional guilds based on trophic (carnivore, herbivore, and insectivore) and tolerance properties (resistance to pollution) link the ecological functions and requirements of different species to the impacts of human activities (Noble et al. 2007). Fish properties reflect the spatiotemporal environment of stream ecosystems in terms of multiple biological dimensions (Atique and An 2018).

Most evaluations of the impact of environmental changes on fish have used conventional survey methods, such as kick nets, casting nets, or electrofishing (Barbour et al. 1999). However, conventional capture techniques have several limitations. The gear used for collecting fish has to be selected by considering the characteristics of the target species, including size, sex, habitat, and density (Hubert et al. 2012). The characteristics of study sites, including accessibility, substrates, and vegetation, as well as time constraints, should also be considered when selecting survey methods (Vander Vorste et al. 2017). Some highly invasive, capture-based survey methods, such as electrofishing, are banned in some territories, including the EU and Korea (Association Bloom 2018). Capture-based survey methods are invasive and can damage fish health by increasing stress and post-release predation risk (Resources Inventory Committee 1997). In traditional fish surveys that involve the use of fishing gear, survey points are typically determined at a regional scale, covering distances ranging from 10 to thousands of kilometers (Groves et al. 2002; Thornbrugh and Gido 2010; Ekroos et al. 2016). These regional survey points are often selected based on factors such as watershed areas or the Strahler order (Ministry of Environment 2016; U.S. Environmental Protection Agency 2020). The specific distance between selected survey points varies based on the sample frame, ranging from a few kilometers to tens of kilometers, or more in international fish surveys (Jia and Chen 2013; dos Santos et al. 2015). Fish surveys in South Korea typically employ fishing gear within a 5 to 30-km range (Kim and An 2015; Mamun and An 2018). Urban streams flow through areas with a variety of land use types (Roy et al. 2016; Tóth et al. 2019), so it is necessary to identify the fish community through a dense selection of survey sites.
In particular, urban water systems are ecosystems that are threatened by pollution and invasive species (Paul and Meyer 2001; Leitão et al. 2018), but they are also partially restored and managed for ecological functions such as biodiversity (Ministry of Environment 2016a; Shaw et al. 2016). Therefore, innovative and extensive investigation methodologies are needed to continuously monitor spatially varying ecological changes in heterogeneous urban rivers.

Environmental DNA (eDNA) is an emerging method for biomonitoring that can overcome the limitations of conventional surveying and reduce costs and labor requirements (Sigsgaard et al. 2015; Thomsen and Willerslev 2015; Huver et al. 2015). The eDNA survey method detects the DNA released from living organisms in environmental samples (e.g., air, water, or soil). It can be used for biomonitoring because the released DNA persists for hours to days, depending on environmental conditions, before it is degraded, and therefore provides biological information for that period (Dejean et al. 2011; Seymour et al. 2018; Harrison et al. 2019). Research on the application of eDNA technology to surveys of aquatic organisms is actively being conducted to determine its efficiency (Takahara et al. 2012; Jane et al. 2015; Shogren et al. 2018) under various environmental conditions (Yamamoto et al. 2017a; Goutte et al. 2020). Environmental DNA metabarcoding using universal primers can identify multiple species from a single environmental sample with less time and labor (Hänfling et al. 2016; Ushio et al. 2018; Goutte et al. 2020). Studies have shown a positive correlation between eDNA concentration and fish biomass or population (Takahara et al. 2012; Olds et al. 2016; Zhang et al. 2019). Additionally, metabarcoding has been used to study groups of species to elucidate fish community composition and relative abundance (Sard et al. 2019), as well as to determine their spatial distribution and preferred habitat within the target area (Takahara et al. 2012; Yamamoto et al. 2017b).

This study examined the association between fish species composition and community structures (including diversity, richness, and abundance) determined through eDNA metabarcoding surveys at the local scale, in relation to environmental factors. In this context, the phrase 'local scale' is used in a relative sense, in comparison to conventional fish surveys typically conducted over tens of kilometers. The distance between sampling points was set at 2 km considering the size and environmental characteristics of the urban stream network under study. First, to assess the detectability of fish by eDNA metabarcoding, we compared the eDNA survey results with those of a conventional survey based on kick and casting nets (conducted during this study) and with historical reports. The conventional fish survey performed in this study was conducted according to the national fish monitoring manual (National Institute of Environmental Research 2016). Second, the effects of different land use types (forest and grassland, urban, agricultural, and bare land) on fish community structure were investigated. Additionally, changes in fish composition were evaluated, and fish species were categorized by tolerance guild and community structure according to stream order.

Methods

Study area

Anyang stream is an urban stream that originates in the city of Uiwang, Gyeonggi-do, Korea, and flows through the cities of Gunpo, Anyang, Gwangmyeong, and Seoul to the Han River (Fig. 1). The sub-basin area is 286 km², and the stream is 32.5 km long. The average temperature of the study area is 27.6 to 28.4°C in summer and 0 to −3°C in winter. The average annual precipitation is 1346.7 mm, and precipitation occurs mostly during the monsoon season. The ‘Comprehensive Plan to Save Anyang Stream’ was implemented from 2001 to 2010, and 25.6 km of the stream within the city of Anyang has been restored to improve water quality, facilitate stream ecosystem recovery, and create waterfront space. As a result of the restoration, the biological oxygen demand (BOD), which is an indicator of pollution, decreased from 30 ppm in 2000 to 5 ppm in 2010. Fish biodiversity has also increased by an estimated 10 species compared to the period before the project. Parts of Anyang stream and its tributaries have been designated as natural conservation or restoration areas, while other parts have been restored to create artificial space, including waterfront and park areas. Consequently, the Anyang stream network is a mixture of natural streams and seminatural areas influenced by existing artificial land use, leading to high landscape heterogeneity.

Fig. 1

Location and elevation of the study area. Circles represent eDNA sampling sites in the Anyang main stream and its tributaries (Mokgam, Okgil, Sammak, Samsung, Suam, and Hakui streams), and the color of the dots represents the stream order. Water quality measurements were conducted simultaneously at each site

eDNA metabarcoding

Sampling of eDNA and DNA extraction

To assess the variability of the fish distribution among sites, eDNA sampling was conducted at 31 locations on July 16–17, 2020. The sites were located in the main Anyang Stream (15 sites) and six tributaries: Mokgam Stream (six sites), Okgil Stream (two sites), Sammak Stream (one site), Samsung Stream (three sites), Suam Stream (one site), and Hakui Stream (four sites) (Figs. 1 and 2). Surface water was collected into sterile bags at the access points in the target area. About 240 mL of water from each site was filtered through a Sterivex filter (pore size, 0.45 µm; Merck, Darmstadt, Germany) using a 30 mL syringe. Sampling volume was determined according to Mächler et al. (2016) and Wilcox et al. (2018), with the aim of identifying the overall fish fauna in the study area rather than detecting rare species. After sampling, the filters were individually placed in a zipper bag at each access point, stored in an icebox, and then transported to the laboratory, where they were stored at –20°C until DNA extraction was performed. Contamination was monitored using negative controls, and species that appeared in the negative control were excluded from the list of species (Appendix 1: Table S1). Extraction of DNA from the filters was performed using DNeasy Blood and Tissue Kits (QIAGEN, Hilden, Germany). The extracted DNA was quantified on a Nanodrop 2000c (Thermo Fisher Scientific, Waltham, MA, USA) and stored at –20°C.

Fig. 2

Photographs of the sampling sites in the Anyang mainstream and its tributaries. At sites (a)–(i), both eDNA sampling and traditional surveying were conducted. The numbering of the study sites was as follows: (a) S1, (b) S4, (c) S12, (d) S16, (e) S18, (f) S24, (g) S26, (h) S28, and (i) S30

Library preparation and MiSeq sequencing

Two-step polymerase chain reaction (PCR) was conducted for library preparation. The first PCR was performed using the universal MiFish primer set (Miya et al. 2015) to amplify the mitochondrial 12S region. The procedure involved 35 cycles with a total reaction volume of 12 µL, comprising 6 µL of KAPA HiFi ready mix (KAPA Biosystems, Inc., Wilmington, MA, USA), 0.72 µL of primer mix, 3.28 µL of ultra-pure water, and 2 µL of DNA. The thermal cycling conditions for this step were as follows: denaturation at 98°C for 20 s, annealing at 65°C for 15 s, extension at 72°C for 15 s, and a final extension at 72°C for 5 min. The PCR products were visualized using 1% agarose gel electrophoresis. Prior to the second PCR, the PCR product was purified using 20 µL of Ampure XP Beads. The amplicons were diluted to 1/10 and used as a template for the second PCR, which involved 12 cycles carried out under the same conditions as the first PCR. The total reaction volume was 12 µL, including 1 µL of each unique dual index identifier (UDIs, a total of 2 µL), i.e., P5 (Nextera, S5xx) and P7 (Nextera, N7XX), 6 µL of 2X KAPA HiFi ReadyMix (KAPA Biosystems, Inc., Wilmington, MA, USA), 3 µL of ultra-pure water, and 1 µL of template DNA (first PCR product). Amplicons from each sample were diluted to equimolar concentrations, pooled, and sequenced on the MiSeq 300PE platform (Illumina, San Diego, CA, USA) with a 600 bp read length.

Species identification based on next-generation sequencing result

The MiFish Pipeline (http://mitofish.aori.u-tokyo.ac.jp/mifish) was used to establish a list of species detected by the eDNA survey from FASTQ files of the next-generation sequencing results. The MiFish Pipeline covers the steps from a FASTQC data quality check to phylogenetic analyses (Sato et al. 2018). The Basic Local Alignment Search Tool (BLAST), run against the local MitoFish database (version 3.75), was used to find regions of local similarity (> 97%) for the sequences in the samples. Saltwater fish that do not inhabit fresh water and non-fish species were excluded (Appendix 1: Table S1). Although the study sites are in an urban area that is geographically separated from saltwater environments, genetic material of other species may have flowed into the target sites from sources such as sewage treatment plants and been detected during sampling (Darling et al. 2021). Because metabarcoding may overestimate closely related species, the reliability of the freshwater fish species detected by eDNA was secured by screening them against the fish species list of the Han River system, of which the Anyang stream water system is a part (Ministry of Environment 2017a, 2017b; National Institute of Biological Resources 2020). Fish identified by eDNA metabarcoding were classified by their resistance characteristics, which represent the degree of tolerance to pollution. According to the National Institute of Environmental Research (2016), as outlined in Article 9–3 of the Water Environment Conservation Act 2020, fish are categorized based on their degree of pollution tolerance as follows: (i) sensitive species (SS), which are severely affected by water pollution; (ii) tolerant species (TS), demonstrating resistance to water pollution; and (iii) moderately tolerant species (IS), displaying characteristics intermediate between the other two types. The resistance characteristics of fish referred to in this study are summarized in Appendix 2: Table S2.
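
The screening steps described above can be sketched in a few lines of Python. This is an illustrative outline only and not the MiFish Pipeline itself: the record fields (`species`, `identity`, `reads`), the function name `screen_hits`, and the example values are assumptions introduced here, and the reference set merely stands in for the Han River system fish list.

```python
def screen_hits(hits, reference_species, min_identity=97.0):
    """Sum reads per species for hits that pass both screening steps."""
    kept = {}
    for hit in hits:
        if hit["identity"] <= min_identity:
            continue  # similarity must exceed the threshold (> 97%)
        if hit["species"] not in reference_species:
            continue  # drops non-fish, saltwater fish, and unlisted species
        kept[hit["species"]] = kept.get(hit["species"], 0) + hit["reads"]
    return kept


# Hypothetical records for illustration; not data from this study.
example_hits = [
    {"species": "Zacco platypus", "identity": 99.4, "reads": 1200},
    {"species": "Zacco platypus", "identity": 98.1, "reads": 300},
    {"species": "Homo sapiens", "identity": 100.0, "reads": 50},
    {"species": "Pseudorasbora parva", "identity": 96.5, "reads": 80},
]
example_reference = {"Zacco platypus", "Pseudorasbora parva"}
screened = screen_hits(example_hits, example_reference)
print(screened)
```

Applying the similarity threshold before the regional list keeps low-confidence matches from being counted even when the matched name is a plausible local species.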

Spatial variable measurements

The characteristics of the target site were analyzed using the Quantum geographic information system (QGIS, Desktop version 3.14.0) with a particular focus on two categories: land use around the target site and stream order (Table 1). In this study, to assess the impact of land use on fish communities, we defined and analyzed a spatial scale of 2 km or less as the 'local scale' and a spatial scale of 2 km or more as the 'catchment scale (regional scale)'.

Table 1 List of environmental variables, their codes, and their descriptive statistics

Land use was evaluated within a 500 m buffer around the target sites for the local scale study, and 18 catchments were selected for the catchment scale study (Appendix 3: Figure S1). The catchment map ‘Korea Reach File v.3.0’ was downloaded from the water information system (https://water.nier.go.kr/web). The land use in the study was classified into four categories: urban area (Ur), agricultural area (Ag), forest and grassland (Fg), and bare land (Ba) (Fig. 3). A digital elevation model (DEM) was used, and land use was identified through analysis of land cover data downloaded from the Environmental Spatial Information Service (https://egis.me.go.kr/). The land cover map used in this study was based on an airborne digital ortho-image acquired from 2017 to 2018 and was classified into 41 land use types with a 1-m resolution.

Fig. 3

Land use proportions at the local scale (500 m buffer) and catchment scale of 31 sampling sites

The stream order, identified where differences in physical characteristics occurred, was classified using the Horton–Strahler method and represented the stream size (Mamun and An 2018). According to the Horton–Strahler system, headwater stream links are assigned an order of one, and if a stream is joined by another of the same order, the stream order rises by 1 (Horton 1945; Strahler 1957). Stream order is one of the factors that reflect changes in ecological characteristics, such as water quality and fish species, along the longitudinal gradient of the stream (Vannote et al. 1980).
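
The ordering rule can be illustrated with a short Python sketch. The network below is hypothetical (it is not the Anyang stream network), and the function name `strahler_orders` and the dictionary layout are assumptions made for this example; the study derived stream order in a GIS workflow.

```python
def strahler_orders(upstream):
    """Return the Strahler order of every segment.

    `upstream` maps a segment name to the list of segments flowing into it.
    Segments with no listed tributaries are treated as headwaters (order 1).
    """
    orders = {}

    def order(segment):
        if segment in orders:
            return orders[segment]
        tributaries = upstream.get(segment, [])
        if not tributaries:
            result = 1
        else:
            ranked = sorted((order(t) for t in tributaries), reverse=True)
            top = ranked[0]
            # The order rises only when at least two tributaries share the top order.
            result = top + 1 if ranked.count(top) >= 2 else top
        orders[segment] = result
        return result

    for segment in upstream:
        order(segment)
    return orders


# Hypothetical network: two headwater pairs and one extra headwater.
example_network = {
    "outlet": ["a", "b"],
    "a": ["h1", "h2"],
    "b": ["h3", "c"],
    "c": ["h4", "h5"],
}
example_orders = strahler_orders(example_network)
print(example_orders["outlet"])
```

Note that segment `b` keeps order 2 because it joins a first-order and a second-order stream, which is the case the rule in the text leaves implicit.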

The basic water quality parameters of the eDNA survey points were measured using a Pro Plus multiparameter water quality meter (YSI, Yellow Springs, OH, USA). Four parameters were measured: water temperature (Temp, ℃), dissolved oxygen (DO, mg L−1), pH, and conductivity (Cond, μS cm−1).

Comparison of the eDNA survey and conventional field surveys

The fish species obtained through the eDNA survey described in ‘2.2 eDNA metabarcoding’ were cross-referenced with those obtained through the conventional field survey and the historical data survey, enabling identification of shared and unique species among the methodologies. The procedures used to compile the fish lists for the conventional field survey and historical data were as follows.

Conventional field survey using fishing gear

To compare the eDNA and conventional survey results, nine sites (S1, S4, S12, S16, S18, S24, S26, S28, and S30) considered to have physicochemical characteristics representative of each stream were selected (Fig. 2). Conventional field sampling was conducted on August 26–31, 2020, using a kick net (4 × 4 mm) and casting net (6 × 6 mm). Based on the temperature pattern in Korea, seasons are classified into spring (March–April), summer (June–August), and autumn (September–October) (Choi et al. 2006); the July and August samples were therefore assumed to have been acquired during the same summer season. Fish were collected over 40 min using the kick net, and 10 times using the casting net, at each survey point. The fish collected were released at the site after on-site species-level classification by morphological traits based on Korean reference books (Kim 1997; Kim and Park 2002; Kim et al. 2005). This survey method is used in the national natural environment survey conducted regularly in Korea (Ministry of Environment 2017a, 2017b).

Historical data survey

The fish list from historical monitoring literature, hereafter historical data, was derived from Anyang stream (2016–2017) and the 4th National Natural Environment Survey (Ministry of Environment 2017a, 2017b). The method used to compile the historical data was the same as the field survey method in this study (‘2.4.1 Conventional field survey using fishing gear’). Field survey sites included a total of 26 sites in the Anyang stream network, which is the same spatial scope as this study. The regular monitoring data from municipal governments and the Korean Ministry of Environment were obtained by conventional surveys using both casting and kick nets, conducted in June and October from 2017 to 2019 (Appendix 4: Table S3).

eDNA metabarcoding-based fish community structure analyses and correlations with environmental factors

Fish abundance and the proportion of total individuals used to calculate fish diversity were estimated from the natural logarithm of the total number of reads of each species detected at the study site. Log-transformation of eDNA read counts is justifiable, particularly for estimating species abundance correlated with biomass or density (Rourke et al. 2022; Nakahara et al. 2012; Yates et al. 2019, 2021) and when considering the decay rate, where shorter DNA persistence in the environment aligns with proportional DNA copy representation (Breton et al. 2022). Therefore, log-transformation was judged appropriate for examining trends in fish habitation, and the number of eDNA reads was used to estimate the diversity and abundance of fish. The equation used in the fish diversity analysis was as follows (Shannon 2001):

$$H' = -\sum_{i=1}^{S} P_i \ln P_i$$

where S is the number of species in the community and P_i is the proportion of the total individuals belonging to species i. Richness represented the number of species detected per sample at the study site.
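
As a minimal sketch of how these indices could be computed from read counts, the Python example below follows one possible reading of the description above: each species' abundance is the natural logarithm of its read count, site abundance is the sum of those values, and P_i is each species' share of that sum. That reading, the function name `community_metrics`, and the example counts are assumptions for illustration; the study's analyses were run in R.

```python
import math


def community_metrics(reads):
    """Compute abundance, richness, and Shannon diversity for one site.

    `reads` maps species name to raw eDNA read count.
    """
    detected = {sp: n for sp, n in reads.items() if n > 0}
    richness = len(detected)
    log_abundance = {sp: math.log(n) for sp, n in detected.items()}
    total = sum(log_abundance.values())
    shannon = 0.0
    if total > 0:
        for value in log_abundance.values():
            if value > 0:  # a species with one read has ln(1) = 0 and adds nothing
                p = value / total
                shannon -= p * math.log(p)
    return {"abundance": total, "richness": richness, "shannon": shannon}


# Hypothetical read counts for illustration; not data from this study.
example_site = {"sp_a": 100, "sp_b": 100, "sp_c": 100, "sp_d": 100, "sp_e": 0}
example_metrics = community_metrics(example_site)
print(example_metrics)
```

With four equally represented species the index reduces to ln 4, which is a convenient sanity check for any implementation.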

Multiple regression analyses with fixed effects were used to investigate the relationships between fish community structures (response variables) and environmental factors (explanatory variables) at 31 study sites, enhancing precision in isolating the effects of variables of interest such as water quality and physical traits (Maas and Hox 2006; Du and Wang 2016). As the type of land use affects water quality, water quality parameters were considered together with the land use type in the multiple regression analyses. The environmental factors consisted of water-quality characteristics (Temp, Cond, DO, and pH) and physical traits, including land use ratio, stream order, and elevation. The land use ratios at the two spatial scales were entered into separate multiple regression models to consider the effect of each scale on fish community structures. The explanatory variables were filtered by a stepwise algorithm with bidirectional elimination. Bidirectional elimination selects variables by comparing the AIC value of the model in which all variables are considered with that of models in which a specific variable is excluded (Chambers and Hastie 1992). The automatic bidirectional algorithm considers the relatedness between variables and cross-validation and determines the number of variables selected for the model (Vittinghoff et al. 2012). In the variance inflation factor check, variables with a value over 10 were excluded to prevent multicollinearity. Spatial autocorrelation was excluded by spatial thinning based on the home range of the freshwater fish species (Lewis and Flickinger 1967; Jones and Stuart 2007; Lapointe et al. 2013). Moreover, excluding spatial autocorrelation prevents overestimation and failure of spatial variable aggregation arising from different resolutions of spatial variables (Gangodagamage et al. 2008; Sillero and Barbosa 2020). Analysis of variance (ANOVA) was used to compare the original model with the variable-selected model after the stepwise algorithm and to confirm that the excluded variables made no significant contribution to the model.
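
For reference, the two quantities used in the variable selection above have standard definitions, written here in generic notation that is not taken from the original text: $k$ is the number of estimated parameters, $\hat{L}$ is the maximized likelihood of the model, and $R_j^2$ is the coefficient of determination obtained when explanatory variable $j$ is regressed on the remaining explanatory variables.

$$\mathrm{AIC} = 2k - 2\ln\hat{L}, \qquad \mathrm{VIF}_j = \frac{1}{1 - R_j^2}$$

Under these definitions, the exclusion threshold of 10 corresponds to $R_j^2 > 0.9$, and a lower AIC indicates a better trade-off between goodness of fit and model complexity.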

The Kruskal–Wallis test was employed to assess statistically significant differences in the fish community structure indices (abundance, richness, and diversity) among stream orders. The post hoc test evaluated significant differences in fish composition by tolerance guild among stream orders using the Mann–Whitney test with Bonferroni correction. Statistical differences in fish communities among stream orders, classified by the Horton–Strahler method (Horton 1945; Strahler 1957), were evaluated by analysis of similarities (ANOSIM) in R. In addition, nonmetric multidimensional scaling (NMDS) with sequence data without log-transformation, derived from the refined fish species list, was used to describe the fish species distribution pattern and fit the environmental factors on a two-dimensional plot. To assess the impact of land use ratio on fish distribution patterns at different scales, the NMDS analysis incorporated the land use ratios at both scales simultaneously. Calculations were performed using the vegan package version 2.5–7 (Oksanen et al. 2015) in R software (version 4.0.2, R Core Team 2021).
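
The Bonferroni step of the post hoc comparison can be sketched as follows. With three stream orders there are three pairwise comparisons, so each unadjusted p-value is multiplied by three and capped at 1. The p-values below are hypothetical placeholders rather than results from this study, and the function name `bonferroni_pairwise` is an assumption for this example.

```python
from itertools import combinations


def bonferroni_pairwise(p_values):
    """Multiply each pairwise p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return {pair: min(1.0, p * m) for pair, p in p_values.items()}


stream_orders = [1, 2, 3]
pairs = list(combinations(stream_orders, 2))

# Hypothetical unadjusted p-values, one per pair; not results from this study.
raw_p = dict(zip(pairs, [0.004, 0.010, 0.600]))
adjusted = bonferroni_pairwise(raw_p)
for pair, p in adjusted.items():
    print(pair, round(p, 3))
```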

Results

Across 31 samples, a total of 2,099,959 eDNA reads were obtained and 89 species were detected. In the negative control, only Homo sapiens was detected, and it was removed from the species detection result. The raw number of eDNA reads per sampling site was 67,740 ± 18,004 (mean ± SD). After quality filtering to remove non-freshwater fish and non-fish species, a total of 1,419,062 eDNA reads remained, which is 67.58% of the raw sequence data, and the number of eDNA reads per site decreased to 45,776 ± 19,177 (mean ± SD). A total of 56 species, comprising saltwater fish that do not inhabit fresh water and non-fish species, were excluded (Appendix 1: Table S1).
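
The read and species totals reported above can be cross-checked with simple arithmetic. The short Python sketch below uses only the figures stated in this paragraph; the variable names are chosen here for illustration.

```python
raw_reads = 2_099_959
filtered_reads = 1_419_062
n_sites = 31
species_detected = 89
species_excluded = 56

retained_pct = 100 * filtered_reads / raw_reads
mean_raw_per_site = raw_reads / n_sites
mean_filtered_per_site = filtered_reads / n_sites
species_retained = species_detected - species_excluded

print(f"retained reads: {retained_pct:.2f}%")
print(f"mean raw reads per site: {mean_raw_per_site:.1f}")
print(f"mean filtered reads per site: {mean_filtered_per_site:.1f}")
print(f"species retained: {species_retained}")
```

The 33 species left after exclusion match the number of freshwater fish species reported for the eDNA survey.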

The eDNA survey conducted in Anyang stream identified 33 species belonging to 13 families of freshwater fish after quality filtering (Appendix 5: Table S4). On average, three more species were identified per site in the main stream (12.8 ± 3.16 species) than in the tributaries (9.8 ± 5.3 species). The presence of fewer than three fish species at upstream sites S24, S25, and S31 influenced the standard deviation of the number of species detected in tributaries. The dominant species was Pseudorasbora parva, with a total of 446,654 reads (31.48%), followed by Rhynchocypris oxycephalus (249,916, 17.61%) and Odontobutis interrupta (224,882, 15.85%). At the family level, Cyprinidae accounted for 68.63% of the total, followed by Odontobutidae (15.85%), Cobitidae (7.69%), Mugilidae (3.19%), and Channidae (2.12%). Four species of exotic fish were identified: Micropterus salmoides, Lepomis macrochirus, Paramisgurnus dabryanus, and Carassius cuvieri (Appendix 5: Table S4).

The conventional field survey yielded a total of seven families and 18 fish species at nine sites (Appendix 6: Table S5). The dominant species in the target area was Zacco platypus, with a total of 186 individuals (15.90%), followed by Rhynchocypris oxycephalus (151 individuals, 15.19%), Lepomis macrochirus (145 individuals, 14.59%), and Carassius auratus (110 individuals, 11.07%). At the family level, Cyprinidae (74.04%) accounted for the largest relative abundance, followed by Centrarchidae (14.59%), Cobitidae (4.12%), Odontobutidae (4.12%), Gobiidae (2.62%), Poeciliidae (0.40%), and Channidae (0.10%). Among the exotic species, Lepomis macrochirus, the ornamental fish Poecilia reticulata, and Carassius cuvieri were collected.

When eDNA surveys and traditional surveys using fishing gear were conducted at the same sites, the eDNA survey detected 2 to 17 species per site, whereas the traditional survey collected 2 to 10 species (Appendix 5: Table S4, Appendix 6: Table S5). The difference in the number of identified species between the survey methods ranged from 0 to 11 species, with a mean of 3.11 species and a standard deviation of 3.38. Additionally, both the eDNA and conventional field surveys confirmed that common fish species distributed throughout Korea dominated the Anyang stream network.

Comparison of the eDNA survey and conventional field surveys

For an accurate species list comparison, only the species detected at the nine sites where the conventional and eDNA surveys were conducted were compared. In total, 17 species were found in historical data and 18 species were found in a conventional field survey. The eDNA surveys detected 12 of the 17 species (70.6%) that appeared in the historical data. Of the 18 species identified in traditional surveys, 12 (66.7%) were found in eDNA surveys (Fig. 4). Seven species were identified by all survey methods: Carassius auratus, Cyprinus carpio, Misgurnus anguillicaudatus, Odontobutis interrupta, Pseudogobio esocinus, Rhynchocypris oxycephalus, and Zacco platypus. Among the commonly observed species, six species of Cyprinidae, one of Cobitidae, and one of Odontobutidae were identified. Ten species were observed only in eDNA surveys. Eight species were exclusively found in either the historical data or the conventional field survey (Fig. 4). Furthermore, the eDNA survey revealed an additional 10 fish species, such as Acheilognathus macropterus and Anguilla japonica, compared to the species identified in the historical data and collected through conventional field surveys (Fig. 4).
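
The detection percentages above are set overlaps, which can be sketched in Python as follows. The species identifiers are placeholders that only reproduce the reported counts (17 historical species, 18 conventionally surveyed species, 12 of each also detected by eDNA); they do not represent the actual species lists or the overlap between the historical and conventional lists, and the function name `detection_rate` is an assumption.

```python
def detection_rate(reference, edna):
    """Share of reference species that also appear in the eDNA list."""
    return len(reference & edna) / len(reference)


# Placeholder identifiers that reproduce only the reported counts.
historical = {f"H{i:02d}" for i in range(17)}
conventional = {f"C{i:02d}" for i in range(18)}
edna = {f"H{i:02d}" for i in range(12)} | {f"C{i:02d}" for i in range(12)}

historical_rate = 100 * detection_rate(historical, edna)
conventional_rate = 100 * detection_rate(conventional, edna)
print(f"historical: {historical_rate:.1f}%")
print(f"conventional: {conventional_rate:.1f}%")
```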

Fig. 4

Venn diagram comparing fish species among the eDNA survey (A), traditional survey (B), and historical data (C). The number of collected species is in parentheses

Effects of land use on fish community structure

Fish community structures (abundance, richness, and diversity) calculated from the eDNA survey were significantly correlated with land use and water quality parameters in multiple linear regression analyses (see Table 2 for p-values). The model best describing the environmental factors affecting fish community structures varied depending on the scale of the analysis, i.e., local (500 m buffer) or catchment scale (Table 3). At the local scale, the regression analyses indicated that pH and the proportions of urban area (LUr) and forest and grassland (LFg) were positively correlated with species abundance (p < 0.05), while elevation was negatively correlated with abundance (p < 0.001). In the catchment scale analyses, CUr and CFg were not significant factors, while the proportion of agricultural area (CAg) was negatively correlated with abundance (p < 0.001). Based on the abundance results, LUr and LFg were associated with an increase in the size of the fish population, whereas CAg was associated with a smaller fish population. Species richness displayed a similar pattern to that of abundance. The number of species increased with LUr and LFg, whereas CAg had a negative association with species richness (Table 2). Species diversity trends were in the opposite direction to those of abundance and richness. The expansion of LAg would negatively affect the maintenance of fish community diversity (p < 0.05), while the expansion of CUr and CFg would have a positive effect on fish diversity (p < 0.01) (Table 2). Elevation was negatively correlated with community structure (p < 0.001), while pH had a significant positive association with all metrics at the local scale, except richness (p < 0.01).

Table 2 Results of multiple regression models of the environmental variables affecting fish abundance, richness, and diversity at the local and catchment scales
Table 3 Set of candidate models for describing fish community structures on a local scale and catchment scale land use analysis. The Akaike information criterion (AIC) was calculated to evaluate the relative quality of each statistical model

Differences in fish community structures according to stream order

After classifying the detected fish species according to tolerance guilds (Appendix 5: Table S4, Appendix 2: Table S2), fish community structure and stream order were found to be correlated (Fig. 5). According to these reclassified data, there were 3 sensitive species (9%), 13 moderately tolerant species (39%), and 17 tolerant species (52%) in the study area. The post hoc Mann–Whitney test indicated that the community structure of all tolerance guilds, except moderately tolerant species, was significantly correlated with stream order (Fig. 5, p < 0.05). The community structures (i.e., richness, diversity, and abundance) of sensitive species, which were strongly affected by water pollution, decreased in the order of first-, second-, and third-order streams. Conversely, for moderately tolerant species, there was an increase in richness and diversity in the order of first-, second-, and third-order streams, although there was no significant difference in abundance among the streams. Similarly, the community structure of tolerant species, which were resistant to water pollution, was highest in the first-order streams. However, there was no significant difference between the second- and third-order streams. This indicated that the second- and third-order streams had similar environmental characteristics and fish compositions. Diversity and abundance were higher in larger streams, but this could have been due to an increase in moderately tolerant and tolerant species.

Fig. 5

Violin plots of fish community structure parameters (abundance, richness, and diversity) and tolerance guilds (sensitive, moderately tolerant, and tolerant species) according to stream order (first, second, or third) (*p < 0.05). The white box in the violin plot indicates the interquartile range and the black line in the middle is the median value. The width of the violin plot represents the probability of observations with a given value

To better understand the interplay between fish composition and environmental factors, NMDS analysis was additionally used to evaluate the relationships of land use, stream order, and water quality with fish distribution (Fig. 6, Table 4). Comparison of the clusters by ANOSIM confirmed a significant difference among stream orders (R = 0.294, p < 0.001), but second- and third-order streams did not have significantly different fish communities. This result was similar to that shown by the violin plot, in which there were no differences in fish community structure between second- and third-order streams (Fig. 5, Fig. 6). Axis 1 of the NMDS was significantly affected by Ur (p < 0.001) and Fg (p < 0.001) at the local and catchment scales, and by elevation, while axis 2 was influenced by Ag at the local scale (p < 0.05) (Table 4). The proportion of Ur was positively correlated with Temp, especially at the local scale. At the catchment scale, the proportions of Fg and Ba were positively correlated with DO, while the proportion of Ag was negatively correlated. The proportion of Ag at both scales had a positive correlation with Cond, whereas the proportion of Fg displayed the opposite result.

Fig. 6

Nonmetric multidimensional scaling (NMDS) analyses of fish composition and environmental factors, including land use, water quality, and elevation. The ellipse is derived from the stream order based on the standard deviation. The axes in the NMDS plot are as follows: LAg: Local scale agricultural area, LFg: Local scale forest and grassland, LUr: Local scale urban area, LBa: Local scale bare land area, CAg: Catchment scale agricultural area, CFg: Catchment scale forest and grassland, CUr: Catchment scale urban area, CBa: Catchment scale bare land area, Temp: Water temperature, and DO: Dissolved oxygen. The species in the NMDS plot are O. sine: Oryzias sinensis, A. japo: Anguilla japonica, T. fulv: Tachysurus fulvidraco, C. argu: Channa argus, M. angu: Misgurnus anguillicaudatus, M. mizo: Misgurnus mizolepis, P. dabr: Paramisgurnus dabryanus, A. rivu: Abbottina rivularis, A. inte: Acheilognathus intermedia, A. macr: Acheilognathus macropterus, A. chan: Acheilognathus chankaensis, C. aura: Carassius auratus, C. cuvi: Carassius cuvieri, C. carp: Cyprinus carpio, G. stri: Gnathopogon strigatus, H. leuc: Hemiculter leucisculus, N. temm: Nipponocypris temminckii, P. vail: Pseudogobio vaillanti, P. parv: Pseudorasbora parva, R. oxyc: Rhynchocypris oxycephalus, S. sold: Sarcocheilichthys soldatovi, S. grac: Squalidus gracilis, Z. plat: Zacco platypus, G. urot: Gymnogobius urotaenia, R. giur: Rhinogobius giurinus, L. haem: Liza haematocheila, L. cost: Lefua costata, M. swin: Micropercops swinhonis, O. inte: Odontobutis interrupta, P. alti: Plecoglossus altivelis, S. micr: Silurus microdorsalis, M. salm: Micropterus salmoides, and L. macr: Lepomis macrochirus

Table 4 Pearson correlations between variables and ordination axes of the NMDS. Variables correspond to the proportion of land use by local (L−) and catchment area (C−), water quality parameters, and elevation
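The ANOSIM comparison of fish communities among stream orders reported above can be illustrated with a short sketch. It assumes Bray-Curtis dissimilarity (the dissimilarity measure is not stated in this section) and a simple label-permutation test. The function names and the read-count matrix are hypothetical, so the output will not reproduce the reported R = 0.294.

```python
import random
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def average_ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def r_statistic(ranks, pairs, groups):
    """R = (mean between-group rank - mean within-group rank) / (n(n-1)/4)."""
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    n = len(groups)
    return (sum(between) / len(between) - sum(within) / len(within)) / (n * (n - 1) / 4)


def anosim(samples, groups, permutations=999, seed=0):
    """Return (R, permutation p-value) for abundance vectors grouped by label."""
    pairs = list(combinations(range(len(samples)), 2))
    ranks = average_ranks([bray_curtis(samples[i], samples[j]) for i, j in pairs])
    observed = r_statistic(ranks, pairs, groups)
    rng = random.Random(seed)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # break the link between sites and stream order
        if r_statistic(ranks, pairs, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical read counts (rows = sites, columns = species); not study data.
sites = [
    [40, 5, 0, 0], [35, 8, 1, 0], [38, 4, 0, 2],     # first-order sites
    [5, 30, 20, 10], [8, 25, 22, 12], [4, 28, 18, 15],  # second-order sites
    [6, 27, 21, 11], [3, 31, 19, 14], [7, 26, 23, 9],   # third-order sites
]
orders = ["first"] * 3 + ["second"] * 3 + ["third"] * 3

r, p = anosim(sites, orders)
print(f"R = {r:.3f}, p = {p:.3f}")
```

R ranges from about -1 to 1, with values near 0 indicating that sites within a stream order are no more similar to each other than to sites in other orders. Pairwise calls on two orders at a time would correspond to the comparison of second- and third-order streams described above.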

Discussion

eDNA survey as a fish investigation method in restored urban streams

An eDNA survey requires less labor and monetary investment than conventional survey methods using fishing gear, such as the kick and casting nets used in national fish monitoring (Peck et al. 2003; National Institute of Environmental Research 2016; Sard et al. 2019; Goutte et al. 2020), and is considered useful for describing differences in fish composition within an urban stream network (Nakagawa et al. 2018). In this study, Cyprinidae accounted for the highest proportion of species in both the eDNA and conventional surveys, suggesting that Cyprinidae are easily collected and detected due to their high population density in the study area (Skelton et al. 2022). To evaluate the eDNA-based detectability of fish species, the detection results must be compared with historical and conventional survey data. Previous studies have shown eDNA surveys to be a viable biomonitoring method, consistently detecting 67% to 88% of the species documented in traditional, capture-based surveys (Hänfling et al. 2016; Nakagawa et al. 2018; Gillet et al. 2018). In this study, eDNA detected over 65% of the fish species that were identified in conventional surveys and the literature, demonstrating its suitability for species monitoring even when the survey methods differ in timing. The collection of fish species data with kick nets and casting nets involved 21 investigators able to distinguish species by their morphological characteristics, across 7 field campaigns (5 to 12 sites) in 3 projects from 2014 to 2017 and in 2020. In contrast, the eDNA survey in this study characterized the fish community at 31 sampling sites in just 2 days with 2 people, who were needed only for sampling and water quality measurement rather than species identification (Appendix 4: Table S3).

In our study, six species were not identified by the eDNA survey despite being in the MiFish database and appearing in historical data or conventional field survey results. The reasons for these discrepancies were investigated. One of the six species, P. reticulata, is an ornamental tropical fish that thrives in water temperatures above 25 °C and is unlikely to survive the winter, when water temperatures drop below 10 °C. Thus, it is regarded as a non-resident species, with July sightings attributed to human-mediated introductions. H. eigenmanni, T. brevispinis, A. lactipes, P. herzi and S. asotus did not appear in the literature but were collected in field surveys. Due to the occurrence of type 1 (false-positive) and type 2 (false-negative) errors, ichthyofauna identifications obtained from eDNA-based surveys may be contentious (Roussel et al. 2014; Lahoz-Monfort et al. 2016). This is related to the recall (hit rate), i.e., the proportion of true results that are identified by sampling as true (e.g., the percentage of fish collected alive). In traditional monitoring surveys, false negatives and false positives, such as those related to the misidentification of species, are also a common concern (Robert Britton et al. 2011; Lintermans 2016). We also experienced this inconsistency, because only 40.0% (10/25) of the species were present in both the traditional field survey and historical data, despite the use of the same conventional method (Fig. 4). Further studies with repeated experiments that consider the timing, collection method, and sampling volume should be conducted in urban streams to obtain stable survey results and reduce variation in detected species among samples.
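The detection percentages discussed above reduce to simple set arithmetic, sketched below. The function names and the placeholder species IDs are hypothetical. The example reproduces only the 10/25 = 40.0% figure, on the assumption that 25 is the combined species pool of the field survey and the historical data.

```python
def detection_rate(reference, detected):
    """Recall: share of reference species that also appear in the detected set."""
    reference, detected = set(reference), set(detected)
    if not reference:
        raise ValueError("reference species list is empty")
    return len(reference & detected) / len(reference)


def shared_fraction(survey_a, survey_b):
    """Share of the combined species pool that both surveys recorded."""
    survey_a, survey_b = set(survey_a), set(survey_b)
    pool = survey_a | survey_b
    if not pool:
        raise ValueError("both species lists are empty")
    return len(survey_a & survey_b) / len(pool)


# Placeholder species IDs chosen so that 10 of 25 species are shared;
# these are not the study's actual species lists.
field_survey = {f"sp{i:02d}" for i in range(1, 19)}   # sp01..sp18
historical = {f"sp{i:02d}" for i in range(9, 26)}     # sp09..sp25

print(f"shared by both surveys: {shared_fraction(field_survey, historical):.1%}")
# 10 shared species out of a pool of 25 -> 40.0%
```

Passing the combined conventional and literature list as `reference` and the eDNA detections as `detected` to `detection_rate` would give a recall of the kind quoted for the eDNA survey.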

Effects of land use and stream order on fish distribution

In this study, the NMDS results indicated that fish community structures (abundance, richness, diversity) were affected by water quality parameters, stream order, and land use types. In short, we found that anthropocentric land uses and stream order affected fish distribution by modulating the physiochemical properties of streams. The relationships between land use and water quality parameters were similar to those reported in previous studies (Fig. 6, Table 4). For example, urban area is known to be associated with high water temperature due to the shortage of riparian vegetation and the urban heat island effect, while organic contaminants are attributable to increases in the amounts and types of pollutants in runoff (Allan 2004; Paul and Meyer 2001). Agricultural area degrades the water quality of streams, alters channel morphology and in-stream sediments, and results in higher inputs of nutrients, sediments, and organic matter (Walser and Bart 1999; Allan 2004; Mamun and An 2018). Huang et al. (2016) found that when the proportion of forest area was high, the DO content in water increased, but the forest area was negatively correlated with conductivity, nutrients, and pH. In addition, the result that fish community structures could be classified based on resistance characteristics (which differed according to stream order) was similar to that of Atique and An (2018), who found a decrease in the richness of sensitive species and an increase in the richness of tolerant species in downstream locations. Our results indicated that the proportion of Ag had a negative effect on fish community structure, while the proportion of Fg had positive effects. Contrary to the expectation that the proportion of Ur would negatively affect fish community structure, the proportion of Ur was positively correlated with the community structure. It is well known that the population size and diversity of fish increase with stream size (Vannote et al. 1980; Vander Vorste et al. 2017). The study sites in first-order streams were mainly distributed in the Fg, while study sites with a higher stream order were located in the Ur. Thus, the positive correlation between fish community structure and the urban area might have been due to most of the urban area being located at low elevations with second- and third-order streams.

Methodological considerations: potential and limitations

Our eDNA survey results revealed differences in fish distribution among sites even though the survey was conducted at a fine spatial scale with densely spaced sampling sites. Based on the detailed survey results, the Fg had a more positive influence on fish community structure than the Ur at the local and catchment scales, while the Ag had a negative effect on fish community structure, especially at the catchment scale. It is, therefore, important to manage the Fg in urban stream networks to improve fish abundance and richness, whereas expansion of the Ag in catchments should be considered carefully. Selecting closely spaced sampling sites for fish fauna assessment is essential to fully capture the pronounced spatial heterogeneity of urban streams, which emphasizes the importance of monitoring at a local spatial scale of under 2 km. Moreover, the relationship between stream order and fish community structures found in this study was influenced by moderately tolerant and tolerant species, which accounted for most of the fish species (30 of 33 species; 91%) in the Anyang stream network. As in studies that use fish ecological characteristics and community structures as indicators of stream health, the results of this eDNA survey also have potential for evaluating the environment of urban streams.

The analysis of land use effects was restricted by differences in the relative importance of different land use types as criterion variables depending on the study scale. We examined the impact of land use in specific regions by setting up a 500-m buffer zone in the catchment area of the study site. Thus, a combination of land use types could affect the river environment in a complex manner (Utz et al. 2010). Anyang stream is an urban stream, and the feasibility of determining the effect of any one type of land use on the water environment may be limited. Also, it was not easy to determine the potential positive and negative effects of environmental factors on the fish community at the local scale. At the catchment scale, it can be difficult to determine the physical effects of the riparian environment on streams (Bierschenk et al. 2019). In addition, historic land uses need to be considered when evaluating the impact of changes in land use on streams. Previous studies have shown that long-term water quality monitoring data, together with current land use and fish fauna data, are required to evaluate the effect of land use patterns on fish species (Huang et al. 2016). Therefore, to determine the impact of land use on fish communities in urban streams in future studies, it will be necessary to evaluate changes in land use after accumulating fish survey data over time at the same location.

To evaluate changes in fauna and their abundance using eDNA metabarcoding, it is important to consider certain limitations of this methodology. For example, eDNA methodology is influenced by various factors, both biotic (such as distribution, density, and feeding activity) and abiotic (including water temperature, depth, and flow rate), which can introduce biases and make abundance estimation challenging (Rourke et al. 2022). Additionally, metabarcoding may be subject to amplification bias, leading to inaccurate estimates of abundance (Krehenwinkel et al. 2017). Nevertheless, investigations conducted in various controlled experimental and natural environments have found that DNA read counts often exhibit a positive correlation with biomass and abundance. This suggests their potential as a means of estimating abundance (Takahara et al. 2012; Ushio et al. 2017; Krehenwinkel et al. 2017; Yates et al. 2019). For instance, Breton et al. (2022) demonstrated a positive correlation between amplification levels and the abundance of fish and amphibians in mesocosms, while di Muri et al. (2020) found a proportional relationship between the number of fish populations in natural lake environments and DNA read counts. Consequently, these findings have led to the use of eDNA metabarcoding for comparing seasonal changes in organism abundance and calculating community structures. However, to obtain more reliable eDNA metabarcoding estimates of biomass and abundance, a sampling design that accounts for the variability caused by environmental factors is required (Jo et al. 2019). Utilizing relative read abundance (RRA) can help provide population-level estimates of species abundance while mitigating metabarcoding biases (Deagle et al. 2019). Additionally, to limit PCR bias, primers and PCR conditions appropriate for the target species and detection purpose must be selected.
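As a rough illustration of relative read abundance, the sketch below converts per-sample read counts into proportions and then averages them across samples so that each sample carries equal weight regardless of sequencing depth. The function names and read counts are hypothetical, and this is one common way of computing RRA rather than the procedure used in this study.

```python
def relative_read_abundance(read_counts):
    """Convert one sample's per-species read counts into proportions."""
    total = sum(read_counts.values())
    if total == 0:
        return {species: 0.0 for species in read_counts}
    return {species: n / total for species, n in read_counts.items()}


def mean_rra(samples):
    """Average each species' RRA across samples (equal weight per sample)."""
    species = sorted({sp for sample in samples for sp in sample})
    per_sample = [relative_read_abundance(sample) for sample in samples]
    return {
        sp: sum(rra.get(sp, 0.0) for rra in per_sample) / len(per_sample)
        for sp in species
    }


# Hypothetical read counts for two samples; not study data.
samples = [
    {"Zacco platypus": 5000, "Carassius auratus": 3000, "Pseudorasbora parva": 2000},
    {"Zacco platypus": 200, "Carassius auratus": 600},
]

for species, share in mean_rra(samples).items():
    print(f"{species}: {share:.3f}")
```

Averaging proportions rather than pooling raw reads keeps a deeply sequenced sample from dominating the estimate, although it does not remove primer or amplification bias.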

Conclusion

This study employed environmental DNA metabarcoding to investigate the impact of land use and stream order on fish composition in an urban stream network. The research, conducted at 31 sites within the Anyang stream network in Korea, revealed that eDNA sampling detected more than 65% of the fish species found in historical and catch-based conventional surveys. Although the survey points were densely spaced at 2 km intervals within a single stream network, fish composition reflected the heterogeneity of urban freshwater ecosystems according to physical characteristics including land use and stream order. The study demonstrated positive correlations of fish abundance and species richness with the proportions of urban area (Ur) and forest and grassland (Fg), and a negative correlation with the proportion of agricultural area (Ag). Moreover, a shift in fish community composition was observed from first- to third-order streams, with a decrease in sensitive species and an increase in tolerant species. This suggests that ecologically restored streams within anthropocentric urban areas can attain ecological properties and serve as refuges for sensitive species. Furthermore, it underscores the need for more extensive surveys at a finer spatial scale to comprehensively evaluate the state of urban streams.