Introduction

Rivers are important freshwater resources used for domestic water supply, agricultural irrigation, and industrial purposes, and their water quality is directly related to the safety of domestic water for riverside residents (Vega et al. 1998; Razmkhah et al. 2009). River water quality is affected not only by natural factors such as precipitation and land use type, but also by human activities such as the discharge of industrial wastewater and domestic sewage. Rivers have become heavily polluted in recent years because of the intensification of industrial and agricultural activities and the increase in population density (Nakagawa et al. 2019; Shao et al. 2020). Industrial, domestic, and agricultural wastewater is discharged into rivers through surface runoff, atmospheric deposition, and land surface erosion (Qiu et al. 2017; Gurjar et al. 2019). These pollutants then undergo physicochemical and biological reactions in the river, so river water quality shows seasonal and regional characteristics. Therefore, regular monitoring and assessment of river water quality is essential for effective river management (Bo et al. 2009).

At present, river water quality evaluation methods include the single factor index (SFI) method, the Nemerow index method, artificial neural networks (ANN), and the comprehensive pollution index (CPI) method (Xu et al. 2022; Zhou et al. 2020; Kouadri et al. 2021). Among them, the single factor index gives a conservative evaluation result, but it can accurately identify the main pollution factors and water quality categories. The comprehensive pollution index builds on the single factor index and is an important method for evaluating the overall pollution status of a water body. Many studies have used the comprehensive pollution index method to evaluate the water quality of river basins. Zhang et al. (2021) found that 7% of the five rivers in Baihua Lake Basin were seriously polluted in three periods using the comprehensive pollution index method, and that CODCr, BOD5, NH3-N and TP were the main pollutants. Bai et al. (2020) found that the comprehensive pollution index of Baiyangdian fluctuated over the past 30 years (1988–2016) and that water quality was worst in 2015. Because river water quality changes in space and time, regular monitoring is necessary to evaluate it reliably. Such monitoring programs usually produce large data sets that are complex to interpret. Researchers have found that multivariate statistical techniques (MST) can effectively simplify the data and reveal the spatiotemporal characteristics of water quality. Varol et al. (2012) applied multivariate statistical techniques to evaluate the spatiotemporal variations and identify the main parameters affecting the water quality of dam reservoirs in the Tigris River Basin. Yang et al. (2020) applied cluster analysis (CA) and discriminant analysis (DA) to study the spatiotemporal variations in the Panzhihua section of the Yalong River and found that the main problem was a high content of organic pollutants. Therefore, multivariate statistical techniques are an effective means of evaluating water quality characteristics.

Qujiang River, also known as Quhe, is the largest tributary on the left bank of the Jialing River. It is the main source of urban domestic water and of industrial and agricultural water, especially along the main stream and the tributaries of the Zhouhe River Basin, where agricultural production is well developed. The water quality of the area affects both the water safety of riverside residents and the growth of crops along the river. At present, research on the Qujiang River Basin focuses mainly on rainfall and flood control, and few reports address changes in its water quality. Hence, this study evaluates the water quality status of the Qujiang River Basin using the single factor index and CPI, and analyzes the spatiotemporal characteristics of water quality and its pollution sources using multivariate statistical techniques (MST), aiming to provide data reference and theoretical support for ecosystem management and water environment protection in the Qujiang River Basin.

Materials and methods

Monitoring area

Qujiang River (Fig. 1) is the largest tributary on the left bank of the Jialing River (106°33′-107°16′E, 30°04′-31°03′N). It originates from Tiechuan Mountain in the Micang Mountains at the junction of Sichuan and Shaanxi, and the reach below its confluence with the Bahe at Sanhui Town is known as Qujiang. It is 723 km long, with a drainage area of 39,211 km2, accounting for approximately 26% of the Jialing River drainage area. The Qujiang River Basin lies mainly in northeastern Sichuan Province. It is not only a grain production area, but also an area rich in forest and timber resources. The average annual discharge of the Qujiang River is 730 m3/s.

Fig. 1 The distribution of monitoring sections of the Qujiang River

Sampling and chemical analysis

Water samples were collected monthly from 2015 to 2019 at three monitoring sections of Qujiang River (Tuanbaoling, Baita, and Sailong), and twelve parameters were selected on the basis of sampling continuity across all sections. The samples were analyzed for water temperature (WT), pH, dissolved oxygen (DO), permanganate index (CODMn), five-day biochemical oxygen demand (BOD5), ammonia nitrogen (NH3–N), chemical oxygen demand (COD), total phosphorus (TP), total nitrogen (TN), fluoride (F), fecal coliforms (F.coli), electrical conductivity (EC), and flow rate (Q). All water quality parameters are expressed in mg·L−1 except for WT (℃), pH (dimensionless), F.coli (N/L), EC (mS/m), and Q (m3/s). Table 1 shows the water parameters, their units, and the methods of analysis.

Table 1 The monitoring methods of twelve parameters and standard values for Environmental Quality Standard for Surface Water Class 3 (GB3838-2002, China)

Data treatment and analysis

Because the water quality parameters differ in magnitude and unit of measurement, the data were standardized to ensure the reliability of CA (Alberto et al. 2001; Singh et al. 2005). IBM SPSS 23.0 was used for all statistical calculations in this study, including the multivariate analysis of the river water quality data set by CA, PCA, and DA (Simeonov et al. 2003). In addition, the single factor index method and CPI were used to evaluate the main pollutants and the pollution degree of Qujiang River.
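The standardization step can be sketched as follows. This is a minimal illustration that assumes a z-score transform (zero mean, unit sample variance per parameter); the text does not state which standardization was applied in SPSS, and the sample values are hypothetical.

```python
import numpy as np

def zscore(data):
    """Standardize each column (parameter) to zero mean and unit sample variance."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    std = data.std(axis=0, ddof=1)  # sample standard deviation
    return (data - mean) / std

# Hypothetical rows of (pH, TN in mg/L, F.coli in N/L); not data from the study.
sample = np.array([
    [7.5, 1.2, 2000.0],
    [8.0, 2.4, 9000.0],
    [8.4, 1.8, 24000.0],
    [7.9, 3.0, 5000.0],
])
standardized = zscore(sample)
print(standardized.round(3))
```

After the transform, parameters measured on very different scales (pH versus F.coli counts) contribute comparably to distance-based methods such as CA.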

Data treatment and analytical methods

Cluster analysis

CA is a group of statistical techniques that divide research objects into relatively homogeneous groups. Objects are clustered according to their similarity, so that objects within the same class are highly similar and objects in different classes differ greatly (Vega et al. 1998). In this study, the squared Euclidean distance and Ward's minimum variance method are used to cluster the water quality parameters of Qujiang River, so as to obtain the temporal characteristics of its water quality.
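The clustering step can be illustrated with SciPy. This is a sketch under stated assumptions, not the SPSS procedure used in the study: the monthly values are synthetic, the season centers are invented for illustration, and SciPy's `ward` linkage works on Euclidean distances, so its merge heights are not directly comparable to the rescaled distances reported by SPSS.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical season centers for four standardized parameters (not study data).
centers = {
    "dry":   [-1.0, -1.0,  1.0, 0.0],
    "flat":  [ 0.5,  0.5, -0.5, 0.0],
    "flood": [ 3.0,  3.0, -2.0, 1.0],
}
# Assumed season of each calendar month, January to December.
season_of_month = ["dry"] * 4 + ["flat"] * 2 + ["flood"] + ["flat"] * 3 + ["dry"] * 2

# Synthetic monthly observations: season center plus small random noise.
rng = np.random.default_rng(0)
monthly = np.array([centers[s] for s in season_of_month])
monthly = monthly + rng.normal(0.0, 0.1, size=monthly.shape)

# Ward's minimum variance linkage on Euclidean distances between months.
tree = linkage(monthly, method="ward")

# Cut the tree so that at most three clusters remain.
labels = fcluster(tree, t=3, criterion="maxclust")

for month, label in enumerate(labels, start=1):
    print(month, label)
```

With well-separated synthetic seasons, cutting the tree at three clusters recovers the assumed grouping; real monitoring data would generally be noisier.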

Discriminant analysis

DA is a statistical method for classifying observation objects; here it is used to verify the clustering results and identify significant pollution indicators (Chen et al. 2017). Unlike CA, DA requires the observation objects to be divided into categories in advance, so that a discriminant function (DF) can be established from observations of known category (Wunderlin et al. 2001). The procedure determines the discrimination coefficients from a large amount of data on the research object, calculates the discrimination index, and judges which category a sample belongs to. In this study, CA first classifies the samples; DA then uses the DFs to distinguish the resulting groups and identify the important pollution parameters. The corresponding discriminant function is (Johnson and Wichern 1992):

$$f\left( G_{i} \right) = k_{i} + \sum\limits_{j = 1}^{n} w_{ij} p_{ij}$$
(1)

where i indexes the groups (G); n is the number of indicators participating in the discriminant analysis; wij is the discrimination coefficient of indicator j for group i; pij is the measured value of indicator j; f is the DF; and ki is the constant intrinsic to each group.
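To make Eq. (1) concrete, the sketch below evaluates one function per group and assigns a sample to the group with the highest score, which is the usual way classification functions are applied. The constants, coefficients, and sample values are hypothetical and are not taken from Table 5; the function names are illustrative.

```python
import numpy as np

def discriminant_scores(sample, constants, coefficients):
    """Return, for every group, its constant plus the weighted sum of indicator values."""
    sample = np.asarray(sample, dtype=float)
    constants = np.asarray(constants, dtype=float)
    coefficients = np.asarray(coefficients, dtype=float)  # shape: (groups, indicators)
    return constants + coefficients @ sample

def classify(sample, constants, coefficients):
    """Assign the sample to the group with the largest score."""
    scores = discriminant_scores(sample, constants, coefficients)
    return int(np.argmax(scores)), scores

# Hypothetical constants and coefficients for three groups and four indicators
# (pH, CODMn, BOD5, TP); not the values estimated in the study.
constants = [-10.0, -14.0, -12.0]
coefficients = [
    [2.0, 1.0, 1.0, 5.0],
    [2.5, 1.5, 0.5, 10.0],
    [2.2, 1.2, 1.5, 2.0],
]
sample = [8.0, 3.0, 2.0, 0.1]  # hypothetical measured values

group, scores = classify(sample, constants, coefficients)
print(group, scores)
```

Here the three scores are 11.5, 12.5, and 12.4, so the sample is assigned to the second group (index 1).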

Factor analysis/principal component analysis

Principal component analysis (PCA) is a multivariate statistical method that uses dimension reduction to recombine multiple correlated indicators into a few independent comprehensive indicators. It extracts the eigenvalues and eigenvectors from the covariance matrix of the original variables. The principal components (PCs) are the uncorrelated (orthogonal) variables obtained by multiplying the original correlated variables by the eigenvectors, each of which is a list of coefficients (loadings or weightings) (Vega et al. 1998; Helena et al. 2000). Factor analysis (FA) is a statistical technique that extracts common factors from groups of variables and further reduces the contribution of the less significant variables obtained from PCA. The new group of variables, named varifactors (VFs), is extracted by rotating the axes defined by PCA (Helena et al. 2000; Vega et al. 1998).
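The eigen-decomposition behind PCA can be sketched with NumPy. The example assumes standardized variables, for which the covariance matrix equals the correlation matrix, and retains components with eigenvalues above 1; the data are synthetic, and the rotation step used to obtain VFs is not included.

```python
import numpy as np

def pca_correlation(data):
    """PCA on standardized data via eigen-decomposition of the correlation matrix."""
    data = np.asarray(data, dtype=float)
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]             # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    scores = z @ eigvecs                          # uncorrelated component scores
    explained = eigvals / eigvals.sum()           # share of total variance
    return eigvals, loadings, scores, explained

# Synthetic data: two correlated pairs of variables plus one independent variable.
rng = np.random.default_rng(1)
base = rng.normal(size=(60, 2))
data = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.3 * rng.normal(size=60),
    base[:, 1],
    base[:, 1] + 0.3 * rng.normal(size=60),
    rng.normal(size=60),
])

eigvals, loadings, scores, explained = pca_correlation(data)
retained = int(np.sum(eigvals > 1.0))  # components with eigenvalue > 1
print(eigvals.round(3), retained, (100 * explained).round(2))
```

The eigenvalues sum to the number of variables, so an eigenvalue above 1 marks a component that explains more variance than any single standardized variable.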

Single factor evaluation and comprehensive pollution index

The single factor evaluation method compares the measured concentration of each pollution index with its evaluation standard (Eq. 2), and uses the category of the single index with the worst water quality to determine the overall water quality category of the water body. The comprehensive pollution index method takes the arithmetic mean of the single factor indices (Eq. 3). This method can not only judge the water pollution status of rivers, but also analyze the change trend of water quality. Table 2 shows the comprehensive pollution index and its corresponding water pollution degree.

$$P_{i} = C_{i} /S_{i}$$
(2)
$$P = \left( \sum\limits_{i = 1}^{n} P_{i} \right)/n$$
(3)

where Pi is the pollution index of water quality parameter i; Ci is the measured concentration of water quality parameter i; Si is the class 3 standard limit of water quality parameter i in the environmental quality standard for surface water (GB3838-2002); P is the comprehensive pollution index; n is the total number of indicators.
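Equations (2) and (3) can be worked through in a few lines. The sketch below covers only parameters for which a higher concentration is worse; the limits are assumed class 3 values for illustration (the authoritative values are those in Table 1), and the measured concentrations are hypothetical.

```python
def single_factor_index(concentration, limit):
    """Eq. (2): ratio of the measured concentration to the standard limit."""
    return concentration / limit

def comprehensive_pollution_index(concentrations, limits):
    """Eq. (3): arithmetic mean of the single factor indices."""
    indices = {
        name: single_factor_index(value, limits[name])
        for name, value in concentrations.items()
    }
    cpi = sum(indices.values()) / len(indices)
    return indices, cpi

# Assumed class 3 limits in mg/L, for illustration only.
limits = {"CODMn": 6.0, "NH3-N": 1.0, "TP": 0.2, "TN": 1.0}
# Hypothetical measured concentrations in mg/L; not data from the study.
measured = {"CODMn": 3.0, "NH3-N": 0.4, "TP": 0.1, "TN": 2.0}

indices, cpi = comprehensive_pollution_index(measured, limits)
exceeding = [name for name, p in indices.items() if p > 1.0]
print(indices, round(cpi, 2), exceeding)
```

In this example only TN has a single factor index above 1 (2.0), while the CPI is 0.85, which shows how averaging can mask one parameter that exceeds its limit.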

Table 2 Comprehensive pollution index and water pollution degree

Results

Classification of water quality parameters of Qujiang River

In view of the environmental functions and protection objectives of surface water, the water quality of Qujiang River is judged by whether it meets the class 3 value of the Environmental Quality Standards for Surface Water (GB3838-2002, China). Figure 2 shows that the water quality parameters of Qujiang River generally meet the class 3 value. However, the TN concentration (Fig. 2h) exceeds the class 3 value most of the time and at times even exceeds the class 5 value. Therefore, nitrogen nutrient content is the key index for water quality control in Qujiang River. The number of F.coli in class 3 surface water should be less than 10,000 N/L, whereas the number of F.coli in Qujiang River (Fig. 2j) is at times much higher than this limit. In January 2015, the number of F.coli reached a maximum of 92,000 N/L at the Tuanbaoling section, which is 9.2 times the class 3 value.

Fig. 2 Classification of water quality parameters of Qujiang River from 2015 to 2019 based on surface water environmental quality standard (GB3838-2002, China). Note: Class 3 of surface water is marked as a blue dotted line

Main pollutants and pollution degree of water quality in Qujiang River

Table 3 shows the single factor index and comprehensive pollution index of the water quality parameters at the three monitoring sections (Tuanbaoling, Baita, and Sailong) from 2015 to 2019. The CPI of the three monitoring sections varies from 0.62 to 1.06, and the water quality is generally characterized by slight pollution. However, the CPI of the Tuanbaoling section is more than 1, and its water quality shows heavy pollution. In addition, the single factor index of TN at all three monitoring sections and that of F.coli at Tuanbaoling are more than 1, indicating that these parameters exceed the specified water quality standard limit. The larger the single factor index, the further the parameter exceeds the standard and the more serious the pollution; the single factor index of TN is particularly large.

Table 3 Evaluation of pollution index of three monitoring sections

Time grouping of water quality parameters of Qujiang River

Through clustering, the dendrogram is divided into three groups at a squared Euclidean distance of ≥ 20 and < 25 according to the similarity of water quality characteristics (Fig. 3): Group 1 corresponds to the dry season and includes January–April and November–December; Group 2, consisting of July alone, corresponds to the flood season; and Group 3 (May–June and August–October) covers the flat season. The value of Wilks' lambda for the DF is small (0.000), the χ2 value is high (38.81), and the significance (0.003) is less than 0.05 (Table 4), indicating that the temporal DA is significant. Tables 5 and 6 present the DFs and classification matrices (CMs), respectively, obtained from the temporal DA. These tables show that stepwise DA requires only four water quality indicators to construct the DFs: pH, CODMn, BOD5, and TP are the most important parameters for distinguishing the temporal groups. Figures 4, 5, 6, and 7 show the temporal variations of these four parameters in Qujiang River.

Fig. 3 Time clustering dendrogram of water quality parameters of Qujiang River

Table 4 Wilks' lambda and chi-square values of DA of temporal variations
Table 5 The water quality parameters to distinguish the temporal groups and classification function coefficients for DA
Table 6 CM for DA of temporal variations
Fig. 4 The temporal variations of pH in Qujiang River

Fig. 5 The temporal variations of CODMn in Qujiang River

Fig. 6 The temporal variations of BOD5 in Qujiang River

Fig. 7 The temporal variations of TP in Qujiang River

Spatial distribution characteristics of water quality in Qujiang River

An independent-samples t-test was used to compare the spatial differences in water quality among the monitoring sections. The results show significant differences in pH, CODMn, NH3–N, TN, F.coli, and Q among the three monitoring sections during the five-year period (2015–2019) (Table 7).

Table 7 Comparison of spatial monitoring

The pH was weakly alkaline and varied slightly among Tuanbaoling, Baita, and Sailong, ranging from 7.11 to 8.74, 7.50 to 8.64, and 7.29 to 8.45, respectively. The CODMn contents ranged from 1.20 to 5.10 mg/L, 2.04 to 5.50 mg/L, and 1.88 to 4.30 mg/L, respectively. All these values meet the class 3 value of the environmental quality standards for surface water (GB3838-2002, China). The corresponding NH3-N contents ranged from 0.06 to 0.68 mg/L, 0.12 to 0.74 mg/L, and 0.11 to 0.38 mg/L, respectively. The TN values at Tuanbaoling, Baita, and Sailong ranged from 1.11 to 2.88 mg/L, 0.79 to 3.20 mg/L, and 0.83 to 2.68 mg/L, respectively. In addition, significant differences were observed between Tuanbaoling and Baita and between Tuanbaoling and Sailong. The TN content exceeds the class 5 value of the environmental quality standards for surface water. The F.coli values at Tuanbaoling, Sailong, and Baita ranged from 1115 to 92,000 N/L, 1700 to 24,000 N/L, and 490 to 24,000 N/L, respectively. Significant differences in F.coli and Q were found between Tuanbaoling and Baita and between Tuanbaoling and Sailong. In particular, Q at Tuanbaoling, Baita, and Sailong ranged from 58.80 to 1968.12 m3/s, 46.4 to 3750 m3/s, and 38.70 to 2530 m3/s, respectively.

Identification of source affecting water quality variations

The PC loadings are classified as strong, moderate, and weak, corresponding to absolute loading values of > 0.75, 0.75–0.50, and 0.50–0.30, respectively. The KMO value is 0.642 and the significance level is 0 in this study, indicating significant relationships among the variables and suitability for PCA. PCA of the three data sets yields four PCs each for the dry season and the flood season and five PCs for the flat season with eigenvalues > 1, explaining 58.23, 82.94, and 73.23% of the total variance, respectively (Table 8).
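The loading thresholds above translate into a small helper. This is an illustrative sketch: the label "negligible" for absolute loadings below 0.30 and the treatment of the boundary values are assumptions, and the example loadings are hypothetical rather than taken from Table 8.

```python
def classify_loading(loading):
    """Label a PC loading by its absolute value using the thresholds in the text."""
    magnitude = abs(loading)
    if magnitude > 0.75:
        return "strong"
    if magnitude >= 0.50:
        return "moderate"
    if magnitude >= 0.30:
        return "weak"
    return "negligible"  # assumed label for loadings below 0.30

# Hypothetical loadings of a few parameters on one component.
example = {"TP": 0.82, "WT": -0.61, "pH": 0.35, "EC": 0.10}
labels = {name: classify_loading(value) for name, value in example.items()}
print(labels)
```

The sign of a loading is kept separately from its strength, so a loading of -0.61 is reported as a moderate negative loading.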

Table 8 Loadings of water parameters on significant principal components for dry, flat, and flood season

Discussion

Water quality evaluation of Qujiang River

TN and F.coli are the water quality parameters with serious pollution in Qujiang River, especially TN (Fig. 2, Table 3). The nitrogen nutrient content at each sampling point in Qujiang River is high and has exceeded the class 5 value of the surface water standard for long periods, which shows that it is affected not only by natural temporal changes but also by long-term discharges from human activities (Barakat et al. 2016), especially domestic sewage, industrial wastewater, and wastewater from agricultural production (livestock and poultry breeding) (Wang et al. 2015). In addition, previous studies have confirmed that atmospheric nitrogen deposition cannot be ignored: it has become an important source of nitrogen load in water bodies and an important reason for the increase of nitrogen nutrient content in rivers (Liu et al. 2014). The number of F.coli at Tuanbaoling in 2019 exceeds the class 3 value (GB3838-2002, China), which is attributed to the discharge of local agricultural wastewater, especially the sewage generated by livestock and poultry breeding (Zhang et al. 2007; Zhang et al. 2019). The F.coli level is significantly higher at Tuanbaoling (12,947.00 N/L) than at Baita (7428.33 N/L) and Sailong (5076.33 N/L), and at Tuanbaoling it exceeds the class 3 value of the environmental quality standards for surface water. These results suggest that human and animal activities had a negative effect on water quality. Compared with Tuanbaoling and Sailong, the pollution in the Baita section is generally more serious. Thus, supervision and management in Baita should be strengthened. Moreover, the body responsible for supervising the section should optimize regional land use and improve the quality of the water environment (Zhang and Jiang 2020; Wang et al. 2020a, b).

Identification of pollution sources of Qujiang River

In the dry season, four PCs with eigenvalues greater than 1 (Kaiser Normalization) are obtained, which explain approximately 58.23% of the total variance of the data set (Table 8). The first component (PC1), which accounts for 18.73% of the total variance, has moderate positive loadings on DO, NH3-N, TN, and EC and a moderate negative loading on WT. This component reflects nutrient content and can be attributed to spring ploughing, during which a large amount of nitrogen fertilizer is applied to the soil and the residual fertilizer flows into the river with surface runoff (Guan et al. 2020; Zhang et al. 2019; Shrestha and Kazama 2007). EC reflects the salinity of the river, and the river conductivity is high in most periods. TN and NH3-N are important factors affecting the growth of algae in river water, NH3-N is also an index of toxicity and oxygen consumption, and DO varies inversely with water temperature (Kumar et al. 2022; Bharathi et al. 2022). PC2, which explains 16.05% of the variance, shows a strong positive loading on pH, a moderate positive loading on CODMn, and a moderate negative loading on F.coli. This is due to non-point source pollution caused by human activities, especially the discharge of industrial and agricultural wastewater (Zeinalzadeh and Rezaei 2017). PC3 explains 13.13% of the total variance and has moderate positive loadings on BOD5, TP, and F. These organic pollution and nutrient variables may be affected by industrial and domestic source emissions (Varol et al. 2020). PC4, which accounts for 10.32% of the total variance, has a moderate positive loading on Q. PC1, PC3, and PC4 represent nutrient and organic pollution, which indicates a mixed source of contamination comprising natural processes and anthropogenic inputs, including the discharge of domestic sewage and industrial wastewater, especially phosphorus-containing wastewater, which might increase the risk of eutrophication (Qian et al. 2021; Zhang et al. 2017).

In the flood season, four significant PCs are obtained. PC1, which accounts for 39.82% of the total variance, has strong positive loadings on TP, TN, and F (loading > 0.75), moderate positive loadings on BOD5, NH3-N, and Q, and a strong negative loading on WT. It reflects the degree of eutrophication and organic pollution of the river, suggesting anthropogenic pollution from industrial and agricultural sources (Zhao et al. 2012). At the same time, higher temperature promotes microbial activity in the water in this period; the microbes mineralize the organic nitrogen and phosphorus in the sediment and release them into the overlying water as dissolved inorganic nitrogen and phosphorus (Jiang et al. 2008; ** et al. 2005). PC2 explains 18.37% of the total variance and has a strong positive loading on DO, a strong negative loading on pH, and a moderate negative loading on F.coli. This pattern is mainly due to point source pollution from the discharge of industrial and agricultural wastewater (Wang et al. 2012; Qian et al. 2021). PC3, explaining 15.85% of the total variance, has a strong positive loading on EC and a moderate negative loading on CODMn. PC4 explains 8.90% of the total variance, the lowest of the four. PC3 and PC4 show that natural processes and anthropogenic inputs caused mixed pollution of the water environment.

In the flat season, five PCs explain 73.23% of the total variance (Table 8). PC1, accounting for 23.87% of the total variance, shows a strong positive loading on TP, moderate positive loadings on NH3-N and Q, and a moderate negative loading on WT. PC2 explains 18.33% of the total variance, with moderate positive loadings on BOD5, F, and EC and a moderate negative loading on CODMn. PC1 and PC2 represent organic pollution in this period, which is affected mainly by discharges from people’s daily activities and livestock farming (Zhang et al. 2017; Liu et al. 2020). PC3 (12.45% of the total variance) has moderate positive loadings on pH, DO, and F.coli. PC4, which accounts for 9.72% of the total variance, has a moderate positive loading on pH. PC3 and PC4 mainly represent industrial pollution along the Qujiang River Basin. PC5, accounting for 8.86% of the total variance, shows a positive loading on WT. Overall, the results imply that the main problem in the Qujiang River is the high content of organic pollutants and nitrogen nutrients.

Conclusion

The results show that the water quality of Qujiang River generally meets the class 3 value of the environmental quality standards for surface water (GB3838-2002, China). The CPI varies from 0.62 to 1.06 at the three monitoring sections, and the water quality is characterized by slight pollution. The months can be divided into three groups on the basis of similarities in water quality characteristics: Group 1 (dry season), which includes January–April and November–December; Group 2 (flood season), which is July; and Group 3 (flat season), which consists of May–June and August–October. PCA identified four principal components (PCs) each for the dry season and the flood season and five PCs for the flat season, explaining 58.23, 82.94, and 73.23% of the total variance, respectively. Moreover, pollution is more serious in the Baita section than in the Tuanbaoling and Sailong sections, and the results suggest that the main pollution problem in the Qujiang River is the high content of organic pollutants and nitrogen nutrients. Hence, monitoring and protection need to be strengthened in the Baita section of Qujiang River. Nevertheless, the Qujiang River selected for this study belongs to the Jialing River system, which flows mainly through Sichuan Province. Most of the terrain in the watershed is basin terrain, so the findings are constrained by the local topography and are not universally applicable. Therefore, future research should cover more types of rivers and compare the spatial and temporal variation of water quality across different regions, so as to provide an important reference for the management and protection of water resources in China and worldwide.