Abstract
There is no life without water: all humans, plants, and animals need it to live. It is therefore important to know whether drinking water will be sufficient for everyone now and in the future. Access to clean water and hygiene is a basic human right and a part of health and safety policy. At the national, state, and local levels, clean water is a critical issue for health and development. The primary goal of this work is to predict water quality using various modeling techniques based on machine learning, deep learning, and ensemble learning, with hyperparameter tuning of each algorithm. We used SVM, RF, XGBoost, DT, and LGBM models, together with stacking and voting ensembles, for efficient and fast prediction. pH, chloramines, hardness, solids, sulfate, organic carbon, conductivity, trihalomethanes, turbidity, and potability were the parameters used as the feature vector. Different machine learning, deep learning, and ensemble learning algorithms were used to predict water quality, and the results are compared on accuracy, ROC AUC, precision, recall, F1-score, MCC, and kappa score. In addition, the Friedman ranking is used to evaluate the models' efficiency. According to related studies, ensemble learning-based models are the most effective.
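The voting and stacking ensembles and the evaluation metrics named in the abstract can be illustrated with a minimal scikit-learn sketch. This is not the authors' pipeline. The data are synthetic (nine numeric features standing in for the water-quality measurements, with a binary label standing in for potability), `GradientBoostingClassifier` is swapped in for XGBoost and LGBM so that only scikit-learn is needed, hyperparameter tuning is omitted, and the names `base_learners` and `evaluate` are hypothetical helpers.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, cohen_kappa_score, f1_score,
                             matthews_corrcoef, precision_score, recall_score,
                             roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in for nine water-quality features and a
# binary potability label; the real dataset is not used here.
X, y = make_classification(n_samples=600, n_features=9, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)


def base_learners():
    """Return fresh (name, estimator) pairs for the ensembles."""
    return [
        # SVM is scale-sensitive, so it gets its own standardisation step.
        ("svm", make_pipeline(StandardScaler(),
                              SVC(probability=True, random_state=0))),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("dt", DecisionTreeClassifier(max_depth=5, random_state=0)),
        # Assumption: stand-in for the XGBoost / LGBM boosters.
        ("gb", GradientBoostingClassifier(random_state=0)),
    ]


models = {
    # Soft voting averages the predicted class probabilities.
    "voting": VotingClassifier(estimators=base_learners(), voting="soft"),
    # Stacking trains a meta-learner on cross-validated base predictions.
    "stacking": StackingClassifier(
        estimators=base_learners(),
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    ),
}


def evaluate(model):
    """Fit on the training split and score on the held-out split."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]
    return {
        "accuracy": accuracy_score(y_test, pred),
        "roc_auc": roc_auc_score(y_test, proba),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
        "mcc": matthews_corrcoef(y_test, pred),
        "kappa": cohen_kappa_score(y_test, pred),
    }


results = {name: evaluate(model) for name, model in models.items()}
for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Soft voting is used so that ROC AUC can be computed from averaged probabilities; hard voting would only yield class labels. The scores printed on synthetic data say nothing about the results reported in the paper.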
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Kumari, M., Singh, S.K. (2024). Water Quality Classification Using Machine Learning Techniques. In: Shaw, R.N., Siano, P., Makhilef, S., Ghosh, A., Shimi, S.L. (eds) Innovations in Electrical and Electronic Engineering. ICEEE 2023. Lecture Notes in Electrical Engineering, vol 1115. Springer, Singapore. https://doi.org/10.1007/978-981-99-8661-3_15
DOI: https://doi.org/10.1007/978-981-99-8661-3_15
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8660-6
Online ISBN: 978-981-99-8661-3
eBook Packages: Energy (R0)