Water Quality Classification Using Machine Learning Techniques

  • Conference paper
Innovations in Electrical and Electronic Engineering (ICEEE 2023)

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 1115)


Abstract

There is no life without water. All humans, plants, and animals need water to live. It is important to know whether drinking water, a resource essential to human life, will be sufficient for everyone now and in the future. Access to clean water and hygiene is an important human right and part of health and safety policy. At the national, state, and local levels, clean water is a critical issue for health and development. The primary goal of this work is to assess water quality using modeling techniques based on machine learning, deep learning, and ensemble learning, with hyperparameter tuning of each algorithm. We used stacking and voting ensembles of support vector machine (SVM), random forest (RF), XGBoost, decision tree (DT), and LightGBM (LGBM) models for efficient and fast prediction. pH, chloramines, hardness, solids, sulfate, organic carbon, conductivity, trihalomethanes, turbidity, and potability were the parameters used as the feature vector. Different machine learning, deep learning, and ensemble learning algorithms were used to evaluate water quality prediction, and the results are compared in terms of accuracy, ROC AUC, precision, recall, F1-score, Matthews correlation coefficient (MCC), and kappa score. In addition, the Friedman ranking is used to evaluate the models' efficiency. According to related studies, ensemble learning-based models are the most effective.
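
To make two of the less familiar evaluation measures named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It computes MCC and Cohen's kappa from a binary confusion matrix and derives Friedman-style mean ranks across metrics. The labels, model names, and scores are made-up illustrative values (assumptions), not results from the paper, and the helper function names are hypothetical.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 means potable."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def cohen_kappa(tp, tn, fp, fn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + tn + fp + fn
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0


def friedman_mean_ranks(scores):
    """Mean rank per model over all metrics (rank 1 = best, higher score wins).

    scores maps a model name to a list of scores, one per metric.
    Tied models share the average of the ranks they occupy.
    """
    models = list(scores)
    n_metrics = len(next(iter(scores.values())))
    totals = {m: 0.0 for m in models}
    for j in range(n_metrics):
        ordered = sorted(models, key=lambda m: -scores[m][j])
        i = 0
        while i < len(ordered):
            k = i
            while k + 1 < len(ordered) and scores[ordered[k + 1]][j] == scores[ordered[i]][j]:
                k += 1
            shared_rank = (i + k) / 2 + 1  # average of 1-based positions i+1..k+1
            for m in ordered[i:k + 1]:
                totals[m] += shared_rank
            i = k + 1
    return {m: totals[m] / n_metrics for m in models}


# Illustrative data only (assumed, not taken from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
mcc_value = mcc(*counts)
kappa_value = cohen_kappa(*counts)

example_scores = {
    "stacking": [0.70, 0.72, 0.40],
    "voting": [0.69, 0.72, 0.38],
    "single_tree": [0.60, 0.61, 0.20],
}
ranks = friedman_mean_ranks(example_scores)
print(mcc_value, kappa_value, ranks)
```

A lower mean rank indicates a model that is more consistently among the best across the compared metrics. In practice, library routines such as scikit-learn's `matthews_corrcoef` and `cohen_kappa_score` could replace the hand-written metric functions.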




Author information

Corresponding author

Correspondence to Sunil Kumar Singh.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Kumari, M., Singh, S.K. (2024). Water Quality Classification Using Machine Learning Techniques. In: Shaw, R.N., Siano, P., Makhilef, S., Ghosh, A., Shimi, S.L. (eds) Innovations in Electrical and Electronic Engineering. ICEEE 2023. Lecture Notes in Electrical Engineering, vol 1115. Springer, Singapore. https://doi.org/10.1007/978-981-99-8661-3_15

  • DOI: https://doi.org/10.1007/978-981-99-8661-3_15

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-8660-6

  • Online ISBN: 978-981-99-8661-3

  • eBook Packages: Energy, Energy (R0)
