Towards Guidelines for Assessing Qualities of Machine Learning Systems

  • Conference paper
Quality of Information and Communications Technology (QUATIC 2020)

Abstract

Systems containing components based on machine learning (ML) methods are becoming more widespread. To ensure the intended behavior of a software system, standards such as ISO/IEC 25010 define the necessary quality aspects of the system and its components. Because ML components differ in nature from conventional software, we have to adjust existing quality aspects or add new ones (such as trustworthiness), state precisely which aspect is relevant for which object of interest (such as completeness of training data), and define how to objectively assess adherence to quality requirements. In this article, we present the construction of a quality model (i.e., evaluation objects, quality aspects, and metrics) for an ML system based on an industrial use case. This quality model enables practitioners to specify and objectively assess quality requirements for ML systems of this kind. In the future, we want to learn how the notion of quality differs between types of ML systems and derive general guidelines for specifying and assessing the qualities of ML systems.
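To make the three parts of such a quality model concrete, the following is a minimal sketch in Python, not the model constructed in the paper. It pairs an evaluation object (training data) with a quality aspect (completeness) and a metric with a threshold. The names `QualityRequirement`, `completeness`, the sample rows, and the threshold of 0.7 are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


def completeness(rows: Sequence[dict], fields: Sequence[str]) -> float:
    """Share of expected field values that are present (not None).

    Illustrative metric only; the paper's actual metrics are not shown here.
    """
    total = len(rows) * len(fields)
    if total == 0:
        return 0.0
    present = sum(1 for row in rows for f in fields if row.get(f) is not None)
    return present / total


@dataclass(frozen=True)
class QualityRequirement:
    """One hypothetical entry of a quality model."""
    evaluation_object: str          # what is assessed, e.g. "training data"
    quality_aspect: str             # which quality, e.g. "completeness"
    metric: Callable[[], float]     # how the quality is measured
    threshold: float                # minimum acceptable metric value

    def assess(self) -> Tuple[float, bool]:
        """Return the measured value and whether it meets the threshold."""
        value = self.metric()
        return value, value >= self.threshold


# Hypothetical training data with one missing value.
rows = [{"a": 1, "b": None}, {"a": 2, "b": 3}]

requirement = QualityRequirement(
    evaluation_object="training data",
    quality_aspect="completeness",
    metric=lambda: completeness(rows, ["a", "b"]),
    threshold=0.7,  # assumed value for illustration
)

value, fulfilled = requirement.assess()
```

Keeping the metric as a callable separates what is required (object, aspect, threshold) from how it is measured, so the same requirement structure could be reused with other metrics.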



Author information

Corresponding author

Correspondence to Julien Siebert.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Siebert, J. et al. (2020). Towards Guidelines for Assessing Qualities of Machine Learning Systems. In: Shepperd, M., Brito e Abreu, F., Rodrigues da Silva, A., Pérez-Castillo, R. (eds) Quality of Information and Communications Technology. QUATIC 2020. Communications in Computer and Information Science, vol 1266. Springer, Cham. https://doi.org/10.1007/978-3-030-58793-2_2

  • DOI: https://doi.org/10.1007/978-3-030-58793-2_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58792-5

  • Online ISBN: 978-3-030-58793-2

  • eBook Packages: Computer Science, Computer Science (R0)
