Abstract
This paper explores how to develop machine learning regression models that are more explainable and transparent for the end-user. Explainable regression models can be created by rank-ordering the features of a regression model according to how much each contributes to predictive accuracy. In addition, fitting graphs can be generated that show how adding each feature to a regression model incrementally improves predictive accuracy. These information graphics are especially useful for understanding the tradeoff between model complexity and model performance when selecting a model. These methods are illustrated with two examples: a multiple regression model, which is a straightforward application of machine learning regression, and a more complex polynomial regression model that captures higher-order terms and interactions among all variables in the model.
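The rank-ordering idea in the abstract can be sketched as a greedy forward selection: at each step, add the feature that most lowers cross-validated RMSE, and record the RMSE after each addition. The recorded sequence is the kind of data a fitting graph would plot. This is a minimal sketch under stated assumptions, not the paper's implementation: the data are synthetic (not the Ames housing data), the model is ordinary least squares fitted with NumPy, and the function names `cv_rmse` and `forward_rank` are hypothetical.

```python
import numpy as np

def cv_rmse(X, y, n_splits=5, n_repeats=3, seed=0):
    """Repeated k-fold cross-validated RMSE of an ordinary least squares fit."""
    rng = np.random.default_rng(seed)  # fixed seed: every candidate sees the same folds
    n = len(y)
    scores = []
    for _ in range(n_repeats):
        folds = np.array_split(rng.permutation(n), n_splits)
        for test_idx in folds:
            train_mask = np.ones(n, dtype=bool)
            train_mask[test_idx] = False
            # Add an intercept column, then fit on the training rows only.
            A_train = np.column_stack([np.ones(train_mask.sum()), X[train_mask]])
            A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])
            coef, *_ = np.linalg.lstsq(A_train, y[train_mask], rcond=None)
            resid = y[test_idx] - A_test @ coef
            scores.append(np.sqrt(np.mean(resid ** 2)))
    return float(np.mean(scores))

def forward_rank(X, y, names):
    """Greedy forward selection: add the feature that lowers CV RMSE the most."""
    remaining = list(range(X.shape[1]))
    chosen, path = [], []
    while remaining:
        best = min(remaining, key=lambda j: cv_rmse(X[:, chosen + [j]], y))
        chosen.append(best)
        remaining.remove(best)
        path.append((names[best], cv_rmse(X[:, chosen], y)))
    return path

# Synthetic data (an assumption for illustration): three informative
# features of decreasing strength plus one pure-noise feature.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 4))
y = 5.0 * X[:, 0] + 2.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=1.0, size=300)
names = ["x0", "x1", "x2", "noise"]

path = forward_rank(X, y, names)
for step, (name, rmse) in enumerate(path, start=1):
    print(f"{step}: +{name:<5} CV RMSE = {rmse:.3f}")
```

Plotting the RMSE values in `path` against the step number would give a simple fitting graph: the curve drops steeply while informative features are added and flattens once additional features stop improving accuracy, which is where the complexity versus performance tradeoff becomes visible.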
Notes
1. The Ames housing dataset, with only 1460 observations, was small enough to perform repeated k-fold cross-validation (CV). For larger datasets, the computational cost of performing repeated k-fold CV might outweigh the benefit of obtaining more accurate estimates of RMSE.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Nakatsu, R.T. (2024). Explainable AI in Machine Learning Regression: Creating Transparency of a Regression Model. In: Nah, F.FH., Siau, K.L. (eds) HCI in Business, Government and Organizations. HCII 2024. Lecture Notes in Computer Science, vol 14720. Springer, Cham. https://doi.org/10.1007/978-3-031-61315-9_16
DOI: https://doi.org/10.1007/978-3-031-61315-9_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-61314-2
Online ISBN: 978-3-031-61315-9
eBook Packages: Computer Science (R0)