The Applications of Machine Learning in Accounting and Auditing Research

  • Living reference work entry
Encyclopedia of Finance

Abstract

The term “machine learning” has become a buzzword in the past few years. In the accounting and auditing area, this technology is already used by major accounting firms such as the Big 4, but research on it is still evolving. Increased use of machine learning and other artificial intelligence techniques will allow accountants to focus on providing better decision support instead of on data gathering and manual analyses. This entry introduces machine learning in comparison with traditional statistical modeling, discusses its current applications in accounting and auditing research, and provides directions for future research.

Notes

  1. The data source is https://www.kaggle.com/dgomonov/new-york-city-airbnb-open-data. In this application case, explanatory variables such as room type, geographical availability, and the number of reviews per month serve as the independent variables for predicting prices.

  2. The data source is http://yann.lecun.com/exdb/mnist/

  3. One example in this domain is at https://www.kaggle.com/vjchoudhary7/customer-segmentation-tutorial-in-python. The task for this data set is to segment customers by their behavior-related attributes using the K-means clustering algorithm.

  4. A bootstrap replicate is obtained by randomly sampling the training set with replacement. This operation generates a new training set of the same size as the original one.

  5. The statistical problem arises when the training set provides too little information to select a single learner, because several distinct learners achieve the same accuracy on the training set.

  6. Searching for the best hypothesis (e.g., a neural network) that fits the training data may be computationally intractable.

  7. The approximations to the real target function that single learners generate may not be ideal.

  8. “Complexity” means that the decision tree model generates a plethora of rules, resulting in overfitting.

  9. For details of the cross-validation method, see https://scikit-learn.org/stable/modules/cross_validation.html

  10. More details are available at https://www.tensorflow.org/

  11. Shapley, Lloyd S. (August 21, 1951). “Notes on the n-Person Game -- II: The Value of an n-Person Game.” Santa Monica, Calif.: RAND Corporation.

  12. “Cooperative game theory assumes that groups of players, called coalitions, are the primary units of decision-making, and may enforce cooperative behavior” (Choudhary 2019). See https://www.analyticsvidhya.com/blog/2019/11/shapley-value-machine-learning-interpretability-game-theory/

  13. Local surrogate models are interpretable models that are used to explain individual predictions of black box machine learning models.
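
The bootstrap replicate described in note 4 can be illustrated with a minimal Python sketch. The function name `bootstrap_replicate` and the toy training set are hypothetical and chosen only for illustration; they do not come from the entry itself.

```python
import random


def bootstrap_replicate(training_set, rng=None):
    """Draw len(training_set) items from training_set with replacement."""
    rng = rng or random.Random()
    n = len(training_set)
    # Each draw picks any index with equal probability, so an observation
    # may appear several times in the replicate, or not at all.
    return [training_set[rng.randrange(n)] for _ in range(n)]


# Hypothetical toy training set of (feature, label) pairs.
training_set = [(0.1, 0), (0.4, 0), (0.5, 1), (0.9, 1), (1.3, 1)]
replicate = bootstrap_replicate(training_set, random.Random(0))

# The replicate has the same size as the original training set.
print(len(replicate) == len(training_set))
```

In a bagging setting, one would typically draw many such replicates and fit one learner to each; the seeded `random.Random(0)` is used here only to make the sketch repeatable.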

References

  • Adadi, A., and M. Berrada. 2018. Peeking inside the black-box: A survey on explainable artificial intelligence (XAI). IEEE Access 6: 52138–52160.

  • Agrawal, R., T. Imieliński, and A. Swami. 1993. Mining association rules between sets of items in large databases. Proceedings of the 1993 ACM SIGMOD international conference on Management of data, pp 207–216.

  • Alpaydin, E. 2020. Introduction to machine learning. Cambridge, MA: MIT Press.

  • Altman, E.I. 1968. Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. The Journal of Finance 23: 589–609.

  • Analyticsai, C. 2020. AnalyticsAI for every engagement [Online]. Available: https://www.caseware.com/us/analyticsai.

  • Anand, V., R. Brunner, K. Ikegwu, and T. Sougiannis. 2019. Predicting profitability using machine learning. Available at SSRN 3466478.

  • Anthony, M., and P.L. Bartlett. 2009. Neural network learning: Theoretical foundations. Cambridge: Cambridge University Press.

  • Apley, D.W. 2016. Visualizing the effects of predictor variables in black box supervised learning models. arXiv preprint arXiv:1612.08468.

  • Apley, D.W., and J. Zhu. 2020. Visualizing the effects of predictor variables in black box supervised learning models. Journal of the Royal Statistical Society, Series B: Statistical Methodology 82 (4): 1059–1086.

  • Bao, Y., and A. Datta. 2014. Simultaneously discovering and quantifying risk types from textual risk disclosures. Management Science 60: 1371–1391.

  • Bao, Y., B. Ke, B. Li, Y.J. Yu, and J. Zhang. 2020. Detecting accounting fraud in publicly traded US firms using a machine learning approach. Journal of Accounting Research 58 (1): 199–235.

  • Barboza, F., H. Kimura, and E. Altman. 2017. Machine learning models and bankruptcy prediction. Expert Systems with Applications 83: 405–417.

  • Barth, M.E., K. Li, and C. McClure. 2021. Evolution in value relevance of accounting information. Stanford University Graduate School of Business Research Paper No. 17-24. Available at SSRN: https://ssrn.com/abstract=2933197 or https://doi.org/10.2139/ssrn.2933197.

  • Beneish, M.D. 1999. The detection of earnings manipulation. Financial Analysts Journal 55: 24–36.

  • Bertomeu, J. 2020. Machine learning improves accounting: Discussion, implementation and research opportunities. Review of Accounting Studies 25: 1135–1155.

  • Bertomeu, J., E. Cheynel, E. Floyd, and W. Pan. 2020. Using machine learning to detect misstatements. Review of Accounting Studies 26: 1–52.

  • Bishop, C.M. 2006. Pattern recognition and machine learning. Springer.

  • Breiman, L. 1996. Bagging predictors. Machine Learning 24: 123–140.

  • Brown, N.C., R.M. Crowley, and W.B. Elliott. 2020. What are you saying? Using topic to detect financial misreporting. Journal of Accounting Research 58 (1): 237–291.

  • Brown-Liburd, H., A. Cheong, M.A. Vasarhelyi, and X. Wang. 2019. Measuring with exogenous data (MED), and government economic monitoring (GEM). Journal of Emerging Technologies in Accounting 16 (1): 1–19.

  • Bzdok, D., N. Altman, and M. Krzywinski. 2018. Points of significance: Statistics versus machine learning. Nature Methods 15 (4): 233–234. https://www.nature.com/articles/nmeth.4642.pdf?origin=ppub.

  • Carton, R.B., and C.W. Hofer. 2006. Measuring organizational performance: Metrics for entrepreneurship and strategic management research. Edward Elgar Publishing.

  • Cecchini, M., H. Aytug, G.J. Koehler, and P. Pathak. 2010a. Making words work: Using financial text as a predictor of financial events. Decision Support Systems 50: 164–175.

  • Cecchini, M., H. Aytug, G.J. Koehler, and P. Pathak. 2010b. Detecting management fraud in public companies. Management Science 56 (7): 1146–1160. https://doi.org/10.1287/mnsc.1100.1174.

  • Chen, M.-S., J. Han, and P.S. Yu. 1996. Data mining: An overview from a database perspective. IEEE Transactions on Knowledge and Data Engineering 8: 866–883.

  • Cho, S., M.A. Vasarhelyi, T. Sun, and C. Zhang. 2020. Learning from machine learning in accounting and assurance. Journal of Emerging Technologies in Accounting.

  • Chollet, F. 2017. Deep learning with Python. Shelter Island: Manning Publications Company.

  • Choudhary, A. 2019. A unique method for machine learning interpretability: Game theory & Shapley values. Analytics Vidhya. https://www.analyticsvidhya.com/blog/2019/11/shapley-value-machine-learning-interpretability-game-theory/.

  • Dechow, P.M., and I.D. Dichev. 2002. The quality of accruals and earnings: The role of accrual estimation errors. The Accounting Review 77: 35–59.

  • Dechow, P.M., W. Ge, C.R. Larson, and R.G. Sloan. 2011. Predicting material accounting misstatements. Contemporary Accounting Research 28 (1): 17–82.

  • Dietterich, T.G. 2002. Ensemble learning. In The handbook of brain theory and neural networks, vol. 2, 110–125. Cambridge, MA: MIT Press.

  • Ding, K., B. Lev, X. Peng, T. Sun, and M.A. Vasarhelyi. 2020. Machine learning improves accounting estimates: Evidence from insurance payments. Available at SSRN 3253220.

  • Expert.ai. 2020. What is machine learning? A definition. https://www.expert.ai/blog/machine-learning-definition/

  • Foote, K.D. 2019. A brief history of machine learning. Data Topics. Dataversity. https://www.dataversity.net/a-brief-history-of-machine-learning/

  • Frankel, R., J. Jennings, and J. Lee. 2016. Using unstructured and qualitative disclosures to explain accruals. Journal of Accounting and Economics 62: 209–227.

  • Ghahramani, Z. 2015. Probabilistic machine learning and artificial intelligence. Nature 521 (7553): 452–459. https://www.repository.cam.ac.uk/bitstream/handle/1810/248538/Ghahramani%202015%20Nature.pdf;jsessionid=3DB2D31FFA80196A97AEEBECB06FEF42?sequence=1.

  • Goel, S., J. Gangolly, S.R. Faerman, and O. Uzuner. 2010. Can linguistic predictors detect fraudulent financial filings? Journal of Emerging Technologies in Accounting 7: 25–46.

  • Hammond, K. 2016. 5 unexpected sources of bias in artificial intelligence. Available at: https://techcrunch.com/2016/12/10/5-unexpected-sources-of-bias-in-artificial-intelligence/

  • Healthcare.ai. 2020. Machine learning versus statistics: When to use each. Data Science Blog. https://healthcare.ai/machine-learning-versus-statistics-use/

  • Hebb, D.O. 1949. The organization of behavior: A neuropsychological theory. New York, London: J. Wiley, Chapman & Hall. http://s-f-walker.org.uk/pubsebooks/pdfs/The_Organization_of_Behavior-Donald_O._Hebb.pdf.

  • Heller, M. 2019. Machine learning algorithms explained [Online]. Available: https://www.infoworld.com/article/3394399/machine-learning-algorithms-explained.html.

  • Huang, X.S., and L. Sun. 2017. Managerial ability and real earnings management. Advances in Accounting 39: 91–104.

  • Huang, A.H., A.Y. Zang, and R. Zheng. 2014. Evidence on the information content of text in analyst reports. The Accounting Review 89: 2151–2180.

  • Hu, H., T. Sun, M.A. Vasarhelyi, and M. Zhang. 2020. A machine learning approach of measuring audit quality: Evidence from China. Available at SSRN 3732563.

  • Huang, A.H., R. Lehavy, A.Y. Zang, and R. Zheng. 2018. Analyst information discovery and interpretation roles: A topic modeling approach. Management Science 64: 2833–2855.

  • Hunt, J.O., D.M. Rosser, and S.P. Rowe. 2021. Using machine learning to predict auditor switches: How the likelihood of switching affects audit quality among non-switching clients. Journal of Accounting and Public Policy 40 (5): 106785.

  • Khalid, S., T. Khalil, and S. Nasreen. 2014. A survey of feature selection and feature extraction techniques in machine learning. 2014 Science and information conference. IEEE, 372–378.

  • Kim, H.S., and S.Y. Sohn. 2010. Support vector machines for default prediction of SMEs based on technology credit. European Journal of Operational Research 201: 838–846.

  • Kober, J., J.A. Bagnell, and J. Peters. 2013. Reinforcement learning in robotics: A survey. The International Journal of Robotics Research 32: 1238–1274.

  • LeCun, Y., Y. Bengio, and G. Hinton. 2015. Deep learning. Nature 521: 436–444.

  • Lefkowitz, M. 2019. Professor’s perceptron paved the way for AI: 60 years too soon. Cornell Chronicle. https://news.cornell.edu/stories/2019/09/professors-perceptron-paved-way-ai-60-years-too-soon

  • Li, F. 2010. The information content of forward-looking statements in corporate filings – A naïve Bayesian machine learning approach. Journal of Accounting Research 48: 1049–1102.

  • Odom, M.D., and R. Sharda. 1990. A neural network model for bankruptcy prediction. 1990 IJCNN International Joint Conference on neural networks. IEEE, 163–168.

  • Ohlson, J.A. 1980. Financial ratios and the probabilistic prediction of bankruptcy. Journal of Accounting Research 18: 109–131.

  • Olson, D.L., D. Delen, and Y. Meng. 2012. Comparative analysis of data mining methods for bankruptcy prediction. Decision Support Systems 52: 464–473.

  • Pedregosa, F., G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, and V. Dubourg. 2011. Scikit-learn: Machine learning in Python. The Journal of Machine Learning Research 12: 2825–2830.

  • Perols, J. 2011. Financial statement fraud detection: An analysis of statistical and machine learning algorithms. Auditing: A Journal of Practice & Theory 30 (2): 19–50.

  • Perols, J.L., R.M. Bowen, C. Zimmermann, and B. Samba. 2017. Finding needles in a haystack: Using data analytics to improve fraud prediction. The Accounting Review 92 (2): 221–245.

  • Platt, H.D., M.B. Platt, and J.G. Pedersen. 1994. Bankruptcy discrimination with real variables. Journal of Business Finance & Accounting 21: 491–510.

  • Provalis Research. 2017. Blogs on Text Analytics: A Brief History of Machine Learning. https://provalisresearch.com/blog/brief-historymachine-learning/.

  • Purda, L., and D. Skillicorn. 2015. Accounting variables, deception, and a bag of words: Assessing the tools of fraud detection. Contemporary Accounting Research 32: 1193–1223.

  • Rosenblatt, F. 1957. The perceptron: A perceiving and recognizing automaton (Project Para). https://blogs.umass.edu/brain-wars/files/2016/03/rosenblatt-1957.pdf

  • Roth, A.E., ed. 1988. The Shapley value: Essays in honor of Lloyd S. Shapley. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511528446. ISBN 0-521-36177-X.

  • Sallab, A.E., M. Abdou, E. Perot, and S. Yogamani. 2017. Deep reinforcement learning framework for autonomous driving. Electronic Imaging 2017: 70–76.

  • Shalev-Shwartz, S., and S. Ben-David. 2014. Understanding machine learning: From theory to algorithms. Cambridge: Cambridge University Press.

  • Shaw, R. 2017. Top 10 machine learning algorithms for beginners [Online]. KDnuggets. Available: https://www.kdnuggets.com/2017/10/top-10-machine-learning-algorithms-beginners.html.

  • Shin, K.-S., T.S. Lee, and H.-J. Kim. 2005. An application of support vector machines in bankruptcy prediction model. Expert Systems with Applications 28: 127–135.

  • Sidhu, H. 2019. How audit digitization reflects a transformative age. Available at: https://www.ey.com/en_gl/digital-audit/auditdigitization-transformative-age

  • Sun, T. 2019. Applying deep learning to audit procedures: An illustrative framework. Accounting Horizons 33 (3): 89–109.

  • Sutton, R.S., and A.G. Barto. 2018. Reinforcement learning: An introduction. Cambridge, MA: MIT Press.

  • Tsai, C.-F., Y.-F. Hsu, and D.C. Yen. 2014. A comparative study of classifier ensembles for bankruptcy prediction. Applied Soft Computing 24: 977–984.

  • Van Den Bogaerd, M., and W. Aerts. 2011. Applying machine learning in accounting research. Expert Systems with Applications 38: 13414–13424.

  • Van Der Maaten, L., E. Postma, and J. Van Den Herik. 2009. Dimensionality reduction: A comparative review. Journal of Machine Learning Research 10: 13.

  • Wiederhold, G., J. McCarthy, and E. Feigenbaum. 1990. Memorial resolution: Arthur L. Samuel. Stanford University Historical Society. Archived from the original on 26 May 2011. Retrieved April 29, 2011. https://web.archive.org/web/20110526195107/http://histsoc.stanford.edu/pdfmem/SamuelA.pdf

  • Yang, Z., M.B. Platt, and H.D. Platt. 1999. Probabilistic neural networks in bankruptcy prediction. Journal of Business Research 44: 67–74.

  • Yang, J.C., H.C. Chuang, and C.M. Kuan. 2020. Double machine learning with gradient boosting and its application to the Big N audit quality effect. Journal of Econometrics 216: 268–283.

  • Zang, A.Y. 2012. Evidence on the trade-off between real activities manipulation and accrual-based earnings management. The Accounting Review 87 (2): 675–703.

  • Zhao, Q., and S.S. Bhowmick. 2003. Association rule mining: A survey. Vol. 135. Singapore: Nanyang Technological University.

  • Zhao, Y., Z. Nasrullah, and Z. Li. 2019. PyOD: A Python toolbox for scalable outlier detection. arXiv preprint arXiv:1901.01588.

  • Zhou, Z.-H. 2009. Ensemble learning. In Encyclopedia of biometrics, vol. 1, 270–273. New York: Springer.

Corresponding author

Correspondence to Ting Sun.

Copyright information

© 2021 Springer Nature Switzerland AG

About this entry

Cite this entry

Hu, H., Sun, T. (2021). The Applications of Machine Learning in Accounting and Auditing Research. In: Lee, CF., Lee, A.C. (eds) Encyclopedia of Finance. Springer, Cham. https://doi.org/10.1007/978-3-030-73443-5_91-1

  • DOI: https://doi.org/10.1007/978-3-030-73443-5_91-1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-73443-5

  • Online ISBN: 978-3-030-73443-5

  • eBook Packages: Springer Reference Economics and Finance; Reference Module Humanities and Social Sciences; Reference Module Business, Economics and Social Sciences
