Abstract
The term “machine learning” has become a buzzword in the past few years. In accounting and auditing, major accounting firms such as the Big 4 already use this technology, but research on it is still evolving. Increased use of machine learning and other artificial intelligence techniques will allow accountants to focus on providing better decision support instead of on data gathering and manual analyses. This entry introduces machine learning and compares it with traditional statistical modeling, discusses its current applications in accounting and auditing research, and provides directions for future research.
Notes
- 1. The data source is https://www.kaggle.com/dgomonov/new-york-city-airbnb-open-data. In this application, explanatory variables, including room type, geographical availability, and the number of reviews per month, are treated as the independent variables used to predict prices.
- 2. The data source is http://yann.lecun.com/exdb/mnist/
- 3. One example in this domain is https://www.kaggle.com/vjchoudhary7/customer-segmentation-tutorial-in-python. The task for this data set is to segment customers based on their behavior-related attributes by applying the K-means clustering algorithm.
- 4. A bootstrap replicate is obtained by randomly sampling the training set with replacement. This operation generates a new training set whose size equals that of the original one.
- 5. The statistical problem arises when the training set does not provide enough information to select a single learner, because multiple distinct learners achieve the same accuracy on the training set.
- 6. Searching for the best hypothesis (e.g., a neural network) that fits the training data may be computationally intractable.
- 7. The approximations to the true target function that are generated by single learners may not be ideal.
- 8. “Complexity” means that the decision tree model generates a plethora of rules, resulting in overfitting.
- 9. For details of the cross-validation method, see https://scikit-learn.org/stable/modules/cross_validation.html
- 10. More details are available at https://www.tensorflow.org/
- 11. Shapley, Lloyd S. (August 21, 1951). “Notes on the n-Person Game -- II: The Value of an n-Person Game” (PDF). Santa Monica, Calif.: RAND Corporation.
- 12. Read “Cooperative game theory assumes that groups of players, called coalitions, are the primary units of decision-making, and may enforce cooperative behavior.” (Choudhary 2019). https://www.analyticsvidhya.com/blog/2019/11/shapley-value-machine-learning-interpretability-game-theory/
- 13. Local surrogate models are interpretable models that are used to explain individual predictions of black box machine learning models.
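The bootstrap replicate described in note 4 can be illustrated with a minimal sketch. The function name `bootstrap_replicate`, the toy training set, and the fixed random seed are assumptions made for illustration only; they do not come from the entry or from any particular library.

```python
import random


def bootstrap_replicate(training_set, rng):
    """Return one bootstrap replicate of `training_set`.

    Observations are drawn uniformly at random WITH replacement, and the
    replicate has the same number of observations as the original set.
    """
    n = len(training_set)
    # n independent draws from the index range 0..n-1; repeats are allowed.
    indices = [rng.randrange(n) for _ in range(n)]
    return [training_set[i] for i in indices]


# Hypothetical toy training set of (feature, label) pairs.
training_set = [(x, 2 * x + 1) for x in range(10)]

rng = random.Random(0)  # seeded only so that the sketch is reproducible
replicate = bootstrap_replicate(training_set, rng)

print(len(replicate) == len(training_set))  # same size as the original set
print(len(set(replicate)))                  # number of distinct observations drawn
```

Because sampling is done with replacement, some observations typically appear more than once in a replicate while others are left out, which is what makes the replicates differ from one another when they are used to train the members of a bagging ensemble.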
References
Adadi, A., and M. Berrada. 2018. Peeking inside the black-box: A survey on explainable artificial intelligence (XAI). IEEE Access 6: 52138–52160.
Agrawal, R., T. Imieliński, and A. Swami. 1993. Mining association rules between sets of items in large databases. Proceedings of the 1993 ACM SIGMOD international conference on Management of data, pp 207–216.
Alpaydin, E. 2020. Introduction to machine learning. Cambridge, MA: MIT Press.
Altman, E.I. 1968. Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. The Journal of Finance 23: 589–609.
Analyticsai, C. 2020. AnalyticsAI for every engagement [Online]. Available: https://www.caseware.com/us/analyticsai. Accessed.
Anand, V., R. Brunner, K. Ikegwu, and T. Sougiannis. 2019. Predicting profitability using machine learning. Available at SSRN 3466478.
Anthony, M., and P.L. Bartlett. 2009. Neural network learning: Theoretical foundations. Cambridge: Cambridge University Press.
Apley, D.W. 2016. Visualizing the effects of predictor variables in black box supervised learning models. arXiv preprint arXiv:1612.08468.
Apley, D.W., and J. Zhu. 2020. Visualizing the effects of predictor variables in black box supervised learning models. Journal of the Royal Statistical Society, Series B: Statistical Methodology 82 (4): 1059–1086.
Bao, Y., and A. Datta. 2014. Simultaneously discovering and quantifying risk types from textual risk disclosures. Management Science 60: 1371–1391.
Bao, Y., B. Ke, B. Li, Y.J. Yu, and J. Zhang. 2020. Detecting accounting fraud in publicly traded US firms using a machine learning approach. Journal of Accounting Research 58 (1): 199–235.
Barboza, F., H. Kimura, and E. Altman. 2017. Machine learning models and bankruptcy prediction. Expert Systems with Applications 83: 405–417.
Barth, M.E., K. Li, and C. McClure. 2021. Evolution in value relevance of accounting information. Stanford University Graduate School of Business Research Paper No. 17-24. Available at SSRN: https://ssrn.com/abstract=2933197 or https://doi.org/10.2139/ssrn.2933197.
Beneish, M.D. 1999. The detection of earnings manipulation. Financial Analysts Journal 55: 24–36.
Bertomeu, J. 2020. Machine learning improves accounting: Discussion, implementation and research opportunities. Review of Accounting Studies 25: 1135–1155.
Bertomeu, J., E. Cheynel, E. Floyd, and W. Pan. 2020. Using machine learning to detect misstatements. Review of Accounting Studies 26: 1–52.
Bishop, C.M. 2006. Pattern recognition and machine learning. Springer.
Breiman, L. 1996. Bagging predictors. Machine Learning 24: 123–140.
Brown, N.C., R.M. Crowley, and W.B. Elliott. 2020. What are you saying? Using topic to detect financial misreporting. Journal of Accounting Research 58 (1): 237–291.
Brown-Liburd, H., A. Cheong, M.A. Vasarhelyi, and X. Wang. 2019. Measuring with exogenous data (MED), and government economic monitoring (GEM). Journal of Emerging Technologies in Accounting 16 (1): 1–19.
Bzdok, D., N. Altman, and M. Krzywinski. 2018. Points of significance: Statistics versus machine learning. Nature Methods 15 (4): 233–234. https://www.nature.com/articles/nmeth.4642.pdf?origin=ppub.
Carton, R.B., and C.W. Hofer. 2006. Measuring organizational performance: Metrics for entrepreneurship and strategic management research. Edward Elgar Publishing.
Cecchini, M., H. Aytug, G.J. Koehler, and P. Pathak. 2010a. Making words work: Using financial text as a predictor of financial events. Decision Support Systems 50: 164–175.
Cecchini, M., H. Aytug, G.J. Koehler, and P. Pathak. 2010b. Detecting management fraud in public companies. Management Science 56 (7): 1146–1160. https://doi.org/10.1287/mnsc.1100.1174.
Chen, M.-S., J. Han, and P.S. Yu. 1996. Data mining: An overview from a database perspective. IEEE Transactions on Knowledge and Data Engineering 8: 866–883.
Cho, S., M.A. Vasarhelyi, T. Sun, and C. Zhang. 2020. Learning from machine learning in accounting and assurance. Journal of Emerging Technologies in Accounting.
Chollet, F. 2017. Deep learning with Python. Shelter Island: Manning Publications Company.
Choudhary, A. 2019. A unique method for machine learning interpretability: Game theory & Shapley values. Analytics Vidhya. https://www.analyticsvidhya.com/blog/2019/11/shapley-value-machine-learning-interpretability-game-theory/.
Dechow, P.M., and I.D. Dichev. 2002. The quality of accruals and earnings: The role of accrual estimation errors. The Accounting Review 77: 35–59.
Dechow, P.M., W. Ge, C.R. Larson, and R.G. Sloan. 2011. Predicting material accounting misstatements. Contemporary Accounting Research 28 (1): 17–82.
Dietterich, T.G. 2002. Ensemble learning. In The handbook of brain theory and neural networks, vol. 2, 110–125. Cambridge, MA: MIT Press.
Ding, K., B. Lev, X. Peng, T. Sun, and M.A. Vasarhelyi. 2020. Machine learning improves accounting estimates: Evidence from insurance payments. Available at SSRN 3253220.
Expert.ai. 2020. What is machine learning? A definition. https://www.expert.ai/blog/machine-learning-definition/
Foote, K.D. 2019. A brief history of machine learning. Data Topics. Dataversity. https://www.dataversity.net/a-brief-history-of-machine-learning/
Frankel, R., J. Jennings, and J. Lee. 2016. Using unstructured and qualitative disclosures to explain accruals. Journal of Accounting and Economics 62: 209–227.
Ghahramani, Z. 2015. Probabilistic machine learning and artificial intelligence. Nature 521 (7553): 452–459. https://www.repository.cam.ac.uk/bitstream/handle/1810/248538/Ghahramani%202015%20Nature.pdf;jsessionid=3DB2D31FFA80196A97AEEBECB06FEF42?sequence=1.
Goel, S., J. Gangolly, S.R. Faerman, and O. Uzuner. 2010. Can linguistic predictors detect fraudulent financial filings? Journal of Emerging Technologies in Accounting 7: 25–46.
Hammond, K. 2016. 5 unexpected sources of bias in artificial intelligence. Available at: https://techcrunch.com/2016/12/10/5-unexpected-sources-of-bias-in-artificial-intelligence/
Healthcare.ai. 2020. Machine learning versus statistics: When to use each. Data Science Blog. https://healthcare.ai/machine-learning-versus-statistics-use/
Hebb, D.O. 1949. The organization of behavior: A neuropsychological theory. New York, London: J. Wiley, Chapman & Hall. http://s-f-walker.org.uk/pubsebooks/pdfs/The_Organization_of_Behavior-Donald_O._Hebb.pdf.
Heller, M. 2019. Machine learning algorithms explained [Online]. Available: https://www.infoworld.com/article/3394399/machine-learning-algorithms-explained.html. Accessed.
Huang, X.S., and L. Sun. 2017. Managerial ability and real earnings management. Advances in Accounting 39: 91–104.
Huang, A.H., A.Y. Zang, and R. Zheng. 2014. Evidence on the information content of text in analyst reports. The Accounting Review 89: 2151–2180.
Hu, H., T. Sun, M.A. Vasarhelyi, and M. Zhang. 2020. A machine learning approach of measuring audit quality: Evidence from China. Available at SSRN 3732563.
Huang, A.H., R. Lehavy, A.Y. Zang, and R. Zheng. 2018. Analyst information discovery and interpretation roles: A topic modeling approach. Management Science 64: 2833–2855.
Hunt, J.O., D.M. Rosser, and S.P. Rowe. 2021. Using machine learning to predict auditor switches: How the likelihood of switching affects audit quality among non-switching clients. Journal of Accounting and Public Policy 40 (5): 106785.
Khalid, S., T. Khalil, and S. Nasreen. 2014. A survey of feature selection and feature extraction techniques in machine learning. 2014 Science and information conference. IEEE, 372–378.
Kim, H.S., and S.Y. Sohn. 2010. Support vector machines for default prediction of SMEs based on technology credit. European Journal of Operational Research 201: 838–846.
Kober, J., J.A. Bagnell, and J. Peters. 2013. Reinforcement learning in robotics: A survey. The International Journal of Robotics Research 32: 1238–1274.
LeCun, Y., Y. Bengio, and G. Hinton. 2015. Deep learning. Nature 521: 436–444.
Lefkowitz, M. 2019. Professor’s perceptron paved the way for AI: 60 years too soon. Cornell Chronicle. https://news.cornell.edu/stories/2019/09/professors-perceptron-paved-way-ai-60-years-too-soon
Li, F. 2010. The information content of forward-looking statements in corporate filings – A naïve Bayesian machine learning approach. Journal of Accounting Research 48: 1049–1102.
Odom, M.D., and R. Sharda. 1990. A neural network model for bankruptcy prediction. 1990 IJCNN International Joint Conference on neural networks. IEEE, 163–168.
Ohlson, J.A. 1980. Financial ratios and the probabilistic prediction of bankruptcy. Journal of Accounting Research 18: 109–131.
Olson, D.L., D. Delen, and Y. Meng. 2012. Comparative analysis of data mining methods for bankruptcy prediction. Decision Support Systems 52: 464–473.
Pedregosa, F., G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, and V. Dubourg. 2011. Scikit-learn: Machine learning in Python. The Journal of Machine Learning Research 12: 2825–2830.
Perols, J. 2011. Financial statement fraud detection: An analysis of statistical and machine learning algorithms. Auditing: A Journal of Practice & Theory 30 (2): 19–50.
Perols, J.L., R.M. Bowen, C. Zimmermann, and B. Samba. 2017. Finding needles in a haystack: Using data analytics to improve fraud prediction. The Accounting Review 92 (2): 221–245.
Platt, H.D., M.B. Platt, and J.G. Pedersen. 1994. Bankruptcy discrimination with real variables. Journal of Business Finance & Accounting 21: 491–510.
Provalis Research. 2017. Blogs on Text Analytics: A Brief History of Machine Learning. https://provalisresearch.com/blog/brief-historymachine-learning/.
Purda, L., and D. Skillicorn. 2015. Accounting variables, deception, and a bag of words: Assessing the tools of fraud detection. Contemporary Accounting Research 32: 1193–1223.
Rosenblatt, F. 1957. The perceptron: A perceiving and recognizing automaton (Project Para). https://blogs.umass.edu/brain-wars/files/2016/03/rosenblatt-1957.pdf
Roth, A.E., ed. 1988. The Shapley value: Essays in honor of Lloyd S. Shapley. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511528446. ISBN 0-521-36177-X.
Sallab, A.E., M. Abdou, E. Perot, and S. Yogamani. 2017. Deep reinforcement learning framework for autonomous driving. Electronic Imaging 2017: 70–76.
Shalev-Shwartz, S., and S. Ben-David. 2014. Understanding machine learning: From theory to algorithms. Cambridge: Cambridge University Press.
Shaw, R. 2017. Top 10 machine learning algorithms for beginners [Online]. KDnuggets. Available: https://www.kdnuggets.com/2017/10/top-10-machine-learning-algorithms-beginners.html. Accessed.
Shin, K.-S., T.S. Lee, and H.-J. Kim. 2005. An application of support vector machines in bankruptcy prediction model. Expert Systems with Applications 28: 127–135.
Sidhu, H. 2019. How audit digitization reflects a transformative age. Available at: https://www.ey.com/en_gl/digital-audit/auditdigitization-transformative-age
Sun, T. 2019. Applying deep learning to audit procedures: An illustrative framework. Accounting Horizons 33 (3): 89–109.
Sutton, R.S., and A.G. Barto. 2018. Reinforcement learning: An introduction. Cambridge, MA: MIT Press.
Tsai, C.-F., Y.-F. Hsu, and D.C. Yen. 2014. A comparative study of classifier ensembles for bankruptcy prediction. Applied Soft Computing 24: 977–984.
Van Den Bogaerd, M., and W. Aerts. 2011. Applying machine learning in accounting research. Expert Systems with Applications 38: 13414–13424.
Van Der Maaten, L., E. Postma, and J. Van Den Herik. 2009. Dimensionality reduction: A comparative. Journal of Machine Learning Research 10: 13.
Wiederhold Gio, John McCarthy, and Ed Feigenbaum. 1990. “Memorial resolution: Arthur L. Samuel” (PDF). Stanford University Historical Society. Archived from the original (PDF) on 26 May 2011. Retrieved April 29, 2011. https://web.archive.org/web/20110526195107/http://histsoc.stanford.edu/pdfmem/SamuelA.pdf
Yang, Z., M.B. Platt, and H.D. Platt. 1999. Probabilistic neural networks in bankruptcy prediction. Journal of Business Research 44: 67–74.
Yang, J.C., H.C. Chuang, and C.M. Kuan. 2020. Double machine learning with gradient boosting and its application to the Big N audit quality effect. Journal of Econometrics 216: 268–283.
Zang, A.Y. 2012. Evidence on the trade-off between real activities manipulation and accrual-based earnings management. The Accounting Review 87 (2): 675–703.
Zhao, Q., and S.S. Bhowmick. 2003. Association rule mining: A survey. Vol. 135. Singapore: Nanyang Technological University.
Zhao, Y., Z. Nasrullah, and Z. Li. 2019. PyOD: A Python toolbox for scalable outlier detection. arXiv preprint arXiv:1901.01588.
Zhou, Z.-H. 2009. Ensemble learning. In Encyclopedia of biometrics, vol. 1, 270–273. New York: Springer.
Copyright information
© 2021 Springer Nature Switzerland AG
About this entry
Cite this entry
Hu, H., Sun, T. (2021). The Applications of Machine Learning in Accounting and Auditing Research. In: Lee, CF., Lee, A.C. (eds) Encyclopedia of Finance. Springer, Cham. https://doi.org/10.1007/978-3-030-73443-5_91-1
DOI: https://doi.org/10.1007/978-3-030-73443-5_91-1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-73443-5
Online ISBN: 978-3-030-73443-5
eBook Packages: Springer Reference Economics and Finance; Reference Module Humanities and Social Sciences; Reference Module Business, Economics and Social Sciences