Abstract
In recent years, sports outcome prediction has gained popularity, as demonstrated by the massive financial transactions in sports betting. Basketball, and in particular the National Basketball Association (NBA) of the United States, is one of the world’s most popular sports: it attracts millions of fans worldwide and a large betting market. This paper proposes a new intelligent machine learning framework for predicting the results of NBA games by discovering the influential feature set that affects game outcomes. We investigate whether machine learning methods are applicable to forecasting the outcome of an NBA game using historical data (previously played games), and which factors significantly affect the outcome. To achieve these objectives, we select several machine learning methods that utilise different learning schemes to derive their models, including Naïve Bayes, artificial neural networks, and decision trees. By comparing the performance of the derived models across different feature sets related to basketball games, we can identify the key features that contribute to better performance, such as the accuracy and efficiency of the prediction model. Based on the analysis of the results, DRB (defensive rebounds) was identified as the most significant factor influencing the result of an NBA game. Furthermore, other crucial factors such as TPP (three-point percentage), FT (free throws made), and TRB (total rebounds) were also selected, which subsequently increased the model’s prediction accuracy by 2–4%.
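The comparison described above, training a classifier on different feature subsets and checking how accuracy changes, can be illustrated with a minimal sketch. It is not the authors' pipeline: it uses a small hand-written Gaussian Naïve Bayes, a simple hold-out split, and entirely synthetic data. The column names follow the abbreviations in the abstract, while the `NOISE` column, the distribution parameters, and all function names are assumptions made for illustration.

```python
import math
import random
from statistics import mean, pvariance

# Hypothetical column order; abbreviations follow the abstract, NOISE is an
# invented uninformative column added for contrast.
FEATURES = ["DRB", "TPP", "FT", "TRB", "NOISE"]

def make_synthetic_games(n=400, seed=0):
    """Generate invented box-score rows; these are NOT real NBA statistics."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        win = rng.random() < 0.5
        edge = 1.0 if win else 0.0  # winners get slightly better numbers
        rows.append([
            rng.gauss(32 + 3.0 * edge, 5.0),      # DRB
            rng.gauss(0.35 + 0.02 * edge, 0.07),  # TPP
            rng.gauss(17 + 1.5 * edge, 5.0),      # FT
            rng.gauss(43 + 2.5 * edge, 6.0),      # TRB
            rng.gauss(0.0, 1.0),                  # NOISE
        ])
        labels.append("W" if win else "L")
    return rows, labels

def fit_gaussian_nb(rows, labels):
    """Estimate a prior and per-feature (mean, variance) for each class."""
    model = {}
    for cls in set(labels):
        cls_rows = [r for r, y in zip(rows, labels) if y == cls]
        # A small variance floor avoids division by zero.
        stats = [(mean(col), pvariance(col) + 1e-9) for col in zip(*cls_rows)]
        model[cls] = (len(cls_rows) / len(rows), stats)
    return model

def predict(model, row):
    """Return the class with the highest log-posterior for one row."""
    best_cls, best_score = None, -math.inf
    for cls, (prior, stats) in model.items():
        score = math.log(prior)
        for x, (mu, var) in zip(row, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
        if score > best_score:
            best_cls, best_score = cls, score
    return best_cls

def holdout_accuracy(rows, labels, columns, train_fraction=0.7):
    """Train on the first part of the data and score on the rest, using only `columns`."""
    subset = [[r[c] for c in columns] for r in rows]
    cut = int(len(subset) * train_fraction)
    model = fit_gaussian_nb(subset[:cut], labels[:cut])
    hits = sum(predict(model, r) == y for r, y in zip(subset[cut:], labels[cut:]))
    return hits / (len(subset) - cut)

# Compare a few feature subsets on the synthetic data.
rows, labels = make_synthetic_games()
for names in (["NOISE"], ["DRB"], ["DRB", "TPP", "FT", "TRB"]):
    cols = [FEATURES.index(name) for name in names]
    print(names, round(holdout_accuracy(rows, labels, cols), 3))
```

The printed accuracies depend only on the invented distributions above and say nothing about real NBA games; the sketch shows the shape of a feature-subset comparison, not the reported results.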
Cite this article
Thabtah, F., Zhang, L. & Abdelhamid, N. NBA Game Result Prediction Using Feature Analysis and Machine Learning. Ann. Data. Sci. 6, 103–116 (2019). https://doi.org/10.1007/s40745-018-00189-x
DOI: https://doi.org/10.1007/s40745-018-00189-x