Abstract
Naive Bayes is one of the most efficient classification algorithms and probably the simplest, which makes it a natural choice when you want quick predictions on a high-dimensional dataset. Even with several thousand data points and many features, it trains quickly and is easy to build, so it supports fast machine learning models that can deliver predictions in real time. The algorithm is based on Bayes’ theorem, and I present a detailed computation of the various probability terms of Bayes’ theorem so that you understand how the algorithm works. I also discuss the advantages and disadvantages of the algorithm, where it can be applied and where it cannot, and I give you a few techniques for improving its performance. The sklearn library provides several implementations corresponding to the Naive Bayes variants, such as Multinomial, Bernoulli, and so on; you will learn these various types. Last, I discuss how to fit the model on huge datasets, followed by a complete classification example on a large text corpus.
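For orientation, Bayes’ theorem states that P(y|x) = P(x|y) P(y) / P(x), and the sklearn variants mentioned above differ mainly in how the likelihood P(x|y) is modeled. The following minimal sketch is not taken from the chapter: the toy corpus, labels, and parameter choices are assumptions made purely for illustration. It shows a Multinomial Naive Bayes classifier fitted on word-count features, and notes the partial_fit call commonly used for out-of-core training on large datasets.

    # Illustrative sketch only: the corpus, labels, and settings below are
    # assumptions for demonstration, not examples from the chapter.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    docs = ["cheap pills buy now", "meeting agenda attached",
            "win a free prize now", "project status update"]
    labels = [1, 0, 1, 0]  # 1 = spam, 0 = ham (toy labels)

    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(docs)  # word-count features suit MultinomialNB

    model = MultinomialNB()
    model.fit(X, labels)

    print(model.predict(vectorizer.transform(["free prize pills"])))

    # For datasets too large to fit in memory, the same estimator can be
    # trained incrementally on mini-batches via partial_fit, passing the
    # full list of classes on the first call:
    # model.partial_fit(X_batch, y_batch, classes=[0, 1])

BernoulliNB follows the same fit/predict pattern but models binary (presence/absence) features rather than counts.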
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Sarang, P. (2023). Naive Bayes. In: Thinking Data Science. The Springer Series in Applied Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-031-02363-7_7
DOI: https://doi.org/10.1007/978-3-031-02363-7_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-02362-0
Online ISBN: 978-3-031-02363-7
eBook Packages: Mathematics and Statistics (R0)