Abstract
Statistics is the science concerned with principles, methods, and techniques for collecting, processing, analyzing, presenting, and interpreting (numerical) data. It can be divided roughly into descriptive statistics (Chap. 1) and inferential statistics (Chap. 2), as we have already suggested. Descriptive statistics summarizes and visualizes the observed data. It is usually not very difficult, but it forms an essential part of reporting (scientific) results. Inferential statistics tries to draw conclusions from the data that hold for part or all of the population from which the data is collected. The theory of probability, which is the topic of the next two theoretical chapters, makes it possible to connect descriptive and inferential statistics. We have already encountered some ideas from probability theory in the previous chapter: we discussed the probability of selecting a specific sample \(\pi _k\), and we briefly defined the notion of probability based on the throw of a die. In this chapter we work out these ideas more formally: we define probabilities, discuss the probabilities of events, and show how to calculate with probabilities. In the previous chapter, when discussing bias, we also encountered the expected population parameter \(\mathbb {E}(T)\), but we have not yet detailed what expectations are exactly; this is something we cover in Chap. 4.
Notes
- 1.
It should be noted here that a probability of zero does not necessarily mean that the event will never occur. This seems contradictory, but we will explain this later. On the other hand, if the event can never occur, the probability is zero.
- 2.
- 3.
If, in this case, the population size(s) were known, we could calculate weighted averages to estimate the population parameters as we did in Chap. 2.
- 4.
Note that Simpson’s Paradox, and its solutions, are still heavily debated (see Armistead 2014 for examples).
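As an illustration of the paradox mentioned in the last note, the following minimal Python sketch compares success rates within strata and after pooling. The counts are the kidney-stone figures as commonly reported from Charig et al. (1986); they are not given in the text above and are used here only as an assumed example. The names `data`, `success_rate`, and `pooled_rate` are hypothetical.

```python
from fractions import Fraction

# (successes, patients) per treatment and stone size.
# Assumption: figures as commonly reported from Charig et al. (1986).
data = {
    "open surgery": {"small": (81, 87), "large": (192, 263)},
    "percutaneous nephrolithotomy": {"small": (234, 270), "large": (55, 80)},
}


def success_rate(successes, patients):
    """Exact success proportion within one stratum."""
    return Fraction(successes, patients)


def pooled_rate(strata):
    """Success proportion after ignoring the stratifying variable."""
    total_successes = sum(s for s, _ in strata.values())
    total_patients = sum(n for _, n in strata.values())
    return Fraction(total_successes, total_patients)


# Success rate per treatment within each stone-size stratum.
stratum_rates = {
    treatment: {size: success_rate(*counts) for size, counts in strata.items()}
    for treatment, strata in data.items()
}

# Success rate per treatment with both stone sizes pooled.
pooled_rates = {treatment: pooled_rate(strata) for treatment, strata in data.items()}

for treatment in data:
    for size, rate in stratum_rates[treatment].items():
        print(f"{treatment:30s} {size:6s} {float(rate):.3f}")
    print(f"{treatment:30s} pooled {float(pooled_rates[treatment]):.3f}")
```

With these counts, open surgery has the higher success rate within both the small-stone and the large-stone stratum, yet the lower rate once the strata are pooled, because the two treatments were applied to very different mixes of stone sizes.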
References
T.W. Armistead, Resurrecting the third variable: a critique of Pearl’s causal analysis of Simpson’s paradox. Am. Stat. 68(1), 1–7 (2014)
C.R. Charig, D.R. Webb, S.R. Payne, J.E. Wickham, Comparison of treatment of renal calculi by open surgery, percutaneous nephrolithotomy, and extracorporeal shockwave lithotripsy. Br. Med. J. (Clin. Res. Ed.) 292(6524), 879–882 (1986)
G. Grimmett, D. Stirzaker et al., Probability and Random Processes (Oxford University Press, Oxford, 2001)
N.P. Jewell, Statistics for Epidemiology (Chapman and Hall/CRC, Boca Raton, 2003)
R. Lanting, E.R. Van Den Heuvel, B. Westerink, P.M. Werker, Prevalence of Dupuytren disease in the Netherlands. Plast. Reconstr. Surg. 132(2), 394–403 (2013)
K.J. Rothman, S. Greenland, T.L. Lash et al., Modern Epidemiology, vol. 3 (Wolters Kluwer Health/Lippincott Williams & Wilkins, Philadelphia, 2008)
E.H. Simpson, The interpretation of interaction in contingency tables. J. Roy. Stat. Soc.: Ser. B (Methodol.) 13(2), 238–241 (1951)
E.P. Veening, R.O.B. Gans, J.B.M. Kuks, Medische Consultvoering (Bohn Stafleu van Loghum, Houten, 2009)
E. White, B.K. Armstrong, R. Saracci, Principles of Exposure Measurement in Epidemiology: Collecting, Evaluating and Improving Measures of Disease Risk Factors (OUP, Oxford, 2008)
F.N. David, Studies in the history of probability and statistics I. Dicing and gaming (a note on the history of probability). Biometrika 42(1/2), 1–5 (1955)
O.B. Sheynin, Early history of the theory of probability. Archive for History of Exact Sciences 17(3), 201–259 (1977)
S.M. Stigler, Studies in the history of probability and statistics. XXXIV: Napoleonic statistics: the work of Laplace. Biometrika 62(2), 503–517 (1975)
Copyright information
© 2022 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Kaptein, M., van den Heuvel, E. (2022). Probability Theory. In: Statistics for Data Scientists . Undergraduate Topics in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-030-10531-0_3
DOI: https://doi.org/10.1007/978-3-030-10531-0_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-10530-3
Online ISBN: 978-3-030-10531-0
eBook Packages: Computer Science, Computer Science (R0)