The Linear Model: t-test and ANOVA

Chapter in: Environmental Data Analysis

Abstract

At the end of this (long) chapter …

  • you will know the t-test in its many different variations;
  • you will understand that the idea of variance analysis is to divide the total variance into explainable and unexplained variance;
  • you will know the F-test for calculating the significance of a variance analysis;
  • you will understand the close relationship between ANOVA and regression.

If you give people a linear model function you give them something dangerous.

—John Fox (fortunes(49))


Notes

  1. We will also briefly consider the case that \(\sigma\) depends on \(\mathbf{X}\), but this is generally not possible in the linear model. The notation f(.) indicates that non-linear functions could also be considered, but we will not do that here.

  2. At this point it is obligatory to note that “Student” was the pseudonym under which W. S. Gosset published the t-test while working for the Guinness Brewery. His employer considered quality-assurance statistics a trade secret. However, Gosset’s colleagues in mathematics knew the person behind the pseudonym.

  3. The central limit theorem states that (most) parameter estimators are (asymptotically) normally distributed, even if the variables under consideration are not normally distributed. For example, if we estimate the median of a sample, this estimate has a certain error because we consider only one sample. The error of this median is normally distributed, although our sample might be crooked and skewed!
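    As a quick illustration (a minimal sketch in R, not from the book; the data and seed are arbitrary), we can resample the median of a skewed sample and look at the distribution of the estimates:

        ## Bootstrap the median of a right-skewed sample
        set.seed(1)
        x <- rexp(100)                   # exponential data: clearly skewed
        meds <- replicate(5000, median(sample(x, replace = TRUE)))
        hist(meds, breaks = 40)          # roughly bell-shaped, as promised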

  4. That’s a men’s size 6.5 for those of you in the UK and a men’s size 7 for you Yankees out there.

  5. In Libre/OpenOffice Calc with the function TDIST or in Microsoft Excel with the function TINV. The reasons why we should never rely on MS Excel for statistical calculations are given by McCullough and Heiser (2008).
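    In R, the corresponding functions are pt() and qt() (a minimal sketch; the values are arbitrary):

        pt(2.0, df = 10, lower.tail = FALSE)  # P(T > 2) for 10 degrees of freedom
        qt(0.975, df = 10)                    # two-sided 5% critical value (about 2.23)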

  6. In fact, the normal distribution is also shown in the right figure, but is indistinguishable from the t-distribution with df = 500.
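    You can convince yourself of this in R (a minimal sketch):

        curve(dnorm(x), from = -4, to = 4, lty = 2)            # normal density, dashed
        curve(dt(x, df = 500), from = -4, to = 4, add = TRUE)  # t-density: the curves coincide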

  7. The phrase “less significant” is rightly frowned upon by statisticians. If we have a threshold for significance, conventionally 0.05, then any value below it is significant, full stop. “Less significant” is akin to “more pregnant”.

  8. See Sect. 11.3.2 on page 160 for a more detailed derivation. For the moment, we can think of it as a measure of how much effort we put into the calculation of means: the more classes, the more degrees of freedom we use. A more reasonable explanation unfortunately has to wait until we look at ANOVA and regression as two sides of the same coin.

  9. The squared correlation coefficient between the observed data \(\mathbf{y}\) and the model fit \(\hat{\mathbf{y}}\) is exactly \(R^2\). You will find both \(r^2\) and \(R^2\) in the literature. In the simple regression model \(R^2 = r^2\) holds, but this is no longer the case with non-linear models. There, \(R^2\) loses its clear interpretability, since the null model is no longer necessarily a sub-model, and thus the comparison of the sums of squares is meaningless.
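    A minimal sketch in R, using the built-in cars data (not an example from the book):

        fit <- lm(dist ~ speed, data = cars)
        cor(cars$dist, fitted(fit))^2   # squared correlation between data and fit
        summary(fit)$r.squared          # the reported R^2: identical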

  10. By the way, there are different ways to calculate the degrees of freedom. It is typically assumed, for example, that all groups contain the same number of data points (a so-called balanced design), which unfortunately is rarely the case in the real world. We therefore use a generally valid equation here, which will also be useful to us later on when we combine ANOVA and regression.
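    In symbols, the general bookkeeping for n data points in k groups (valid also for unbalanced designs) is:

    $$ df_{\text{total}} = n - 1, \quad df_{\text{between}} = k - 1, \quad df_{\text{within}} = n - k. $$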

  11. This is especially important when comparing with other literature. In other textbooks, ANOVA is often suggested for use with categorical predictors only. We see here that this line of thought is a bit narrow-minded.

  12. Well, this actually shouldn’t come as a surprise: we have been using the F-value throughout this chapter to test whether the predictor has a significant influence on the variances.

  13. So we first shift all values so that the smallest value is 0. Then we take the smallest value above 0 and add half of it to all values. If possible, it is better not to use the ANOVA but to stay with a GLM. More on this later (Sect. 11.4).
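    As a minimal sketch in R (illustrative values, not from the book):

        y <- c(-3, 0, 2, 5, 10)          # contains zero and negative values
        y2 <- y - min(y)                 # shift: the smallest value is now 0
        y2 <- y2 + min(y2[y2 > 0]) / 2   # add half of the smallest positive value
        log(y2)                          # the logarithm is now defined for all values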

  14. For positive y-values only (so \(y > 0\)), the Yeo-Johnson transformation presented here is identical to the Box-Cox transformation. However, the Box-Cox transformation only works for positive y-values (and shifts the values if necessary), while the Yeo-Johnson transformation can also appropriately transform negative values without shifting. In their original work, the authors also show that their transformation often gets closer to the normal distribution, and is never worse than the Box-Cox (Yeo and Johnson 2000). Here is the original (two-parameter) Box-Cox transformation (Box and Cox 1964):

    $$ y' = \begin{cases} \big((y+c)^\lambda - 1\big)/\lambda, & \text{if } \lambda \ne 0, \\ \log(y+c), & \text{if } \lambda = 0. \end{cases} $$

    The parameters \(\lambda\) and c (the latter only if y also contains non-positive values) are estimated via the log-likelihood (i.e. fitted to a normal distribution). Since we will not deal with this transformation further (it’s too old school), here are the relevant R functions: bcPower and yjPower in car; yeo.johnson in VGAM; boxcox in MASS.
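    A minimal sketch of two of these functions (assuming the MASS and car packages are installed; the built-in trees data are just an example):

        library(MASS)
        fit <- lm(Volume ~ Height + Girth, data = trees)
        boxcox(fit, lambda = seq(-1, 2, 0.1))   # profile the log-likelihood over lambda
        library(car)
        bcPower(trees$Volume, lambda = 1/3)     # apply a chosen lambda to y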

  15. For example: Duncan’s new multiple range test, the Dunnett test, the Friedman test (non-parametric, therefore usable for the Kruskal-Wallis test), the Scheffé method, the Holm correction, the false discovery rate correction.

    In some of these tests (such as the Newman-Keuls test), the comparisons are first sorted by the difference in mean values and then tested one after another. As soon as a difference is no longer significant, we can stop, since the differences thereafter are even smaller (and the variance is the same everywhere, an assumption of ANOVA). Thus we manage to get by with fewer comparisons, which leads to a less conservative statement than the Bonferroni correction.

    With the frequently used Holm correction, all comparisons are made, but then the P values are sorted and the smallest is corrected just like with Bonferroni (multiplied by the number of comparisons k), the second one is only multiplied by \(k-1\), the third by \(k-2\), and so on. This makes the Holm correction less conservative than the Bonferroni correction.
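    Both corrections are available in base R via p.adjust() (a minimal sketch with made-up P values):

        p <- c(0.001, 0.01, 0.02, 0.04)      # k = 4 raw P values
        p.adjust(p, method = "bonferroni")   # each multiplied by 4
        p.adjust(p, method = "holm")         # multiplied by 4, 3, 2, 1 (kept monotone)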

  16. Analysis of whether the units on the left side of the equation are identical to those on the right.

  17. The deviance is computed (for all practical purposes) as \(-2\ell\). (Actually there is another term, the log-likelihood of the so-called saturated model, that comes in here, but in the context of the GLM this is largely immaterial.) Thus, the ratio of likelihoods is equivalent to the difference of log-likelihoods, which is why the \(\chi^2\)-test is computed on their differences.
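    In R this is the familiar comparison of nested models (a minimal sketch with the built-in warpbreaks data):

        m0 <- glm(breaks ~ 1, data = warpbreaks, family = poisson)        # null model
        m1 <- glm(breaks ~ tension, data = warpbreaks, family = poisson)  # with predictor
        anova(m0, m1, test = "Chisq")   # chi-squared test on the deviance difference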

  18. On the homepage of statistics professor Frank Harrell (Vanderbilt University, Nashville, Tennessee), this tip is listed under Philosophy of Biostatistics as the third point. The first two are also noteworthy: http://biostat.mc.vanderbilt.edu/wiki/Main/FrankHarrell.

  19. A possible reason is that the relationship is not linear and we should insert a squared term: point 4 on Harrell’s list.
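    In R, such a squared term is added with I() (a minimal sketch, again with the built-in cars data):

        fit2 <- lm(dist ~ speed + I(speed^2), data = cars)
        summary(fit2)   # a significant I(speed^2) coefficient indicates curvature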

References

  1. Box, G. E. P., & Cox, D. R. (1964). The analysis of transformations. Journal of the Royal Statistical Society B, 26, 211–252.

  2. Crawley, M. J. (2002). Statistical computing. An introduction to data analysis using S-Plus. Chichester: Wiley.

  3. Crawley, M. J. (2007). The R Book. Chichester, UK: Wiley.

  4. Dalgaard, P. (2002). Introductory statistics with R. Berlin: Springer.

  5. Day, R. W., & Quinn, G. P. (1989). Comparisons of treatments after an analysis of variance in ecology. Ecological Monographs, 59, 433–463.

  6. Fisher, R. A. (1918). The correlation between relatives on the supposition of Mendelian inheritance. Transactions of the Royal Society of Edinburgh, 52, 399–433.

  7. Fisher, R. A. (1925). Statistical methods for research workers. Edinburgh: Oliver & Boyd.

  8. Mann, H. B. (1949). Analysis and design of experiments: Analysis of variance and analysis of variance designs. New York: Dover Publications.

  9. McCullough, B. D., & Heiser, D. A. (2008). On the accuracy of statistical procedures in Microsoft Excel 2007. Computational Statistics & Data Analysis, 52, 4570–4578.

  10. O’Hara, R. B., & Kotze, D. J. (2010). Do not log-transform count data. Methods in Ecology and Evolution, 1(2), 118–122.

  11. Quinn, G. P., & Keough, M. J. (2002). Experimental design and data analysis for biologists. Cambridge, UK: Cambridge University Press.

  12. Underwood, A. J. (1997). Experiments in ecology: Their logical design and interpretation using analysis of variance. Cambridge, UK: Cambridge University Press.

  13. Warton, D. I., & Hui, F. K. C. (2011). The arcsine is asinine: The analysis of proportions in ecology. Ecology, 92, 3–10.

  14. Withers, C. S., & Nadarajah, S. (2014). Simple alternatives for Box-Cox transformations. Metrika, 77(2), 297–315.

  15. Yeo, I.-K., & Johnson, R. A. (2000). A new family of power transformations to improve normality or symmetry. Biometrika, 87(4), 954–959.

  16. Zar, J. H. (2013). Biostatistical analysis (5th ed.). Pearson.

  17. Zuur, A. F., Ieno, E. N., Walker, N. J., Saveliev, A. A., & Smith, G. M. (2009). Mixed effects models and extensions in ecology with R. Berlin: Springer.


Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Dormann, C. (2020). The Linear Model: t-test and ANOVA. In: Environmental Data Analysis. Springer, Cham. https://doi.org/10.1007/978-3-030-55020-2_11
