Lecture 10: ANOVA

Published

June 30, 2025

We want to emphasize that nearly any statistical method with a closed-form solution can be replicated with a simulation.

Take ANOVA, or analysis of variance, for example. The ANOVA test is used to determine whether there are any statistically significant differences between the means of two or more groups. It is a parametric test that assumes the data is normally distributed and that the variances of the groups are equal.

The exact test we compute is the F-test, which compares the variance between groups to the variance within groups. The null hypothesis is that all group means are equal, while the alternative hypothesis is that at least one group mean is different. The F-test relies on the fact that the ratio of two independent chi-squared variables (i.e. a normal variable, squared) follows an F-distribution.

Here we will simulate the F-test by generating random data from a normal distribution and computing the F-statistic for different group means. We will see that under the null hypothesis (group means are equal), the F-statistic follows an F-distribution with degrees of freedom equal to the number of groups minus one and the total number of observations minus the number of groups.

In the next lecture we will compare the F-test to a non-parametric permutation test, which does not make any assumptions about the distribution of the data.