Conditions for ANOVA

1. Conditions for ANOVA

Just like any other statistical inference method we've encountered so far, there are conditions that need to be met for ANOVA as well.

2. Conditions for ANOVA

There are three main conditions for ANOVA. The first one is independence: within groups, the sampled observations must be independent of each other, and between groups, the groups must be independent of each other as well. We also need approximate normality, that is, the distribution of the response variable within each group should be nearly normal. Finally, we need constant variance, that is, the variability of the response variable should be roughly equal across groups. Next we'll discuss each condition in more detail.

3. Independence

Let's start with the independence condition. Within groups, we want the sampled observations to be independent, which we can assume to be the case if we have random sampling or random assignment and, when sampling without replacement, each sample size is less than 10% of its respective population. This condition is always important, but it can be difficult to check if we don't have sufficient information on how the study was designed and how the data were collected. Between groups, we want the groups to be independent of each other. This requires carefully considering whether there is a paired structure between the groups. If the answer is yes, this is not the end of the world, but it requires a different, slightly more advanced version of ANOVA, called repeated measures ANOVA, for a correct analysis of such data. So the ANOVA we are learning in this course will only work in circumstances where the groups are independent.

4. Approximately normal

We also need the distribution of the response variable within each group to be approximately normal. This condition is especially important when the sample sizes are small. We can check this condition using appropriate visualizations, which you'll get to do in the following exercises.
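As a numeric companion to a normal Q-Q plot, here is a minimal Python sketch of a per-group normality check. The data, the group names, and the helper functions `qq_points`, `pearson`, and `qq_correlation` are all hypothetical and exist only for illustration. The idea is to pair each sorted observation with the standard normal quantile it would be expected to match; a correlation close to 1 suggests the points fall near a straight line, which is what we would look for in the plot.

```python
from math import sqrt
from statistics import NormalDist, mean


def qq_points(sample):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    n = len(sample)
    observed = sorted(sample)
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    # Plotting positions (i - 0.5) / n stay strictly inside (0, 1).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return theoretical, observed


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def qq_correlation(sample):
    """Correlation between theoretical and observed quantiles."""
    theoretical, observed = qq_points(sample)
    return pearson(theoretical, observed)


# Hypothetical scores for three groups (made-up numbers, not course data).
scores_by_class = {
    "lower": [3, 4, 5, 5, 6, 6, 7, 8],
    "middle": [4, 5, 6, 6, 7, 7, 8, 9],
    "upper": [5, 6, 7, 7, 8, 8, 9, 10],
}

qq_results = {group: qq_correlation(values)
              for group, values in scores_by_class.items()}

for group, r in qq_results.items():
    print(f"{group}: Q-Q correlation = {r:.3f}")
```

This is a rough screening aid under the stated assumptions, not a formal test; with small samples it is still worth looking at the plot itself.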

5. Constant variance

Lastly, we need constant variance across groups; in other words, variability should be consistent across groups. A commonly used term for this is homoscedasticity. This condition is especially important when the sample sizes differ between groups. We can use visualizations and/or summary statistics to check this condition.
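One way to check constant variance with summary statistics is to compare the standard deviations of the groups. The Python sketch below uses made-up data and a hypothetical helper, `sd_ratio`, that returns each group's sample standard deviation along with the ratio of the largest to the smallest. An informal rule of thumb some textbooks use is to be cautious when this ratio is much larger than about 2; that cutoff is an assumption here, not something stated in this lesson.

```python
from statistics import stdev


def sd_ratio(groups):
    """Return per-group sample SDs and the ratio of largest to smallest SD.

    Assumes every group has at least two observations and a nonzero SD.
    """
    sds = {name: stdev(values) for name, values in groups.items()}
    ratio = max(sds.values()) / min(sds.values())
    return sds, ratio


# Hypothetical scores with unequal group sizes (made-up numbers).
scores_by_class = {
    "lower": [3, 4, 5, 5, 6, 6, 7, 8],
    "middle": [2, 4, 5, 6, 7, 8, 10],
    "upper": [5, 6, 7, 8, 9],
}

group_sds, ratio = sd_ratio(scores_by_class)

for name, sd in group_sds.items():
    print(f"{name}: n = {len(scores_by_class[name])}, sd = {sd:.2f}")
print(f"largest / smallest sd = {ratio:.2f}")
```

Printing the group sizes next to the standard deviations is deliberate, since unequal variances matter most when the sample sizes differ.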

6. Let's practice!

Next we'll check the conditions for the vocabulary score vs. social class ANOVA that we have been working on.