Exercise

ANOVA: Get the means for different subgroups

Analysis of variance (ANOVA) tests whether the mean of a continuous variable differs between the groups formed by a categorical variable. It does so by comparing the variance between the groups with the variance within them.

In R, the by() function computes a statistic, here the mean, for each subset of a variable. Four arguments are used:

  • \(1^{st}\) argument: the variable for whose subsets means are computed.
  • \(2^{nd}\) argument: the factor variable that forms the subsets.
  • \(3^{rd}\) argument: the function to be applied on the subsets of the variable: mean.
  • \(4^{th}\) argument: na.rm is set to TRUE; it is passed on to mean so that missing values are removed before averaging.
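
As a minimal sketch of the four-argument call described above, the example below uses R's built-in iris data rather than the exercise data, so the variable names Sepal.Length and Species are stand-ins and not part of this exercise.

```r
# Illustration only: iris ships with base R and is NOT the exercise dataset.
# 1st argument: the variable to summarise
# 2nd argument: the factor that forms the subsets
# 3rd argument: the function applied to each subset
# 4th argument: na.rm = TRUE, passed on to mean()
group_means <- by(iris$Sepal.Length, iris$Species, mean, na.rm = TRUE)

# Printing shows one mean per level of Species
print(group_means)
```

The result holds one value per level of the factor, so a factor with three levels yields three means.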

The use of the function is illustrated with the talent dataset. We would like to compare the mean of the variable english across the subsets formed by the categorical variable region. Because region has 9 levels, the means are calculated for 9 groups of English results (english).
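
To see why na.rm matters for grouped means, here is a small sketch on made-up data. The data frame talent_toy and its values are hypothetical; it only borrows the column names english and region, and it has three regions instead of nine.

```r
# Hypothetical toy data, NOT the real talent dataset
talent_toy <- data.frame(
  english = c(70, 80, NA, 60, 90, 75),
  region  = factor(c("north", "north", "south", "south", "west", "west"))
)

# Without na.rm, a group containing a missing value has mean NA
means_with_na <- by(talent_toy$english, talent_toy$region, mean)

# With na.rm = TRUE, missing values are dropped within each group first
means_no_na <- by(talent_toy$english, talent_toy$region, mean, na.rm = TRUE)

print(means_with_na)
print(means_no_na)
```

In this toy data only the south group contains a missing value, so it is the only group whose mean changes between the two calls.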

Instructions
  • Apply the by() function to compute the means of english for the subsets formed by the categorical variable region.