
Between group sum of squares

To calculate the F-value, you need the ratio of the variance between groups to the variance within groups. Each of these variances (i.e. mean squares) is obtained from a sum of squares, so you have to calculate the sums of squares first.
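
In symbols, this relationship can be sketched as follows. The names \(ms_a\), \(ms_{s/a}\), \(ss_{s/a}\), \(df_a\), and \(df_{s/a}\) are assumed here by analogy with the \(ss_a\) notation used in this exercise, and the degrees of freedom shown assume a one-way design with \(a\) groups of \(n\) subjects each.

$$\begin{aligned} F & = \frac{ms_a}{ms_{s/a}}, & ms_a & = \frac{ss_a}{df_a}, & ms_{s/a} & = \frac{ss_{s/a}}{df_{s/a}}, & df_a & = a - 1, & df_{s/a} & = a(n - 1) \end{aligned}$$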

Let's start with the between group sum of squares. The formula for the calculation of the between group sum of squares is

$$\begin{aligned} ss_a & = n \sum(y_j - y_t)^2 \end{aligned}$$

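where \(y_j\) are the group means, \(y_t\) is the grand mean, and \(n\) is the number of subjects in each group. Because \(n\) sits outside the sum, this form of the formula assumes that every group has the same number of subjects.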
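
As a quick illustration with made-up numbers (not the wm data), suppose there are three groups of \(n = 5\) subjects with group means 1, 3, and 8, so that the grand mean is 4. Applying the formula then gives

$$\begin{aligned} ss_a & = 5 \left[(1 - 4)^2 + (3 - 4)^2 + (8 - 4)^2\right] \\ & = 5 \, (9 + 1 + 16) \\ & = 130 \end{aligned}$$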

Remember that the working memory experiment investigates the relationship between the change in IQ and the number of training sessions. Calculate the between group sum of squares for the data from this experiment. The data set wm is still loaded in your workspace.

This exercise is part of the course

Intro to Statistics with R: Analysis of Variance (ANOVA)


Exercise instructions

  • Determine the number of subjects in each group and store the result in n. If you don't know the number of subjects in each group, you can always print the data to the console or use more creative means of figuring it out.
  • Use tapply() to compute the group means and save the result to y_j. tapply() allows you to perform an operation on iq once for each level of cond, so it can calculate each group mean. The first argument should be the column of data for which you want to calculate the means, and the second argument should be the column that indicates which group each subject belongs to.
  • Compute the grand mean and assign the result to y_t. This is just the mean of all IQ gains in the data.
  • You now have all the ingredients to calculate the between group sum of squares by applying the formula. Save the result in the variable ss_a.
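
The four steps above can be tried out on a small toy data set before applying them to wm. The sketch below is only an illustration: the data frame toy, its columns group and gain, and the variable names ending in _toy are all made up for this example and are not part of the course data. It also assumes a balanced design, i.e. the same number of subjects in every group.

```r
# Toy data (hypothetical, not the wm data): 3 groups of 4 subjects each
toy <- data.frame(
  group = factor(rep(c("a", "b", "c"), each = 4)),
  gain  = c(1, 2, 3, 2,   3, 4, 5, 4,   5, 6, 7, 6)
)

# Number of subjects in each group; stop if the groups are not equal in size
group_sizes <- table(toy$group)
stopifnot(length(unique(group_sizes)) == 1)
n_toy <- as.numeric(group_sizes[1])

# Group means: apply mean() to gain once for each level of group
y_j_toy <- tapply(toy$gain, toy$group, mean)

# Grand mean: the mean of all observations
y_t_toy <- mean(toy$gain)

# Between group sum of squares: n times the sum of squared deviations
# of the group means from the grand mean
ss_a_toy <- n_toy * sum((y_j_toy - y_t_toy)^2)
ss_a_toy  # 32

# Optional sanity check against R's built-in ANOVA table:
# the first "Sum Sq" entry is the between group sum of squares
ss_check <- anova(lm(gain ~ group, data = toy))[["Sum Sq"]][1]
```

Here the group means are 2, 4, and 6 and the grand mean is 4, so the result is 4 × (4 + 0 + 4) = 32. Comparing against anova() is one way to confirm that a hand calculation matches what R computes.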

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Define number of subjects in each group
n <- ___

# Calculate group means
y_j <- tapply(___, ___, mean)

# Calculate the grand mean
y_t <- ___

# Calculate the sum of squares
ss_a <- ___