Explaining the anova function

In the last exercise we saw that neither the assumption of normality nor the assumption of homogeneity of variance was met. Usually this would mean performing a non-parametric test instead. These tests will be illustrated next week. In this lab, however, we will continue with an analysis of variance and interpret the output as though our assumptions were met.

In R you can use two functions to perform an analysis of variance: the aov() function and the lm() function. The two functions differ very little; the main difference is the output that each one produces. The aov() function produces the more traditional anova output and may seem more familiar if you are coming from a statistical software package like SPSS.
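To make the difference in output concrete, here is a minimal sketch that fits the same one-way model with both functions. It uses the built-in PlantGrowth data set rather than the course data, and the object names fit_a, fit_l, f_aov and f_lm are illustrative only.

```r
# PlantGrowth ships with R (datasets package): plant weight by treatment group.
fit_a <- aov(weight ~ group, data = PlantGrowth)
fit_l <- lm(weight ~ group, data = PlantGrowth)

# aov(): the classic anova table (Df, Sum Sq, Mean Sq, F value, Pr(>F))
print(summary(fit_a))

# lm(): a coefficient table, with the overall F-statistic on the last line
print(summary(fit_l))

# Pull the overall F statistic out of each summary and compare them
f_aov <- summary(fit_a)[[1]][["F value"]][1]
f_lm  <- unname(summary(fit_l)$fstatistic["value"])
print(all.equal(f_aov, f_lm))
```

Both summaries describe the same fitted model, so the overall F test should agree; only the layout of the printed output differs.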

This exercise is part of the course

Inferential Statistics

Exercise instructions

  • Perform an anova using the aov() function with genre as the independent variable and song duration as the dependent variable. If y is your dependent variable and x is your independent variable, you could perform an anova like so: aov(y ~ x). Store the result of your anova in a variable called fit_aov. Note that our data is available in the song_data data frame.
  • Use the summary() function on the fit_aov variable and print the output to the console. You can just provide your fit_aov object as the argument to the summary() function.
  • Perform an anova using the lm() function with genre as the independent variable and song duration as the dependent variable. Store the result in a variable called fit_lm.
  • Use the summary() function on the fit_lm variable and print the output to the console.
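The steps above can be sketched on made-up data. The toy_songs data frame and its column names genre and duration are assumptions for illustration; the columns in song_data may be named differently.

```r
# Hypothetical toy data standing in for song_data (column names are assumed).
set.seed(1)
toy_songs <- data.frame(
  genre    = factor(rep(c("pop", "rock", "jazz"), each = 10)),
  duration = c(rnorm(10, 210, 20), rnorm(10, 240, 20), rnorm(10, 300, 20))
)

# Dependent variable on the left of ~, independent variable on the right
fit_aov <- aov(duration ~ genre, data = toy_songs)
print(summary(fit_aov))

# Same formula, fitted with lm()
fit_lm <- lm(duration ~ genre, data = toy_songs)
print(summary(fit_lm))
```

Passing the data frame through the data argument lets the formula refer to columns by name, which keeps the call close to the aov(y ~ x) pattern.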

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# use the aov function and store the result in fit_aov


# use the summary function on the object fit_aov


# use the lm function and store the result in fit_lm


# use the summary function on the object fit_lm