Explaining the anova function
In the last exercise we saw that neither the assumption of normality nor the assumption of homogeneity of variance was met. Usually this would mean that we would perform a non-parametric test. These tests will be illustrated next week. However, in this lab we will continue with an analysis of variance and interpret the output as though our assumptions were met.
In R you can use two functions to perform an analysis of variance: the aov() function and the lm() function. The two functions differ very little; the main difference is the output that each produces. The aov() function produces the more traditional anova output and may seem more familiar if you are coming from a statistical software package like SPSS.
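To make the difference in output concrete, here is a minimal sketch that fits the same one-way model with both functions. It uses PlantGrowth, a dataset that ships with base R, as a stand-in (it is not the course data), and the names fit_a, fit_l, f_aov and f_lm are chosen only for this illustration.

```r
# PlantGrowth is built into R: weight is numeric, group is a factor with 3 levels.
# It stands in for the course data purely for illustration.
fit_a <- aov(weight ~ group, data = PlantGrowth)
fit_l <- lm(weight ~ group, data = PlantGrowth)

# aov(): the traditional table with Df, Sum Sq, Mean Sq, F value and Pr(>F)
print(summary(fit_a))

# lm(): a coefficient table, with the overall F-statistic reported at the bottom
print(summary(fit_l))

# Pull the overall F value out of each summary to compare them directly
f_aov <- summary(fit_a)[[1]][["F value"]][1]
f_lm  <- unname(summary(fit_l)$fstatistic["value"])
print(all.equal(f_aov, f_lm))

# anova() applied to the lm fit gives the traditional table as well
print(anova(fit_l))
```

Both summaries report the same overall F test; only the layout around it differs.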
This exercise is part of the course
Inferential Statistics
Exercise instructions
- Perform an anova using the aov() function with genre as the independent variable and song duration as the dependent variable. If y is your dependent variable and x is your independent variable, you could perform an anova like so: aov(y ~ x). Store the result of your anova in a variable called fit_aov. Note that our data is available in the song_data dataframe.
- Use the summary() function on the fit_aov variable and print the output to the console. You can just provide your fit_aov object as the argument to the summary() function.
- Do an anova using the lm() function with genre as the independent variable and song duration as the dependent variable. Store the result in a variable called fit_lm.
- Use the summary() function on the variable fit_lm and print the output to the console.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# use the aov function and store the result in fit_aov
# use the summary function on the object fit_aov
# use the lm function and store the result in fit_lm
# use the summary function on the object fit_lm
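One possible way to work through the four steps is sketched below. Because the real song_data dataframe is not shown here, the sketch builds a small simulated stand-in called song_demo; that name, the column names genre and duration, and the simulated values are all assumptions made for illustration. In the exercise itself you would use song_data and its actual column names.

```r
# Hypothetical stand-in for song_data; the real columns may be named differently
set.seed(1)
song_demo <- data.frame(
  genre    = factor(rep(c("pop", "rock", "jazz"), each = 20)),
  duration = c(rnorm(20, 210, 30), rnorm(20, 240, 30), rnorm(20, 300, 30))
)

# use the aov function and store the result in fit_aov
# (formula form: dependent ~ independent)
fit_aov <- aov(duration ~ genre, data = song_demo)

# use the summary function on the object fit_aov
print(summary(fit_aov))

# use the lm function and store the result in fit_lm
fit_lm <- lm(duration ~ genre, data = song_demo)

# use the summary function on the object fit_lm
print(summary(fit_lm))
```

Passing the dataframe through the data argument keeps the formula short; writing the columns out in full inside the formula should give the same fit.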