Exercise

Two-way Anova (4)

The inclusion of multiple variables in an Anova allows you to examine another interesting phenomenon: the interaction effect. If there is an interaction between two factor variables, it means that the effect of either factor on the response variable is not the same at each category of the other factor. The easiest way to understand an interaction effect at play is to visualize the effect.

The visualization above contains three parts. The top-left plot shows the main effect of genre on song duration. As you can see, the pop and hip-hop genres are fairly similar to each other, but classical songs have a much longer duration on average. This is where the significance of the genre variable comes from. The bottom-left plot shows the main effect of continent. As you can see, North American songs are on average longer than European songs. However, if you look at the rightmost plot, you see that this pattern holds for classical and hip-hop songs but not for pop songs: European pop songs are on average longer than North American pop songs. This is exactly what is meant by the statement that the effect of either factor on the response variable is not the same at each category of the other factor.
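
To make the idea concrete, here is a minimal sketch with invented numbers. The cell means below are not taken from song_data; they are made up only to mimic the pattern described above, and the names cell_means, continent_effect and long are hypothetical.

```r
# Invented mean durations in seconds (NOT real data), one value per
# genre/continent combination.
cell_means <- matrix(
  c(300, 280,   # classical: North America, Europe
    215, 200,   # hip-hop
    205, 225),  # pop
  nrow = 3, byrow = TRUE,
  dimnames = list(genre = c("classical", "hip-hop", "pop"),
                  continent = c("North America", "Europe"))
)

# Effect of continent within each genre: North America minus Europe.
# Without an interaction this difference would be the same in every genre.
continent_effect <- cell_means[, "North America"] - cell_means[, "Europe"]
print(continent_effect)

# Reshape to long format and draw one line per continent across genres.
long <- as.data.frame(as.table(cell_means), responseName = "duration")
with(long, interaction.plot(genre, continent, duration))
```

In this toy example the difference is positive for classical and hip-hop and negative for pop, so the two lines in the plot cross. Non-parallel lines like these are the usual visual sign of an interaction.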

In R, you include an interaction term in your model by putting a colon between your first and second variables, like so:

independent_variable1:independent_variable2

You would then add this term to the model formula in your aov() call, like so:

aov(dependent_variable ~ independent_variable1 + independent_variable2 + independent_variable1:independent_variable2)
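
As a hedged illustration of this syntax, the sketch below fits a model with an interaction term on simulated data. The data frame toy and its columns factor_a, factor_b and response are hypothetical and exist only for this example; the data argument tells aov() where to find the variables. The second fit uses the * operator, which R's formula syntax expands to both main effects plus their interaction.

```r
set.seed(1)

# Simulated, balanced toy data (NOT song_data): 2 x 3 design, 10 rows per cell.
toy <- expand.grid(
  factor_a  = c("a1", "a2"),
  factor_b  = c("b1", "b2", "b3"),
  replicate = 1:10
)
toy$response <- rnorm(nrow(toy), mean = 10)

# Shift one cell only, so the effect of factor_a depends on factor_b.
shifted <- toy$factor_a == "a2" & toy$factor_b == "b3"
toy$response[shifted] <- toy$response[shifted] + 3

# Explicit form: two main effects plus the interaction term.
fit_explicit <- aov(response ~ factor_a + factor_b + factor_a:factor_b,
                    data = toy)

# Shorthand: factor_a * factor_b expands to the same three terms.
fit_shorthand <- aov(response ~ factor_a * factor_b, data = toy)

summary(fit_explicit)
all.equal(coef(fit_explicit), coef(fit_shorthand))
```

The summary table has one row per term, including a factor_a:factor_b row for the interaction, followed by the residuals. Both formulas describe the same model, so the estimated coefficients match.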
Instructions
  • For the current exercise, all the data is available in the data frame song_data. Conduct a two-way Anova using the aov() function with an interaction between the variables genre and continent, and store the model in the object two_way_fit.
  • Call the summary() function on your two_way_fit object to print the output to the console.