Testing Assumptions I
Before we run a regression analysis we need to check whether several assumptions are met. A lot of these assumptions involve residuals.
Have a look at the graph. The circles represent data points: how much someone actually liked you after you gave them a specific amount of money. The line going through the graph is the regression line, which gives the estimated amount that someone liked you after you gave them money, based on the regression equation. The vertical distance between a data point and the line (the observed value minus the predicted value) is known as the residual, and it's the values of these residuals that we use to test assumptions.
One way to get these would be to calculate them manually. Fortunately, it's not the 1800s anymore, and we can make R do this for us! In the same way as we used $ to pull a named column out of a data frame (e.g. data$variable1 would give you the variable named variable1), we can use this indexing to obtain the values of the residuals from our regression model that we stored as mod1.
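To make the $ idea concrete, here is a minimal sketch. The data frame data, its values, and the model name toy_mod are made up for illustration and are not part of the exercise. It shows that $ works the same way on a data frame and on a fitted model, and that names() lists the components you can ask for.

```r
# Hypothetical data frame, invented purely for illustration
data <- data.frame(variable1 = c(4, 8, 15), variable2 = c(16, 23, 42))

# $ pulls out the column named variable1
data$variable1

# A fitted model stores its results as named components too
toy_mod <- lm(variable2 ~ variable1, data = data)

# names() lists those components, including "coefficients" and "residuals"
names(toy_mod)

# $ extracts a component by name, just like a column
toy_mod$coefficients
```

If you are ever unsure what a model object contains, calling names() on it is a quick way to find out before reaching for $.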
This exercise is part of the course
Inferential Statistics
Exercise instructions
- lm() automatically calculates the residuals for your model, which are then saved in mod1.
- In your script, add code to obtain the residuals (named residuals) from mod1 using $ and assign these to resmod1.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Vector containing the amount of money you gave participants
money <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
# Vector containing the amount the participants liked you
liking <- c(2.2, 2.8, 4.5, 3.1, 8.7, 5.0, 4.5, 8.8, 9.0, 9.2)
# Assign regression model to variable "mod1"
mod1 <- lm(liking ~ money)
# Obtain the residuals from mod1 using $, assign to "resmod1"
resmod1 <-
# Print the residuals
resmod1
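For reference, one way the completed script might look is sketched below, together with a quick sanity check. The check is an addition, not part of the exercise: it assumes the usual definition of a residual as the observed value minus the fitted value, and the name manual is hypothetical.

```r
# Data and model from the exercise
money <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
liking <- c(2.2, 2.8, 4.5, 3.1, 8.7, 5.0, 4.5, 8.8, 9.0, 9.2)
mod1 <- lm(liking ~ money)

# Obtain the residuals from mod1 using $
resmod1 <- mod1$residuals

# Sanity check (an assumption added here): observed minus fitted values
manual <- liking - mod1$fitted.values

# Compare the two up to floating-point tolerance
all.equal(resmod1, manual)

# The extractor function residuals() should give the same values
all.equal(resmod1, residuals(mod1))
```

Both comparisons should come back TRUE, which confirms that the residuals stored in mod1 are simply the gaps between each data point and the regression line.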