Exercise 8. Comparing to actual results by pollster
Although the proportion of confidence intervals that include the actual difference between the proportions of voters increases substantially, it is still lower than 0.95. In the next chapter, we learn the reason for this.
To motivate our next exercises, calculate the difference between each poll's estimate \(\bar{d}\) and the actual \(d=0.021\). Stratify this difference, or error, by pollster in a plot.
This exercise is part of the course
HarvardX Data Science Module 4 - Inference and Modeling
Exercise instructions
- Define a new variable `errors` that contains the difference between the estimated difference between the proportion of voters and the actual difference on election day, 0.021.
- To create the plot of errors by pollster, add a layer with the function `geom_point`. The aesthetic mappings require a definition of the x-axis and y-axis variables. So the code looks like the example below, but you fill in the variables for x and y.
- The last line of the example code adjusts the x-axis labels so that they are easier to read.
data %>% ggplot(aes(x = , y = )) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# The `polls` object has already been loaded. Examine it using the `head` function.
head(polls)
# Add variable called `error` to the object `polls` that contains the difference between d_hat and the actual difference on election day. Then make a plot of the error stratified by pollster.
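
One possible way to complete the sample code is sketched below. It is not the official solution. Because the preloaded `polls` object is not available outside the exercise, the sketch builds a small stand-in data frame with made-up values; the pollster names and `d_hat` numbers are purely hypothetical. It assumes that `polls` has a `pollster` column and a `d_hat` column, as the comment above suggests, and it names the new column `error` to match that comment (the instructions call it `errors`).

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the preloaded `polls` object.
# The pollster names and d_hat values are invented for illustration only.
polls <- data.frame(
  pollster = c("Pollster A", "Pollster A", "Pollster B", "Pollster B"),
  d_hat = c(0.04, 0.03, 0.01, -0.01)
)

# Actual difference on election day, as given in the exercise text
d <- 0.021

# Add the error: each poll's estimated difference minus the actual difference
polls <- polls %>% mutate(error = d_hat - d)

# Plot the error stratified by pollster, rotating the x-axis labels
p <- polls %>%
  ggplot(aes(x = pollster, y = error)) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
print(p)
```

In the exercise itself, the stand-in data frame would be dropped and the `mutate` and `ggplot` steps applied directly to the preloaded `polls`.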