Get startedGet started for free

Exercise 8. Comparing to actual results by pollster

Although the proportion of confidence intervals that include the actual difference between the proportion of voters increases substantially, it is still lower that 0.95. In the next chapter, we learn the reason for this.

To motivate our next exercises, calculate the difference between each poll's estimate \(\bar{d}\) and the actual \(d=0.021\). Stratify this difference, or error, by pollster in a plot.

This exercise is part of the course

HarvardX Data Science Module 4 - Inference and Modeling

View Course

Exercise instructions

  • Define a new variable errors that contains the difference between the estimated difference between the proportion of voters and the actual difference on election day, 0.021.
  • To create the plot of errors by pollster, add a layer with the function geom_point. The aesthetic mappings require a definition of the x-axis and y-axis variables. So the code looks like the example below, but you fill in the variables for x and y.
  • The last line of the example code adjusts the x-axis labels so that they are easier to read.
data %>% ggplot(aes(x = , y = )) +
  geom_point() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# The `polls` object has already been loaded. Examine it using the `head` function.
head(polls)

# Add variable called `error` to the object `polls` that contains the difference between d_hat and the actual difference on election day. Then make a plot of the error stratified by pollster.
Edit and Run Code