
Exercise 3 - Stratify by Pollster and Grade

Now find the proportion of hits for each pollster. Show only pollsters with at least 5 polls and order them from best to worst. Show the number of polls conducted by each pollster and the FiveThirtyEight grade of each pollster.

This exercise is part of the course

HarvardX Data Science Module 4 - Inference and Modeling


Exercise instructions

  • Create an object called p_hits that contains the proportion of intervals that contain the actual spread using the following steps.
  • Use the mutate function to create a new variable called hit that contains a logical vector indicating whether actual_spread falls between the lower and upper bounds of the confidence interval.
  • Use the group_by function to group the data by pollster.
  • Use the filter function to filter for pollsters that have at least 5 polls.
  • Use the summarize function to compute the proportion of values in hit that are true as a variable called proportion_hits. In the same call, create a variable for the number of polls conducted by each pollster (n) using the n() function, and a variable for the grade of each pollster (grade) by taking the first value of the grade column.
  • Use the arrange function to arrange the proportion_hits in descending order.
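
The steps above can be sketched on a small made-up data frame. This is only an illustration, not the course's reference solution: the toy `ci_data` below is hypothetical and stands in for the real object built in the sample code, and the only columns assumed are `pollster`, `grade`, `lower`, `upper`, and `actual_spread`.

```r
library(dplyr)

# Hypothetical toy data standing in for the exercise's `ci_data`.
# Each interval is 0.06 wide and the actual spread is fixed at 0.02.
lower_bounds <- c(-0.01, 0.00, 0.01, 0.03, -0.05,        # Pollster A (5 polls)
                  -0.02, -0.01, 0.00, 0.01, -0.03, 0.04, # Pollster B (6 polls)
                  0.00, 0.01)                            # Pollster C (2 polls)

ci_data <- tibble(
  pollster = c(rep("Pollster A", 5), rep("Pollster B", 6), rep("Pollster C", 2)),
  grade = c(rep("A", 5), rep("B", 6), rep("C", 2)),
  lower = lower_bounds,
  upper = lower_bounds + 0.06,
  actual_spread = 0.02
)

p_hits <- ci_data %>%
  # TRUE when the interval covers the actual spread
  mutate(hit = lower <= actual_spread & upper >= actual_spread) %>%
  group_by(pollster) %>%
  # keep only pollsters with at least 5 polls (n() counts rows per group)
  filter(n() >= 5) %>%
  # the mean of a logical vector is the proportion of TRUE values
  summarize(proportion_hits = mean(hit), n = n(), grade = grade[1]) %>%
  # best pollsters first
  arrange(desc(proportion_hits))

print(p_hits)
```

Because filter is applied after group_by, the condition n() >= 5 is evaluated per pollster, so Pollster C (2 polls) is dropped before summarizing.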

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# The `cis` data have already been loaded for you
add <- results_us_election_2016 %>%
  mutate(actual_spread = clinton/100 - trump/100) %>%
  select(state, actual_spread)
ci_data <- cis %>%
  mutate(state = as.character(state)) %>%
  left_join(add, by = "state")

# Create an object called `p_hits` that summarizes the proportion of hits for each pollster that has at least 5 polls.