HarvardX Data Science Module 4 - Inference and Modeling

Exercise 1. Confidence interval for p

For the following exercises, we will use actual poll data from the 2016 election. The exercises will contain pre-loaded data from the dslabs package.

library(dslabs)
data("polls_us_election_2016")

We will use all the national polls that ended in the last few weeks before the election.

Assume there are only two candidates and construct a 95% confidence interval for the election night proportion \(p\).
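
As a reminder of the standard result this exercise relies on: assuming the poll is a random sample of size \(N\) and the sample proportion \(\hat{X}\) is approximately normal, a 95% confidence interval for \(p\) can be written as follows. The symbol \(z_{0.975}\) is notation introduced here for the 97.5th percentile of the standard normal distribution, which is what qnorm(0.975) returns (about 1.96).

\[
\hat{X} \pm z_{0.975} \sqrt{\frac{\hat{X}\,(1 - \hat{X})}{N}}
\]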

Instructions

  • Use filter to subset the data set for the poll data you want. Include polls that ended on or after October 31, 2016 (enddate). Only include national polls, that is, polls whose state column equals "U.S.". Call this filtered object polls.
  • Use nrow to make sure you created a filtered object polls that contains the correct number of rows.
  • Extract the sample size N from the first poll in your subset object polls.
  • Convert the percentage of Clinton voters (rawpoll_clinton) from the first poll in polls to a proportion, X_hat. Print this value to the console.
  • Find the standard error of X_hat given N. Print this result to the console.
  • Calculate the 95% confidence interval of this estimate using the qnorm function.
  • Save the lower and upper bounds of the confidence interval as an object called ci. Save the lower bound first.
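
The steps above can be sketched in R roughly as follows. This is one possible approach, not the official solution: it assumes the dplyr package is available for filter and the pipe, that national polls are marked by state == "U.S." in the data, and that the sample size is stored in the samplesize column. The name se_hat is a helper chosen here for illustration and is not required by the exercise.

```r
library(dplyr)   # assumed to be installed; provides filter() and %>%
library(dslabs)
data("polls_us_election_2016")

# Keep national polls (state == "U.S.") that ended on or after October 31, 2016
polls <- polls_us_election_2016 %>%
  filter(enddate >= as.Date("2016-10-31") & state == "U.S.")

# Check how many polls remain after filtering
nrow(polls)

# Sample size of the first poll in the filtered data
N <- polls$samplesize[1]

# Clinton's share in the first poll, converted from a percentage to a proportion
X_hat <- polls$rawpoll_clinton[1] / 100
X_hat

# Estimated standard error of X_hat (se_hat is an illustrative name)
se_hat <- sqrt(X_hat * (1 - X_hat) / N)
se_hat

# 95% confidence interval, lower bound first
ci <- c(X_hat - qnorm(0.975) * se_hat, X_hat + qnorm(0.975) * se_hat)
ci
```

Using qnorm(0.975) rather than a hard-coded 1.96 keeps the interval tied to the stated confidence level, so a different level only requires changing that one argument.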