Exercise 5. Confidence interval for d

A much smaller proportion of the polls than expected produces confidence intervals containing \(p\). Notice that most of the intervals that miss \(p\) underestimate it. The likely explanation is that undecided voters historically divide evenly between the two main candidates on election day.

In this case, it is more informative to estimate the spread \(d\), the difference between the two candidates' proportions, which was \(0.482 - 0.461 = 0.021\) for this election.

Assume that there are only two parties and that \(d = 2p - 1\). Construct a 95% confidence interval for the difference in proportions on election night.
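
As a sketch of where these quantities come from under the two-party assumption: if one candidate has proportion \(p\), the other has \(1 - p\), so the spread is \(p - (1 - p)\). The symbols \(\hat{d}\), \(\hat{X}\), and \(N\) below correspond to d_hat, X_hat, and the sample size; \(z_{0.975}\) denotes the 0.975 quantile of the standard normal distribution (what qnorm(0.975) returns), a notation introduced here for illustration.

\[
d = p - (1 - p) = 2p - 1 \quad\Longrightarrow\quad p = \frac{d + 1}{2}
\]

\[
\hat{X} = \frac{\hat{d} + 1}{2}, \qquad
\widehat{\mathrm{SE}}(\hat{d}) = 2\sqrt{\frac{\hat{X}\,(1 - \hat{X})}{N}}
\]

\[
\hat{d} \;\pm\; z_{0.975}\,\widehat{\mathrm{SE}}(\hat{d})
\]

The factor of 2 in the standard error appears because \(\hat{d} = 2\hat{X} - 1\) is a linear function of \(\hat{X}\), so its standard error is twice that of \(\hat{X}\).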

Instructions
  • Use the mutate function to define a new variable called 'd_hat' in polls as the proportion of Clinton voters minus the proportion of Trump voters.
  • Extract the sample size N from the first poll in your subset object polls.
  • Extract the difference in proportions of voters d_hat from the first poll in your subset object polls.
  • Use the formula above to calculate \(p\) from d_hat. Assign \(p\) to the variable X_hat.
  • Find the standard error of the spread given N. Save this as se_hat.
  • Calculate the 95% confidence interval of this estimate of the difference in proportions, d_hat, using the qnorm function.
  • Save the lower and upper bounds of the confidence interval as an object called ci. Save the lower bound first.
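
The calculation steps above can be sketched on a single made-up poll. This is a minimal illustration, not the exercise solution: the values of N and d_hat below are hypothetical and are not taken from polls, whose column names are not shown here.

```r
# Hypothetical single poll (illustrative values, NOT taken from `polls`)
N <- 1000       # sample size
d_hat <- 0.04   # estimated spread between the two candidates

# Invert d = 2p - 1 to recover the estimated proportion
X_hat <- (d_hat + 1) / 2

# Standard error of the spread: twice the standard error of X_hat
se_hat <- 2 * sqrt(X_hat * (1 - X_hat) / N)

# 95% confidence interval for the spread, lower bound first
z <- qnorm(0.975)
ci <- c(d_hat - z * se_hat, d_hat + z * se_hat)
ci
```

In the exercise itself, N and d_hat would instead be read from the first row of polls after d_hat has been added with mutate.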