Exercise 15 - Calculate the 95% Confidence Interval of the Spreads
We have constructed a random variable that has expected value \(b_2 - b_1\), the difference in pollster bias. If our model holds, this random variable has an approximately normal distribution. Its standard error depends on \(\sigma_1\) and \(\sigma_2\), which are unknown, but we can estimate them with the sample standard deviations we computed earlier. We now have everything we need to answer our initial question: is \(b_2 - b_1\) different from 0?
Construct a 95% confidence interval for the difference between \(b_2\) and \(b_1\). Does this interval contain zero?
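Written out, and assuming the two pollsters' polls are independent, the estimate, its standard error, and the interval take roughly the following form. The symbols \(\bar{Y}_i\), \(s_i\), and \(N_i\) are labels introduced here (they are not defined in this exercise) for pollster \(i\)'s average spread, sample standard deviation, and number of polls.

\[
\hat{d} = \bar{Y}_2 - \bar{Y}_1, \qquad
\widehat{\mathrm{SE}} = \sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}, \qquad
\hat{d} \pm z_{0.975}\,\widehat{\mathrm{SE}}
\]

Here \(z_{0.975} \approx 1.96\) is the value returned by `qnorm(0.975)`. If the resulting interval excludes 0, the data are inconsistent with \(b_2 - b_1 = 0\) at the 5% level.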
This exercise is part of the course
HarvardX Data Science Module 4 - Inference and Modeling
Exercise instructions
- Use pipes (`%>%`) to pass the data `polls` on to functions that will group by pollster and summarize the average spread, standard deviation, and number of polls per pollster.
- Calculate the estimate by subtracting the average spreads. Save this estimate to a variable called `estimate`.
- Calculate the standard error using the standard deviations of the spreads and the sample size. Save this value to a variable called `se_hat`.
- Calculate the 95% confidence intervals using the `qnorm` function. Save the lower and then the upper confidence interval to a variable called `ci`.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# The `polls` data have already been loaded for you. Use the `head` function to examine them.
head(polls)
# Create an object called `res` that summarizes the average, standard deviation, and number of polls for the two pollsters.
# Store the difference between the larger average and the smaller in a variable called `estimate`. Print this value to the console.
# Store the standard error of the estimates as a variable called `se_hat`. Print this value to the console.
# Calculate the 95% confidence interval of the spreads. Save the lower and then the upper confidence interval to a variable called `ci`.
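One way the steps above might be completed is sketched below. It is not the course's reference solution. So that it runs on its own, it replaces the preloaded `polls` data with a small made-up data frame; the column names `pollster` and `spread`, the pollster labels, and the summary column names `avg`, `s`, and `N` are all assumptions for illustration.

```r
library(dplyr)

# Hypothetical stand-in for the preloaded `polls` data (values are made up).
# Assumed columns: `pollster` (name) and `spread` (one value per poll).
polls <- data.frame(
  pollster = rep(c("Pollster A", "Pollster B"), times = c(5, 4)),
  spread   = c(0.01, 0.03, 0.02, 0.00, 0.04, 0.05, 0.06, 0.04, 0.07)
)

# Average spread, standard deviation, and number of polls per pollster
res <- polls %>%
  group_by(pollster) %>%
  summarize(avg = mean(spread), s = sd(spread), N = n())

# Difference between the larger and the smaller average spread
estimate <- max(res$avg) - min(res$avg)
estimate

# Standard error of the difference between two independent sample means
se_hat <- sqrt(res$s[1]^2 / res$N[1] + res$s[2]^2 / res$N[2])
se_hat

# 95% confidence interval: lower bound first, then upper bound
ci <- c(estimate - qnorm(0.975) * se_hat, estimate + qnorm(0.975) * se_hat)
ci
```

Using `max` and `min` for `estimate` keeps the difference positive regardless of the row order that `summarize` returns. With the real `polls` data, only the block that builds the stand-in data frame would be dropped.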