Exercise 16 - Calculate the P-value
The confidence interval tells us there is a relatively strong pollster effect, resulting in a difference of about 5%. Random variability does not seem to explain it.
Compute a p-value to relay the fact that chance does not explain the observed pollster effect.
This exercise is part of the course
HarvardX Data Science Module 4 - Inference and Modeling
Exercise instructions
- Use the `pnorm` function to calculate the probability that a random value is larger than the observed ratio of the estimate to the standard error.
- Multiply the probability by 2, because this is the two-tailed test.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
library(dplyr)  # provides %>%, group_by(), summarize(), and n()

# We made an object `res` to summarize the average, standard deviation, and number of polls for the two pollsters.
res <- polls %>% group_by(pollster) %>%
  summarize(avg = mean(spread), s = sd(spread), N = n())
# The variables `estimate` and `se_hat` contain the estimated difference in average spread between the two pollsters and its standard error, respectively.
estimate <- res$avg[2] - res$avg[1]
se_hat <- sqrt(res$s[2]^2/res$N[2] + res$s[1]^2/res$N[1])
# Calculate the p-value
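
One way to follow the two instructions is sketched below. The numeric values assigned to `estimate` and `se_hat` are hypothetical stand-ins chosen only so the snippet runs on its own; they are not results from the `polls` data. The name `z` is also an assumption introduced here for readability.

```r
# Hypothetical values standing in for the `estimate` and `se_hat` computed above;
# they are illustrative only, not results from the polls data.
estimate <- 0.05
se_hat <- 0.0125

# Ratio of the estimate to its standard error
# (a z-statistic under the normal approximation).
z <- estimate / se_hat

# Probability that a standard normal value is larger than |z|,
# multiplied by 2 for the two-tailed test.
p_value <- 2 * (1 - pnorm(abs(z), 0, 1))
p_value
```

In the exercise itself, the two hypothetical assignments would be dropped so that `estimate` and `se_hat` come from `res`. Taking `abs(z)` is a small safeguard in case the difference is negative. Writing `2 * pnorm(abs(z), lower.tail = FALSE)` should give the same result and is less prone to rounding when the ratio is very large.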