Get startedGet started for free

Exercise 16 - Calculate the P-value

The confidence interval tells us there is relatively strong pollster effect resulting in a difference of about 5%. Random variability does not seem to explain it.

Compute a p-value to relay the fact that chance does not explain the observed pollster effect.

This exercise is part of the course

HarvardX Data Science Module 4 - Inference and Modeling

View Course

Exercise instructions

  • Use the pnorm function to calculate the probability that a random value is larger than the observed ratio of the estimate to the standard error.
  • Multiply the probability by 2, because this is the two-tailed test.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# We made an object `res` to summarize the average, standard deviation, and number of polls for the two pollsters.
res <- polls %>% group_by(pollster) %>% 
  summarize(avg = mean(spread), s = sd(spread), N = n()) 

# The variables `estimate` and `se_hat` contain the spread estimates and standard error, respectively.
estimate <- res$avg[2] - res$avg[1]
se_hat <- sqrt(res$s[2]^2/res$N[2] + res$s[1]^2/res$N[1])

# Calculate the p-value
Edit and Run Code