Summarizing opportunity cost (2)
Now that you've created the randomization distribution, you'll use it to assess whether the observed difference in proportions is consistent with the null difference. You will measure this consistency (or lack thereof) with a p-value, which here is the proportion of permuted differences less than or equal to the observed difference.
The permuted dataset and the original observed statistic are available in your workspace as opp_perm and diff_orig, respectively.
You will visualize the null distribution and calculate the p-value using the built-in infer functions visualize() and get_p_value(). Remember that most of the null statistics are above the original difference, so the p-value (which represents how often a null value is at least as extreme as the observed one) is calculated by counting the number of null values that are less than or equal to the original difference.
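The counting described above can be sketched in base R. This is only an illustration under stated assumptions: the group sizes and purchase counts are made up, and the names group, decision, diff_props, null_stats, and obs_diff are hypothetical stand-ins rather than the course's opp_perm and diff_orig objects.

```r
set.seed(1)

# Made-up data for illustration only: 75 people per group.
group <- rep(c("control", "treatment"), each = 75)
decision <- c(rep(c("buy", "not"), c(56, 19)),
              rep(c("buy", "not"), c(41, 34)))

# Difference in proportions who buy: treatment minus control.
diff_props <- function(decision, group) {
  mean(decision[group == "treatment"] == "buy") -
    mean(decision[group == "control"] == "buy")
}

# Observed statistic on the unshuffled data.
obs_diff <- diff_props(decision, group)

# Null statistics: shuffle the group labels and recompute the difference.
null_stats <- replicate(1000, diff_props(decision, sample(group)))

# mean() of a logical vector is the proportion of TRUE values, so this is
# the share of null differences at or below the observed one.
p_value <- mean(null_stats <= obs_diff)
p_value
```

Because the observed difference sits in the lower tail of the shuffled differences, only the null values at or below it count toward the p-value.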
This exercise is part of the course
Foundations of Inference in R
Exercise instructions
- First, use visualize() to plot the distribution of the permuted statistics, marking the place where obs_stat = diff_orig and coloring in the values below it with direction = "less".
- Then, use get_p_value() to calculate the p-value as the proportion of permuted statistics that are less than the observed statistic, by setting obs_stat = diff_orig and direction = "less".
- As an alternative way to calculate the p-value, use summarize() and mean() to find the proportion of times the permuted differences in opp_perm (called stat) are less than or equal to the observed difference (called diff_orig).
- You can test your knowledge by trying direction = "greater", direction = "two_sided", and direction = "less" in both visualize() and get_p_value() before submitting your answer.
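As a rough sketch of how the two p-value routes relate, the example below builds its own small null distribution with infer and compares get_p_value() against summarize() with mean(). The data frame toy, its counts, and the names obs_diff and null_dist are hypothetical and are not the course's objects. It also assumes a reasonably recent infer release, where shading is added with shade_p_value() rather than by passing obs_stat and direction to visualize() as this exercise does.

```r
library(dplyr)
library(infer)

set.seed(1)

# Made-up data for illustration only: 75 people per group.
toy <- tibble(
  group = factor(rep(c("control", "treatment"), each = 75)),
  decision = factor(c(rep(c("buy", "not"), c(56, 19)),
                      rep(c("buy", "not"), c(41, 34))))
)

# Observed difference in proportions (treatment minus control).
obs_diff <- toy %>%
  specify(decision ~ group, success = "buy") %>%
  calculate(stat = "diff in props", order = c("treatment", "control")) %>%
  pull(stat)

# Null distribution from 1000 permutations of the group labels.
null_dist <- toy %>%
  specify(decision ~ group, success = "buy") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "diff in props", order = c("treatment", "control"))

# Route 1: infer's helper, counting null statistics in the lower tail.
p_infer <- null_dist %>%
  get_p_value(obs_stat = obs_diff, direction = "less")

# Route 2: the same proportion computed by hand.
p_manual <- null_dist %>%
  summarize(p_value = mean(stat <= obs_diff))

# Plot with the lower tail shaded (built here, printed on demand).
null_plot <- visualize(null_dist) +
  shade_p_value(obs_stat = obs_diff, direction = "less")
```

Under these assumptions the two routes should report the same proportion, since both count the permuted statistics at or below the observed one. Swapping in direction = "greater" or direction = "two_sided" changes which tail is counted and shaded.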
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Visualize the statistic
opp_perm %>%
  ___(___, ___)

# Calculate the p-value using `get_p_value`
opp_perm %>%
  ___(___, ___)

# Calculate the p-value using `summarize`
opp_perm %>%
  summarize(p_value = ___)