Summarizing opportunity cost (2)

Now that you've created the randomization distribution, you'll use it to assess whether the observed difference in proportions is consistent with the null difference. You will measure this consistency (or lack thereof) with a p-value: the proportion of permuted differences that are less than or equal to the observed difference.
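As a minimal sketch of that definition, the snippet below computes a p-value by hand in base R. The vector perm_diffs and the value obs_diff are made-up numbers used only for illustration; they are not the course data.

```r
# Hypothetical permuted differences in proportions (invented for illustration)
perm_diffs <- c(-0.25, -0.10, -0.05, 0.00, 0.00, 0.05, 0.10, 0.15, 0.20, 0.30)

# Hypothetical observed difference
obs_diff <- -0.10

# p-value: proportion of permuted differences at or below the observed one.
# The comparison gives a logical vector; mean() turns it into a proportion.
p_value <- mean(perm_diffs <= obs_diff)
p_value  # 0.2, because 2 of the 10 values are <= -0.10
```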

The permuted dataset and the original observed statistic are available in your workspace as opp_perm and diff_orig respectively.

Use the built-in infer functions visualize and get_p_value. Remember that most of the null statistics are above the original difference, so the p-value (which represents how often a null value is at least as extreme as the observed one) is calculated as the proportion of null values that are less than or equal to the original difference.

First, visualize the distribution of the permuted statistics, marking the place where obs_stat = diff_orig and coloring in the values below it with direction = "less".
Then use get_p_value to calculate the p-value as the proportion of permuted statistics that fall in the direction = "less" relative to obs_stat = diff_orig.
As an alternative way to calculate the p-value, use summarize() and mean() to find the proportion of times the permuted differences in opp_perm (called stat) are less than or equal to the observed difference (called diff_orig).
You can test your knowledge by trying direction = "greater", direction = "two_sided", and direction = "less" in both visualize and get_p_value before submitting your answer.
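The steps above can be sketched roughly as follows. Because the course workspace is not available here, the sketch first rebuilds stand-ins for opp_perm and diff_orig. The data frame opportunity, its column names, and the group counts (56 of 75 control and 41 of 75 treatment students buying the DVD) are assumptions made for illustration, as is the way diff_orig is computed. Recent versions of infer add the shading with shade_p_value(), whereas older versions accepted obs_stat and direction directly in visualize(); the sketch uses the newer form.

```r
library(dplyr)
library(infer)

# Assumed stand-in for the course data (counts are an assumption)
opportunity <- data.frame(
  decision = factor(c(rep("buyDVD", 56), rep("nobuyDVD", 19),
                      rep("buyDVD", 41), rep("nobuyDVD", 34))),
  group = factor(rep(c("control", "treatment"), each = 75))
)

# Observed difference in proportions who buy (treatment minus control)
diff_orig <- opportunity %>%
  specify(decision ~ group, success = "buyDVD") %>%
  calculate(stat = "diff in props", order = c("treatment", "control")) %>%
  pull(stat)

# Permuted differences under the null hypothesis of independence
set.seed(1)
opp_perm <- opportunity %>%
  specify(decision ~ group, success = "buyDVD") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "diff in props", order = c("treatment", "control"))

# Visualize the permuted statistics and shade the values at or below diff_orig
opp_perm %>%
  visualize() +
  shade_p_value(obs_stat = diff_orig, direction = "less")

# p-value from infer
p_infer <- opp_perm %>%
  get_p_value(obs_stat = diff_orig, direction = "less")

# Same p-value computed by hand with summarize() and mean()
p_manual <- opp_perm %>%
  summarize(p_value = mean(stat <= diff_orig))

p_infer
p_manual
```

Both approaches count the same permuted statistics, so p_infer and p_manual should agree. Swapping direction = "less" for "greater" or "two_sided" in shade_p_value() and get_p_value() shows how the shaded region and the p-value change.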
