Exercise

Calculating the p-values

In the video, you learned that a p-value measures the degree of disagreement between the data and the null hypothesis. Here, you will calculate the p-value for the original discrimination dataset as well as the small and big versions, disc_small and disc_big.

The original differences in proportions are available in your workspace as diff_orig, diff_orig_small, and diff_orig_big, along with the permuted datasets disc_perm, disc_perm_small, and disc_perm_big.

Recall that you're only interested in the one-sided hypothesis test here. That is, you're trying to answer the question, "Are men more likely to be promoted than women?"

Instructions

100 XP
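  • Use the built-in infer functions visualize() and get_p_value() to plot the null distribution in disc_perm and calculate the p-value for the original dataset, which has observed difference diff_orig. Remember that most of the null statistics are below the original difference, so the p-value (which represents how often a null value is at least as extreme as the observed one) is the proportion of null values that are greater than or equal to the original difference.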
  • Repeat for the small dataset, disc_perm_small, which has observed difference diff_orig_small.
  • Repeat for the big dataset, disc_perm_big, which has observed difference diff_orig_big.
  • You can test your knowledge by trying out: direction = "greater", direction = "two_sided", and direction = "less" before submitting your answer.
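
The steps above might look roughly like the sketch below. It is not the course solution: the workspace objects are not shown here, so the sketch rebuilds stand-ins for disc_perm and diff_orig from a hypothetical 48-row dataset (the counts, the column names sex and promote, the seed, and the 1000 permutations are all assumptions made for illustration). It also assumes a version of infer that provides shade_p_value().

```r
library(dplyr)
library(infer)

set.seed(42)  # arbitrary seed so the permutations are reproducible

# Hypothetical stand-in data (assumed counts, not the course's dataset):
# 21 of 24 men promoted, 14 of 24 women promoted.
disc <- data.frame(
  sex = factor(rep(c("male", "female"), each = 24)),
  promote = factor(c(
    rep(c("promoted", "not_promoted"), times = c(21, 3)),
    rep(c("promoted", "not_promoted"), times = c(14, 10))
  ))
)

# Observed difference in proportions (male minus female).
diff_orig <- disc %>%
  specify(promote ~ sex, success = "promoted") %>%
  calculate(stat = "diff in props", order = c("male", "female")) %>%
  pull(stat)

# Null distribution: shuffle the labels and recompute the difference.
disc_perm <- disc %>%
  specify(promote ~ sex, success = "promoted") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "diff in props", order = c("male", "female"))

# Plot the null distribution and shade the region at or beyond diff_orig.
disc_perm %>%
  visualize() +
  shade_p_value(obs_stat = diff_orig, direction = "greater")

# One-sided p-value: the share of null statistics at least as large
# as the observed difference.
p_greater <- disc_perm %>%
  get_p_value(obs_stat = diff_orig, direction = "greater")
p_greater
```

In the exercise itself, only the last two steps are needed, since disc_perm and diff_orig already exist in the workspace. Swapping in disc_perm_small with diff_orig_small, or disc_perm_big with diff_orig_big, covers the other two bullets, and changing direction shows how the shaded region and the p-value respond.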