Calculating the p-values

In the video, you learned that a p-value measures the degree of disagreement between the data and the null hypothesis. Here, you will calculate the p-value for the original discrimination dataset as well as the small and big versions, disc_small and disc_big.

The original differences in proportions are available in your workspace as diff_orig, diff_orig_small, and diff_orig_big, as are the permuted datasets disc_perm, disc_perm_small, and disc_perm_big.

Recall that you're only interested in the one-sided hypothesis test here. That is, you're trying to answer the question, "Are men more likely to be promoted than women?"

This exercise is part of the course

Foundations of Inference in R

Exercise instructions

  • Use the built-in infer functions visualize() and get_p_value() on the original permuted dataset, disc_perm, which has observed difference diff_orig. Remember that most of the null statistics fall below the original difference, so the p-value (which represents how often a null value is at least as extreme as the observed one) is the proportion of null values that are at least as large as the original difference.
  • Repeat for the small dataset, disc_perm_small, which has observed difference diff_orig_small.
  • Repeat for the big dataset, disc_perm_big, which has observed difference diff_orig_big.
  • You can test your knowledge by trying out: direction = "greater", direction = "two_sided", and direction = "less" before submitting your answer.
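
To make the three direction options concrete, here is a minimal base-R sketch of how a permutation p-value can be computed by hand. The vector null_stats and the value obs_diff are made-up stand-ins (not the course data) for the permuted statistics and the observed difference, and the two-sided rule shown (twice the smaller tail, capped at 1) is one common convention that is assumed here rather than taken from the exercise.

```r
# Hypothetical null statistics, standing in for the permuted differences
# in proportions (these are NOT the values from disc_perm).
null_stats <- c(-0.30, -0.20, -0.10, -0.10, 0.00, 0.00, 0.10, 0.10, 0.20, 0.30)

# Hypothetical observed difference (NOT the course's diff_orig).
obs_diff <- 0.20

# One-sided, "greater": share of null values at least as large as the observed one.
p_greater <- mean(null_stats >= obs_diff)

# One-sided, "less": share of null values at least as small as the observed one.
p_less <- mean(null_stats <= obs_diff)

# Two-sided (assumed convention): double the smaller tail, capped at 1.
p_two_sided <- min(1, 2 * min(p_greater, p_less))

print(c(greater = p_greater, less = p_less, two_sided = p_two_sided))
```

With these made-up numbers, only two of the ten null values reach the observed difference, so the "greater" p-value is small while the "less" p-value is large. That asymmetry is why the choice of direction matters for the question "Are men more likely to be promoted than women?"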

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Visualize and calculate the p-value for the original dataset
disc_perm %>%
  ___(obs_stat = ___, direction = "___")

disc_perm %>%
  ___(___, ___)

# Visualize and calculate the p-value for the small dataset
___ %>%
  ___(___, ___)

___ %>%
  ___(___, ___)

# Visualize and calculate the p-value for the big dataset
___ %>%
  ___(___, ___)

___ %>%
  ___(___, ___)