Randomizing gender discrimination

Recall that we are considering a situation where the number of men and women are fixed (representing the resumes) and the number of people promoted is fixed (the managers were able to promote only 35 individuals).

In this exercise, you'll create a randomization distribution of the null statistic with 1000 replicates as opposed to just 5 in the previous exercise. As a reminder, the statistic of interest is the difference in proportions promoted between genders (i.e. proportion for males minus proportion for females). From the original dataset, you can calculate how the promotion rates differ between males and females. Using the specify-hypothesis-generate-calculate workflow in infer, you can calculate the same statistic, but instead of getting a single number, you get a whole distribution. In this exercise, you'll compare that single number from the original dataset to the distribution made by the simulation.

This exercise is part of the course

Foundations of Inference in R

View Course

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Calculate the observed difference in promotion rate
diff_orig <- disc %>%
  # Group by sex
  group_by(___) %>%
  # Summarize to calculate fraction promoted
  ___(prop_prom = ___(___)) %>%
  # Summarize to calculate difference
  ___(stat = ___(___)) %>% 
  pull()
    
# See the result
diff_orig

Edit and Run Code