Exercise

Randomizing gender discrimination

Recall that we are considering a situation where the numbers of men and women are fixed (representing the resumes) and the number of people promoted is fixed (the managers were able to promote only 35 individuals).

In this exercise, you'll create a randomization distribution of the null statistic with 1000 replicates, as opposed to just 5 in the previous exercise. As a reminder, the statistic of interest is the difference in proportions promoted between genders (i.e. the proportion for males minus the proportion for females). From the original dataset, you can calculate how the promotion rates differ between males and females. Using the specify-hypothesize-generate-calculate workflow in infer, you can calculate the same statistic on each permuted replicate, so instead of getting a single number, you get a whole distribution. In this exercise, you'll compare that single number from the original dataset to the distribution made by the simulation.
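The workflow described above could look roughly like the following sketch. It assumes a data frame named disc with factor columns sex and promote; the toy data built here is a hypothetical stand-in (24 resumes per sex, with 21 males and 14 females promoted, which is consistent with the 35 promotions mentioned above but is not taken from this page). The name disc_perm and the seed are also illustrative choices.

```r
library(dplyr)
library(infer)

# Hypothetical stand-in for the course dataset (an assumption, not the real data):
# 24 resumes per sex; 14 females and 21 males promoted, 35 in total.
disc <- data.frame(
  sex = factor(rep(c("female", "male"), each = 24)),
  promote = factor(c(
    rep(c("promoted", "not_promoted"), times = c(14, 10)),
    rep(c("promoted", "not_promoted"), times = c(21, 3))
  ))
)

set.seed(42)  # arbitrary seed so the permutations are reproducible

disc_perm <- disc %>%
  # Response is promote, explanatory variable is sex
  specify(promote ~ sex, success = "promoted") %>%
  # Null hypothesis: promotion is independent of sex
  hypothesize(null = "independence") %>%
  # Shuffle the promote labels 1000 times
  generate(reps = 1000, type = "permute") %>%
  # Difference in proportions for each replicate, male minus female
  calculate(stat = "diff in props", order = c("male", "female"))
```

Each row of disc_perm holds one permuted difference in proportions, so the stat column is the randomization distribution that the observed difference gets compared against.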

Instructions 1/3
  • Calculate the observed difference in promotion rate.
    • Group by sex.
    • Calculate the fraction promoted for each sex by summarizing on the mean() of promote == "promoted". Call the summary variable prop_prom.
    • Calculate the difference in fractions between sexes by summarizing again, setting stat to the diff() of prop_prom.
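
The three steps above can be sketched as follows. This is a minimal example, not necessarily the course's reference solution. The data frame disc is rebuilt here as a hypothetical stand-in (24 resumes per sex, 21 males and 14 females promoted, matching the 35 promotions mentioned earlier), and diff_orig is an illustrative name.

```r
library(dplyr)

# Hypothetical stand-in for the course dataset (an assumption, not the real data)
disc <- data.frame(
  sex = factor(rep(c("female", "male"), each = 24), levels = c("female", "male")),
  promote = c(
    rep(c("promoted", "not_promoted"), times = c(14, 10)),
    rep(c("promoted", "not_promoted"), times = c(21, 3))
  )
)

diff_orig <- disc %>%
  # Group by sex
  group_by(sex) %>%
  # Fraction promoted within each sex: the mean of a logical vector
  summarize(prop_prom = mean(promote == "promoted")) %>%
  # diff() subtracts the first row (female) from the second (male)
  summarize(stat = diff(prop_prom))

diff_orig
```

Because the groups come out in factor-level order (female, then male), diff() returns the male proportion minus the female proportion. With the assumed counts, that is 21/24 - 14/24 = 7/24, roughly 0.29.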