Randomizing gender discrimination

Recall that we are considering a situation where the numbers of men and women are fixed (representing the resumes) and the number of people promoted is also fixed (the managers were able to promote only 35 individuals).

In this exercise, you'll create a randomization distribution of the null statistic with 1000 replicates, as opposed to just 5 in the previous exercise. As a reminder, the statistic of interest is the difference in proportions promoted between genders (i.e., the proportion for males minus the proportion for females). From the original dataset, you can calculate how the promotion rates differ between males and females. Using the specify-hypothesize-generate-calculate workflow in the infer package, you can calculate the same statistic, but instead of getting a single number, you get a whole distribution. You'll then compare the single number from the original dataset to the distribution produced by the simulation.
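The workflow described above can be sketched roughly as follows. This is an illustrative sketch, not the course's solution: the `disc` data frame built here is a stand-in whose group sizes (24 men, 24 women) and per-group promotion counts (21 and 14) are assumptions chosen to match the 35 promotions mentioned in the text, and the column values `"promoted"` / `"not_promoted"` are likewise assumed.

```r
library(dplyr)
library(infer)

# Assumed stand-in for `disc`: 24 men (21 promoted), 24 women (14 promoted)
disc <- data.frame(
  sex = factor(rep(c("male", "female"), each = 24),
               levels = c("female", "male")),
  promote = factor(c(rep(c("promoted", "not_promoted"), c(21, 3)),
                     rep(c("promoted", "not_promoted"), c(14, 10))))
)

# Observed statistic: proportion promoted for males minus females
diff_obs <- disc %>%
  specify(promote ~ sex, success = "promoted") %>%
  calculate(stat = "diff in props", order = c("male", "female")) %>%
  pull(stat)

# Null distribution: shuffle the promotion labels 1000 times
set.seed(1)  # arbitrary seed, only for reproducibility
disc_perm <- disc %>%
  specify(promote ~ sex, success = "promoted") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "diff in props", order = c("male", "female"))

# Share of permuted differences at least as large as the observed one
prop_extreme <- mean(disc_perm$stat >= diff_obs)
```

Permuting keeps the numbers of men, women, and promotions fixed in every replicate, which mirrors the setup described at the top of the exercise.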

This exercise is part of the course

Foundations of Inference in R

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Calculate the observed difference in promotion rate
diff_orig <- disc %>%
  # Group by sex
  group_by(___) %>%
  # Summarize to calculate fraction promoted
  ___(prop_prom = ___(___)) %>%
  # Summarize to calculate difference
  ___(stat = ___(___)) %>% 
  pull()
    
# See the result
diff_orig
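
For reference, one possible way to fill in the blanks is sketched below. It is not necessarily the course's intended solution: it assumes `disc` has a `sex` column with levels ordered female then male and a `promote` column with the value `"promoted"`, and it rebuilds a stand-in `disc` with assumed counts (21 of 24 men and 14 of 24 women promoted) so the snippet runs on its own.

```r
library(dplyr)

# Assumed stand-in for `disc`: 24 men (21 promoted), 24 women (14 promoted)
disc <- data.frame(
  sex = factor(rep(c("male", "female"), each = 24),
               levels = c("female", "male")),
  promote = factor(c(rep(c("promoted", "not_promoted"), c(21, 3)),
                     rep(c("promoted", "not_promoted"), c(14, 10))))
)

diff_orig <- disc %>%
  # One row per sex, in factor-level order (female, then male)
  group_by(sex) %>%
  # Fraction promoted within each sex
  summarize(prop_prom = mean(promote == "promoted")) %>%
  # diff() subtracts the first row from the second: male minus female
  summarize(stat = diff(prop_prom)) %>%
  pull()

# With the stand-in counts this is 21/24 - 14/24 = 7/24
diff_orig
```

Because `diff()` depends on row order, the sign of the result follows the level order of `sex`; reversing the levels would give female minus male instead.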