Step-by-step through the permutation

To help you understand the code used to create the randomization distribution, this exercise will walk you through the steps of the infer framework. In particular, you'll see how differences in the generated replicates affect the calculated statistics.

After running the infer steps, notice that the counts and the calculated statistic differ slightly from one replicate to the next.
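To make the idea of permuting concrete before using infer, here is a minimal base R sketch. It assumes a small made-up data frame, `toy`, standing in for `disc` (it is not the course data), and a hypothetical helper, `diff_in_props()`. Each replicate shuffles `promote` while leaving `sex` fixed, then recomputes the difference in proportions.

```r
# Hypothetical toy data standing in for `disc` (not the course data set):
# 6 male and 6 female files, each marked promoted or not_promoted.
toy <- data.frame(
  sex     = rep(c("male", "female"), each = 6),
  promote = c(rep("promoted", 5), "not_promoted",
              rep("promoted", 3), rep("not_promoted", 3))
)

# Hypothetical helper: proportion promoted among males minus
# proportion promoted among females
diff_in_props <- function(promote, sex) {
  mean(promote[sex == "male"] == "promoted") -
    mean(promote[sex == "female"] == "promoted")
}

# Statistic in the toy data as given: 5/6 - 3/6 = 1/3
observed <- diff_in_props(toy$promote, toy$sex)

set.seed(1)  # arbitrary seed so the shuffles are repeatable

# Five replicates: shuffle promote, keep sex fixed, recompute the statistic
perm_stats <- replicate(5, diff_in_props(sample(toy$promote), toy$sex))

observed
perm_stats
```

Because `sample()` reorders the `promote` labels at random, the five values in `perm_stats` will generally differ from one another and from `observed`, which is the replicate-to-replicate variation this exercise asks you to look for.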

This exercise is part of the course

Foundations of Inference in R

Exercise instructions

The dplyr and infer packages have been loaded for you, along with the disc data frame from the last exercise.

  • Run the first three infer steps: specify(), hypothesize(), and generate(). The code has been written for you; your job is to inspect the result.
  • To see the effect of permuting,
    • group the permuted data frame, disc_perm, by the new replicate variable, then
    • count the combinations of the variables of interest (promote within each sex) using count().
  • Using disc_perm, calculate() the statistic of interest. Set stat to "diff in props" and order to c("male", "female").

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Replicate the entire data frame, permuting the promote variable
disc_perm <- disc %>%
  specify(promote ~ sex, success = "promoted") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 5, type = "permute")

disc_perm %>%
  # Group by replicate
  ___ %>%
  # Count per group
  ___

disc_perm %>%
  # Calculate difference in proportion, male then female
  ___
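
For reference, one possible way to fill in the blanks is sketched below. It is not necessarily the course's official solution. Since `disc` is only available inside the exercise, the sketch assumes a small made-up data frame, `disc_toy`, with the same two columns; the names `disc_toy`, `disc_toy_perm`, `perm_counts`, and `perm_stats` are hypothetical.

```r
library(dplyr)
library(infer)

# Hypothetical stand-in for `disc`; the real data frame comes from the course
disc_toy <- data.frame(
  sex     = factor(rep(c("male", "female"), each = 6)),
  promote = factor(c(rep("promoted", 5), "not_promoted",
                     rep("promoted", 3), rep("not_promoted", 3)))
)

# Replicate the entire data frame five times, permuting promote each time
disc_toy_perm <- disc_toy %>%
  specify(promote ~ sex, success = "promoted") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 5, type = "permute")

# Counts of promote within each sex, one set of counts per replicate
perm_counts <- disc_toy_perm %>%
  group_by(replicate) %>%
  count(promote, sex)

# One difference in proportions (male minus female) per replicate
perm_stats <- disc_toy_perm %>%
  calculate(stat = "diff in props", order = c("male", "female"))

perm_counts
perm_stats
```

Comparing `perm_counts` across replicates shows how the shuffled counts shift, and `perm_stats` holds one statistic per replicate, so the two outputs together show how changes in the counts translate into changes in the statistic.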