Inference with and without outlier (randomization)

Using the randomization test, you can again evaluate the effect of an outlier on the inferential conclusions of a linear model. Run the randomization test twice: once on the data that includes the outlying value (hypdata_out) and once on the data without it (hypdata_noout). Note that the longer, step-by-step code makes each stage of the randomization test explicit.

This exercise is part of the course

Inference for Linear Regression in R


Instructions

Using the data frames hypdata_out (containing an outlier) and hypdata_noout (outlier removed), the data frames perm_slope_out and perm_slope_noout were created to contain the permuted slopes of the original datasets, respectively. The observed slopes are stored in the variables obs_slope_out and obs_slope_noout.
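For orientation, here is a minimal base R sketch of how a data frame of permuted slopes might be built. It is not the course's own code: the toy data, the names toy_data, toy_obs_slope and toy_perm_slope, the column name stat, and the choice of 500 permutations are all assumptions made for illustration.

```r
set.seed(1)  # reproducible toy example

# Hypothetical toy data; this is NOT the course's hypdata_out
toy_data <- data.frame(explanatory = rnorm(30))
toy_data$response <- 0.5 * toy_data$explanatory + rnorm(30)

# Observed slope from the least-squares fit
toy_obs_slope <- coef(lm(response ~ explanatory, data = toy_data))[["explanatory"]]

# Shuffle the response, refit the model, and keep the slope; repeat 500 times.
# The column name `stat` is an assumption, not taken from the exercise.
toy_perm_slope <- data.frame(
  stat = replicate(500, {
    shuffled <- sample(toy_data$response)
    coef(lm(shuffled ~ toy_data$explanatory))[[2]]
  })
)

head(toy_perm_slope)
```

Shuffling the response breaks any association with the explanatory variable, so the resulting slopes approximate what the slope would look like if the null hypothesis of no linear relationship were true.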

Find the p-values by computing the proportion of permuted slopes whose absolute value is greater than or equal to the absolute value of the observed slope. As before, use mean on the binary inequality to find the proportion.
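The calculation described above can be sketched on made-up numbers. This is an illustration under stated assumptions, not the exercise solution: demo_perm_slope, demo_obs_slope and the column name stat are hypothetical stand-ins, and the slopes are simulated rather than taken from hypdata_out or hypdata_noout.

```r
library(dplyr)

set.seed(42)  # reproducible toy example

# Hypothetical stand-ins: 500 simulated "permuted slopes" in an assumed
# column `stat`, plus one made-up observed slope
demo_perm_slope <- data.frame(stat = rnorm(500, mean = 0, sd = 1))
demo_obs_slope <- 1.8

demo_p_value <- demo_perm_slope %>%
  # Absolute value of each permuted slope (two-sided comparison)
  mutate(abs_perm_slope = abs(stat)) %>%
  # The mean of a logical vector is the proportion of TRUE values
  summarize(p_value = mean(abs_perm_slope >= abs(demo_obs_slope)))

demo_p_value
```

Taking absolute values on both sides makes the test two-sided: a permuted slope counts as at least as extreme as the observed one whether it is strongly positive or strongly negative.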

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Calculate the p-value with the outlier
perm_slope_out %>% 
  mutate(abs_perm_slope = ___) %>%
  summarize(p_value = ___)

# Calculate the p-value without the outlier
perm_slope_noout %>% 
  mutate(abs_perm_slope = ___) %>%
  summarize(p_value = ___)
