Inference with and without outlier (randomization)
Using the randomization test, you can again evaluate the effect of an outlier on the inferential conclusions of a linear model. Run a randomization test on the hypdata_out data twice: once with the outlying value and once without it. Note that the extended lines of code clearly communicate the steps of the randomization test.
This exercise is part of the course
Inference for Linear Regression in R
Instructions
Using the data frames hypdata_out (containing an outlier) and hypdata_noout (outlier removed), the data frames perm_slope_out and perm_slope_noout were created to contain the permuted slopes of the original datasets, respectively. The observed values are stored in the variables obs_slope_out and obs_slope_noout.
- Find the p-values by computing the proportion of permuted slopes whose absolute value (abs) is larger than or equal to the absolute value of the observed slope. As before, use mean on the logical inequality to find the proportion.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Calculate the p-value with the outlier
perm_slope_out %>%
  mutate(abs_perm_slope = ___) %>%
  summarize(p_value = ___)

# Calculate the p-value without the outlier
perm_slope_noout %>%
  mutate(abs_perm_slope = ___) %>%
  summarize(p_value = ___)
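
The two-sided p-value calculation described in the instructions can be illustrated on toy data. This is a minimal sketch, not the exercise's own data: perm_slope_toy, obs_slope_toy, and the column name stat are hypothetical stand-ins, and the permuted slopes are simulated from a normal distribution rather than produced by an actual permutation of a dataset.

```r
library(dplyr)

set.seed(42)

# Hypothetical stand-in for a data frame of permuted slopes.
# The column name `stat` is an assumption made for this sketch.
perm_slope_toy <- tibble(stat = rnorm(1000, mean = 0, sd = 0.5))

# Hypothetical observed slope (not taken from the exercise data)
obs_slope_toy <- 1.2

p_value_toy <- perm_slope_toy %>%
  # Absolute value of each permuted slope
  mutate(abs_perm_slope = abs(stat)) %>%
  # The mean of a logical vector is the proportion of TRUE values,
  # here the share of permuted slopes at least as extreme as the observed one
  summarize(p_value = mean(abs_perm_slope >= abs(obs_slope_toy))) %>%
  pull(p_value)

p_value_toy
```

Taking absolute values on both sides of the inequality makes the test two-sided: a permuted slope counts as extreme whether it is strongly positive or strongly negative.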