Inference with and without outlier (randomization)
Using the randomization test, you can again evaluate the effect of an outlier on the inferential conclusions of a linear model. Run a randomization test on the hypdata_out data twice: once with the outlying value and once without it. Note that the extended lines of code clearly communicate the steps of the randomization test.
This exercise is part of the course
Inference for Linear Regression in R
Exercise instructions
Using the data frames hypdata_out (containing an outlier) and hypdata_noout (outlier removed), the data frames perm_slope_out and perm_slope_noout were created to contain the permuted slopes of the original datasets, respectively. The observed values are stored in the variables obs_slope_out and obs_slope_noout.
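The exercise does not show how the permuted-slope data frames were built. As a rough sketch of the idea only, the base R code below permutes the response, refits the model, and stores each slope. The toy data, the column names (explanatory, response, stat), and the names ending in _sim are all assumptions made for illustration; they are not the course's objects or its actual code.

```r
# Toy data standing in for a dataset such as hypdata_out.
# Column names here are assumptions, not taken from the course.
set.seed(42)
toy <- data.frame(explanatory = rnorm(30))
toy$response <- 2 + 0.5 * toy$explanatory + rnorm(30)

# Observed slope from the original pairing of explanatory and response.
obs_slope_sim <- coef(lm(response ~ explanatory, data = toy))[["explanatory"]]

# Shuffle the response to break any association with the explanatory
# variable, refit the model, and keep the slope from each shuffle.
n_reps <- 500
perm_slopes <- replicate(n_reps, {
  shuffled <- sample(toy$response)
  coef(lm(shuffled ~ toy$explanatory))[[2]]
})

# One row per permutation, with the slope in a column called `stat`.
perm_slope_sim <- data.frame(replicate = seq_len(n_reps), stat = perm_slopes)
```

Because the shuffling removes any real relationship, the permuted slopes should scatter around zero, which is what makes them a reference distribution for the observed slope.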
- Find the p-values by computing the proportion of permuted slopes whose absolute value is greater than or equal to the absolute value of the observed slope. As before, use mean on the binary inequality to find the proportion.
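To make the "mean of a binary inequality" idea concrete, here is a small sketch on made-up numbers. The data frame perm_slope_toy, its column stat, and the value obs_slope_toy are hypothetical; the real column name in perm_slope_out is not given in this exercise and may differ.

```r
library(dplyr)

# Eight made-up permuted slopes; the column name `stat` is an assumption.
perm_slope_toy <- tibble(stat = c(-0.8, -0.3, -0.1, 0.0, 0.2, 0.4, 0.6, 0.9))

# Made-up observed slope.
obs_slope_toy <- 0.6

# The comparison yields TRUE/FALSE per row; mean() of a logical vector
# is the proportion of TRUE values, which gives a two-sided p-value.
p_value_toy <- perm_slope_toy %>%
  mutate(abs_perm_slope = abs(stat)) %>%
  summarize(p_value = mean(abs_perm_slope >= abs(obs_slope_toy)))
```

Three of the eight absolute slopes (0.8, 0.6, 0.9) are at least 0.6, so the proportion here is 3/8. Taking absolute values on both sides is what makes the test two-sided.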
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Calculate the p-value with the outlier
perm_slope_out %>%
  mutate(abs_perm_slope = ___) %>%
  summarize(p_value = ___)

# Calculate the p-value without the outlier
perm_slope_noout %>%
  mutate(abs_perm_slope = ___) %>%
  summarize(p_value = ___)