
Null sampling distribution of the slope

In the previous chapter, you investigated the sampling distribution of the slope from a population where the slope was non-zero. Typically, however, to do inference, you will need to know the sampling distribution of the slope under the hypothesis that there is no relationship between the explanatory and response variables. Additionally, in most situations, you don't know the population from which the data came, so the null sampling distribution must be derived from only the original dataset.
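One common way to build a null distribution from a single dataset is permutation: shuffling the response values breaks any link with the explanatory variable, so the slopes refit on shuffled data show what "no relationship" looks like. The sketch below uses base R only, not the course's infer workflow, and the data (`x`, `y`) are simulated stand-ins rather than values from any real study.

```r
# Hypothetical toy data (simulated, for illustration only)
set.seed(42)
n <- 30
x <- rnorm(n, mean = 100, sd = 15)
y <- 10 + 0.9 * x + rnorm(n, sd = 8)

# Observed slope from the original pairing of x and y
obs_slope <- unname(coef(lm(y ~ x))["x"])

# Shuffle y to break any x-y association, refit, and keep the slope;
# repeating this many times approximates the null sampling distribution
n_reps <- 1000
null_slopes <- replicate(n_reps, {
  y_perm <- sample(y)
  unname(coef(lm(y_perm ~ x))["x"])
})

# The permuted slopes should be centered near zero
mean(null_slopes)

# Proportion of permuted slopes at least as extreme as the observed slope
mean(abs(null_slopes) >= abs(obs_slope))
```

Comparing `obs_slope` against the spread of `null_slopes` indicates how unusual the observed slope would be if there were truly no relationship.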

In the mid-20th century, a study tracked down identical twins who had been separated at birth: one child was raised in the home of their biological parents and the other in a foster home. To investigate whether intelligence is the result of nature or nurture, both children were given IQ tests. The resulting data gives the IQs of the foster twins (Foster, the response variable) and the IQs of the biological twins (Biological, the explanatory variable).

In this exercise you'll use the pull() function. This function takes a data frame and returns a selected column as a vector (similar to $).
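As a small illustration of how pull() compares with $, here is a sketch on a made-up data frame. The name `scores` and its values are hypothetical, and the example assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical data frame with made-up values, for illustration only
scores <- data.frame(
  id    = c("a", "b", "c"),
  score = c(10, 20, 30)
)

# pull() returns the column as a plain vector, not a one-column data frame
via_pull <- scores %>% pull(score)

# The base R equivalent using $
via_dollar <- scores$score

# Both approaches yield the same vector
identical(via_pull, via_dollar)
```

Because pull() takes the data frame as its first argument, it fits at the end of a pipeline, where $ would be awkward to use.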

This exercise is part of the course

Inference for Linear Regression in R


Hands-on interactive exercise

Try this exercise by completing this sample code.

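library(infer)
# Assumption: dplyr and broom may not already be loaded in your session
library(dplyr)  # provides filter() and pull()
library(broom)  # provides tidy()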

# Calculate the observed slope
# Run a linear regression of Foster vs. Biological on the twins data
obs_slope <- ___(___, ___) %>%
  # Tidy the result
  ___() %>%
  # Filter for rows where term equals Biological
  ___(___) %>%
  # Pull out the estimate column
  ___(___)

# See the result
obs_slope