
Null sampling distribution of the slope

In the previous chapter, you investigated the sampling distribution of the slope from a population where the slope was non-zero. Typically, however, to do inference, you will need to know the sampling distribution of the slope under the hypothesis that there is no relationship between the explanatory and response variables. Additionally, in most situations, you don't know the population from which the data came, so the null sampling distribution must be derived from only the original dataset.
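One common way to build a null sampling distribution from a single dataset is to permute the response values, which breaks any link to the explanatory variable, and recompute the slope each time. The sketch below does this with the infer package. The data frame `toy_twins` and its values are made up for illustration and are not the course's twins data; the seed and the number of replicates are arbitrary choices.

```r
library(infer)
library(dplyr)

# Hypothetical stand-in data: values are invented for illustration only
toy_twins <- tibble(
  Biological = c(82, 90, 91, 97, 99, 104, 106, 110, 115, 125),
  Foster     = c(80, 93, 88, 99, 95, 108, 101, 112, 110, 128)
)

set.seed(4747)  # arbitrary seed so the shuffles are reproducible

# Shuffle Foster relative to Biological, then compute the slope per shuffle
null_slopes <- toy_twins %>%
  specify(Foster ~ Biological) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 500, type = "permute") %>%
  calculate(stat = "slope")

# One permuted slope per replicate; under the null they scatter around zero
null_slopes %>%
  summarize(mean_slope = mean(stat), sd_slope = sd(stat))
```

Each row of `null_slopes` is a slope that could plausibly arise if the two variables were unrelated, so an observed slope can later be compared against this collection.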

In the mid-20th century, a study tracked down identical twins who were separated at birth: one child was raised in the home of their biological parents and the other in a foster home. In an attempt to answer the question of whether intelligence is the result of nature or nurture, both children were given IQ tests. The resulting data record the IQs of the twins raised in foster homes (Foster, the response variable) and the IQs of the twins raised by their biological parents (Biological, the explanatory variable).

In this exercise you'll use the pull() function. This function takes a data frame and returns a selected column as a vector (similar to $).
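As a quick illustration of how pull() compares with $, the sketch below uses a small made-up data frame (`scores`, with hypothetical columns `name` and `score`).

```r
library(dplyr)

# Hypothetical data frame, invented for illustration
scores <- tibble(
  name  = c("a", "b", "c"),
  score = c(1.5, 2.5, 3.5)
)

# Both forms return the score column as a plain numeric vector
from_pull   <- scores %>% pull(score)
from_dollar <- scores$score

# pull() slots into the end of a pipeline, e.g. after filtering rows
b_score <- scores %>%
  filter(name == "b") %>%
  pull(score)
```

The practical difference is that pull() can sit at the end of a chain of piped steps, whereas $ needs the data frame to be stored or wrapped in parentheses first.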

This exercise is part of the course

Inference for Linear Regression in R


Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

library(infer)
library(dplyr)  # filter() and pull()
library(broom)  # tidy()

# Calculate the observed slope
# Run a linear regression of Foster vs. Biological on the twins data
obs_slope <- ___(___, ___) %>%
  # Tidy the result
  ___() %>%   
  # Filter for rows where term equals Biological
  ___(___) %>%
  # Pull out the estimate column
  ___(___) 

# See the result
obs_slope
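
For reference, one possible way to fill in the blanks is sketched below. Because the course's twins data is not reproduced here, the sketch runs on a hypothetical stand-in, `twins_demo`, whose values are made up; the result is therefore not the slope for the real data.

```r
library(dplyr)
library(broom)

# Hypothetical stand-in for the twins data: values are invented
twins_demo <- tibble(
  Biological = c(82, 90, 91, 97, 99, 104, 106, 110, 115, 125),
  Foster     = c(80, 93, 88, 99, 95, 108, 101, 112, 110, 128)
)

# Fit the regression, tidy the coefficients, keep the slope row,
# and extract its estimate as a single number
obs_slope_demo <- lm(Foster ~ Biological, data = twins_demo) %>%
  tidy() %>%
  filter(term == "Biological") %>%
  pull(estimate)

# See the result
obs_slope_demo
```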