Code a simple one-variable regression
For the first coding exercise, you'll create a formula to define a one-variable modeling task, and then fit a linear model to the data. You are given the rates of male and female unemployment in the United States over several years (Source).
The task is to predict the rate of female unemployment from the observed rate of male unemployment.
The outcome is female_unemployment, and the input is male_unemployment.
The sign of the variable coefficient tells you whether the outcome increases (+) or decreases (-) as the variable increases.
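As a minimal sketch of how to read that sign, the snippet below fits two models to made-up values. The data frame toy and its columns x, y_up, and y_down are hypothetical and are not part of the course data.

```r
# Made-up illustrative values, NOT the course's unemployment data
toy <- data.frame(
  x      = c(1, 2, 3, 4, 5),
  y_up   = c(3.1, 4.9, 7.2, 8.8, 11.0),   # rises as x rises
  y_down = c(11.0, 8.8, 7.2, 4.9, 3.1)    # falls as x rises
)

# Fit one single-variable linear model per outcome
model_up   <- lm(y_up ~ x, data = toy)
model_down <- lm(y_down ~ x, data = toy)

# coef() returns a named vector; the entry named "x" is the slope
slope_up   <- coef(model_up)["x"]
slope_down <- coef(model_down)["x"]

slope_up > 0     # TRUE: outcome increases with x
slope_down < 0   # TRUE: outcome decreases with x
```

Checking the sign with coef() rather than reading it off the printed model makes the comparison easy to reuse in later code.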
Recall that the calling interface for lm() (docs) is:
lm(formula, data = ___)
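To illustrate this interface, here is a small sketch on a hypothetical data frame. The names toy, hours, score, toy_fmla, and toy_model are assumptions made for illustration and do not come from the exercise.

```r
# Hypothetical toy data; column names and values are illustrative only
toy <- data.frame(
  hours = c(1, 2, 3, 4, 5, 6),
  score = c(52, 55, 61, 64, 70, 73)
)

# A formula puts the outcome on the left of ~ and the input on the right
toy_fmla <- score ~ hours
class(toy_fmla)   # "formula"

# as.formula() builds an equivalent formula from a string
toy_fmla2 <- as.formula("score ~ hours")

# Pass the formula first, then the data frame that holds the columns
toy_model <- lm(toy_fmla, data = toy)

# Printing the model shows the call and the fitted coefficients
print(toy_model)
```

Storing the formula in a variable first, rather than typing it inside the lm() call, keeps the modeling task separate from the fitting step, so the same formula can be reused with other model functions.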
The unemployment data frame has been pre-loaded.
This exercise is part of the course Supervised Learning in R: Regression.
Exercise instructions
- Define a formula that expresses female_unemployment as a function of male_unemployment. Assign the formula to the variable fmla and print it.
- Then use lm() and fmla to fit a linear model to predict female unemployment from male unemployment using the dataset unemployment.
- Print the model. Is the coefficient for male unemployment consistent with what you would expect? Does female unemployment increase as male unemployment does?
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# unemployment is available
summary(unemployment)
# Define a formula to express female_unemployment as a function of male_unemployment
fmla <- ___
# Print it
___
# Use the formula to fit a model: unemployment_model
unemployment_model <- ___
# Print it
___