Performing linear regression on each nested dataset
Now that you've divided the data for each country into a separate dataset in the data column, you need to fit a linear model to each of these datasets.
The map() function from purrr applies a function to each item in a list. The function can be written as a one-sided formula, in which . represents the individual item. For example, you could add one to each number in a list:
map(numbers, ~ 1 + .)
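As a quick illustration of the formula shorthand, here is a minimal sketch. It assumes a hypothetical list called numbers (the exercise does not define one) and compares the formula version with an ordinary anonymous function.

```r
library(purrr)

# Hypothetical input; the exercise does not define `numbers`
numbers <- list(1, 2, 3)

# Formula shorthand: `.` stands for the current element
plus_one <- map(numbers, ~ 1 + .)

# The same operation written as an explicit anonymous function
plus_one_fn <- map(numbers, function(x) 1 + x)

# Check whether both calls produce the same list
identical(plus_one, plus_one_fn)
```

Note that map() always returns a list, one element per input item; map_dbl() would return a numeric vector instead.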
This means that to fit a model to each dataset, you can do:
map(data, ~ lm(percent_yes ~ year, data = .))
where . represents each individual item from the data column in by_year_country. Recall that each item in the data column is a dataset that pertains to a specific country.
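To see the pattern outside the course data, the sketch below uses a small made-up data frame (toy, with invented values) and base R's split() to build a list of per-country datasets. The names toy, datasets, models, and slopes are assumptions for illustration only, not objects from the exercise.

```r
library(purrr)

# Made-up stand-in for the course data: two countries, four years each
toy <- data.frame(
  country     = rep(c("A", "B"), each = 4),
  year        = rep(c(1950, 1960, 1970, 1980), times = 2),
  percent_yes = c(0.50, 0.55, 0.60, 0.65, 0.80, 0.70, 0.60, 0.50)
)

# One data frame per country, similar in spirit to the nested data column
datasets <- split(toy, toy$country)

# Fit a separate linear model to each dataset; `.` is the current data frame
models <- map(datasets, ~ lm(percent_yes ~ year, data = .))

# Pull out the fitted slope on year for each country
slopes <- map_dbl(models, ~ coef(.)[["year"]])
slopes
```

In the exercise itself, the list of datasets is the data column created by nesting, so the same map() call goes inside mutate().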
This exercise is part of the course
Case Study: Exploratory Data Analysis in R
Exercise instructions
- Load the tidyr and purrr packages.
- After nesting, use the map() function within a mutate() to perform a linear regression on each dataset (i.e. each item in the data column in by_year_country), modeling percent_yes as a function of year. Save the results to the model column.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Load tidyr and purrr
# Perform a linear regression on each item in the data column
by_year_country %>%
  nest(-country)