1. Subsetting lists
Lists use a different subsetting syntax than dataframes or vectors. By the end of this lesson, you will be able to subset any list, named or unnamed.
2. Let's talk about lists!
Many users find that working with vectors and dataframes is fairly similar to how they interact with data in other programs. Lists are different: a vector holds a single data type and a dataframe needs columns of equal length, but each element of a list can be any kind of object. In this example, the list called L-O holds a dataframe, which contains data about three birds. The next element in the list could be anything; it doesn't have to be another dataframe. We could store a model or a plot here.
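To make that concrete, here is a minimal sketch of a mixed list. The bird values, the column names, and the element names data, model, and notes are all invented for illustration, and lo is only our guess at how the spoken name L-O is written in code.

```r
# Hypothetical bird data; the values are made up for this sketch.
birds <- data.frame(
  species = c("robin", "heron", "wren"),
  wingspan_cm = c(34, 180, 15),
  stringsAsFactors = FALSE
)

# One list holding three different kinds of object:
# a dataframe, a fitted model, and a character vector.
lo <- list(
  data = birds,
  model = lm(wingspan_cm ~ 1, data = birds),
  notes = c("spring survey", "unverified")
)

# Each element keeps its own class.
element_classes <- sapply(lo, function(el) class(el)[1])
print(element_classes)
```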
3. Indexing dataframes and lists
For dataframes, we can index in two ways. First, we can use a pair of square brackets with a comma inside. Information about the row goes on the left side of the comma. In this example, we are indexing the first row. On the right side of the comma is the information about the column. Here we are using the name of one of the columns.
The second way to index dataframes is with the dollar sign and the name of the column.
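Both ways of indexing a dataframe can be sketched like this. The birds dataframe and its columns are hypothetical stand-ins, not the data shown on the slide.

```r
# Hypothetical dataframe standing in for the one on the slide.
birds <- data.frame(
  species = c("robin", "heron", "wren"),
  wingspan_cm = c(34, 180, 15),
  stringsAsFactors = FALSE
)

# Square brackets with a comma: row on the left, column on the right.
first_species <- birds[1, "species"]

# Dollar sign with the column name: returns the whole column as a vector.
all_wingspans <- birds$wingspan_cm

print(first_species)
print(all_wingspans)
```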
4. Indexing dataframes and lists
List indexing is one key place where lists differ from dataframes and vectors. List indexing uses square brackets, just like dataframes and vectors, but in a different way.
First, we can use double square brackets with a number to subset any list. Here we are subsetting out the second element of the L-O list.
Second, if a list is named, we can put the name of the element in the double square brackets to index a particular element. Here we are subsetting out the model element of the L-O list.
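Here is a small sketch of both forms of list indexing. It assumes the list is written lo in code and that its second element is named model; the contents are placeholders built from R's built-in cars dataset. The last lines show one reason double brackets matter: single brackets return a shorter list rather than the element itself.

```r
# Hypothetical two-element list; names and contents are assumptions.
lo <- list(
  data = data.frame(species = c("robin", "heron", "wren")),
  model = lm(dist ~ speed, data = cars)  # cars ships with base R
)

# Double square brackets with a position.
by_position <- lo[[2]]

# Double square brackets with a name (named lists only).
by_name <- lo[["model"]]

# Both routes reach the same object.
same_element <- identical(by_position, by_name)

# Single brackets keep the list wrapper around the element.
still_a_list <- lo[2]
print(class(by_name))
print(class(still_a_list))
```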
5. Calculate something on each element without purrr
Let's walk through another example where we compare for loops and purrr functions to solve a problem. We want to know how many rows are in each element of a list. In this case, each element of our list, survey_data, holds the results of two weeks of frog counts from wetlands on Lake Erie. Each element should have 14 rows, one row for each day in the two-week survey.
To check on the number of rows in each element without purrr, we first create a new dataframe to store our results in. This new dataframe, called df_rows, has two columns: one called names, which contains the names from our list, and one called rows, which is currently empty. This is where we will put the output from our for loop.
Then we write a for loop where each iteration takes the element from the list and puts it into the function nrow(). nrow() counts the number of rows in a dataframe. The result from nrow() is put into the next row of the dataframe, in the rows column. This works fine, but leaves lots of room for typos and is a lot more code than we need.
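The for loop approach might look roughly like the sketch below. The survey_data list here is a made-up stand-in with two sites and invented counts, and the exact loop on the slide may differ.

```r
# Hypothetical stand-in for survey_data: two sites, 14 days each.
survey_data <- list(
  site_a = data.frame(day = 1:14, frogs = rep(3, 14)),
  site_b = data.frame(day = 1:14, frogs = rep(5, 14))
)

# Results dataframe: names filled in, rows left empty for now.
df_rows <- data.frame(
  names = names(survey_data),
  rows = NA_integer_,
  stringsAsFactors = FALSE
)

# Each iteration counts the rows of one element and stores the result.
for (i in seq_along(survey_data)) {
  df_rows[i, "rows"] <- nrow(survey_data[[i]])
}

print(df_rows)
```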
6. Calculate something on each element with purrr
Now let's try to check the number of rows in each element of the survey_data list using the purrr function map().
Remember that survey_data is a list where each element is a dataframe, and each should have exactly 14 rows.
We will be using the map() function to replicate the for loop from the previous slide. The map() function takes two arguments. The first argument is the list object, in this case the survey_data list.
The second argument is the function we want to apply to each element, in this case nrow(). We write the function after the tilde symbol and put dot x inside it to show map() where each element should go.
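Put together, the map() call described here can be sketched as follows. The survey_data list is again a hypothetical stand-in, and row_counts is a name we chose for the result.

```r
library(purrr)

# Hypothetical stand-in for survey_data: two sites, 14 days each.
survey_data <- list(
  site_a = data.frame(day = 1:14, frogs = rep(3, 14)),
  site_b = data.frame(day = 1:14, frogs = rep(5, 14))
)

# First argument: the list. Second argument: the function after the tilde,
# with .x marking where each element is passed in.
row_counts <- map(survey_data, ~ nrow(.x))

print(row_counts)
```

One line of map() replaces both the results dataframe and the loop body from the previous slide.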
One possible issue here is that the output of map() is another list. When we solved this problem with a for loop we got a dataframe. We've got our answer here, but it might not be in the form we want.
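A quick check makes the difference in output visible. This sketch reuses the same hypothetical survey_data. The last step uses base R's unlist() purely to show that the values are all there; it is our own illustration, not the purrr approach the course turns to next.

```r
library(purrr)

# Hypothetical stand-in for survey_data: two sites, 14 days each.
survey_data <- list(
  site_a = data.frame(day = 1:14, frogs = rep(3, 14)),
  site_b = data.frame(day = 1:14, frogs = rep(5, 14))
)

row_counts <- map(survey_data, ~ nrow(.x))

# map() hands back a list, not a dataframe.
print(class(row_counts))
print(is.data.frame(row_counts))

# Base R can flatten the list into a named vector if we need one.
flat_counts <- unlist(row_counts)
print(flat_counts)
```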
7. Let's purrr-actice!
But before we learn how to get different outputs with purrr, let's have you take a swing at map() again.