
sapply

1. sapply

2. lapply()

Before, I talked about how lapply can be used to apply a function over each and every element of a list or a vector. The key thing about lapply is that its output is always a list. That's because an R function can return any R object, and the class of the object it returns can differ depending on the input. When lapply applies such a function over all elements of an input list or vector, it needs a list to store the results, because a list is able to contain heterogeneous content. However, you can think of many cases where the function returns the same type of object over and over.
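As a small illustration of why lapply needs a list, here is a sketch with a hypothetical helper, double_or_shout, which is not part of the course and is only assumed here to show a return type that depends on the input:

```r
# Hypothetical helper (not from the video): the type of the result
# depends on the type of the input.
double_or_shout <- function(x) {
  if (is.numeric(x)) x * 2 else toupper(x)
}

mixed <- list(1:3, "paris", 10)
result <- lapply(mixed, double_or_shout)

# lapply always hands back a list, so numeric and character results
# can sit side by side.
str(result)
```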

3. Cities: lapply()

Remember the cities example from the previous video? We had a vector, cities, and we used lapply with the nchar function. As a result, we obtained a list with the number of characters in each city's name. But wait! These values could very well fit into a simple vector as well! They all have the same type!
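For reference, the call might look like the sketch below. The contents of cities are an assumption (the transcript does not show the slide), chosen to match the six cities mentioned later:

```r
# Assumed contents of the cities vector (six city names)
cities <- c("New York", "Paris", "London", "Tokyo",
            "Rio de Janeiro", "Cape Town")

# lapply returns a list with one element per city,
# each holding a single integer
city_lengths <- lapply(cities, nchar)
str(city_lengths)
```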

4. Cities: lapply()

We already tried to solve this by using the unlist function to convert the list to a vector. But behold! There's an easier way to tackle the case in which all the results have the same type

5. Cities: sapply()

by using the sapply function. It's short for 'simplify apply'. Awesome, right? The result is a named vector, which contains the same information as the vector we obtained earlier using unlist and lapply together. Under the hood, something slightly more complex is going on: sapply calls lapply to apply the nchar function over each element of the cities vector, and then uses the simplify2array function to convert the list that lapply generated to an array. In our case, every result has length one, so sapply managed to convert the result to a one-dimensional array, which is a vector. It's pretty awesome to see how R takes care of all this for us! On top of all that, sapply even found a sensible way of naming this vector: it used the city names themselves.
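The comparison between the two approaches can be sketched as follows. The contents of cities are assumed, and the explicit simplify2array call is only meant to mimic the two steps described above, not to reproduce sapply's source exactly:

```r
# Assumed contents of the cities vector
cities <- c("New York", "Paris", "London", "Tokyo",
            "Rio de Janeiro", "Cape Town")

# The earlier approach: apply, then flatten the list by hand
with_unlist <- unlist(lapply(cities, nchar))

# The shortcut: sapply simplifies for us and names the result
with_sapply <- sapply(cities, nchar)

# Roughly what happens under the hood: lapply first, then simplify
manual <- simplify2array(lapply(cities, nchar))

with_unlist  # 8 5 6 5 14 9, without names
with_sapply  # the same values, named by city
```

Apart from the names, both approaches produce the same values.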

6. Cities: sapply()

You can choose not to name the output of sapply by setting its USE.NAMES argument to FALSE. Voilà, the city names are gone! Remember that USE.NAMES is TRUE by default.
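A minimal sketch of the difference, again assuming the contents of cities:

```r
# Assumed contents of the cities vector
cities <- c("New York", "Paris", "London", "Tokyo",
            "Rio de Janeiro", "Cape Town")

# Default behaviour: USE.NAMES = TRUE, so the result is named
named <- sapply(cities, nchar)

# Switch the naming off explicitly
unnamed <- sapply(cities, nchar, USE.NAMES = FALSE)

names(named)    # the six city names
names(unnamed)  # NULL
```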

7. Cities: sapply()

Now, what would happen if the function you want to apply over the input returns a vector containing two values instead of one each time? Let's find out with another example. The function first_and_last, which I've written for you, splits a string into its letters and then returns the 'minimum' and 'maximum' letter according to alphabetical order. If we call this function on the character string "New York", it returns a vector containing "e" and "Y". These are precisely the first and last letters of "New York" when its letters are ordered alphabetically.
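The function body is not shown in the transcript, so the version below is a plausible reconstruction rather than the original. It assumes spaces are dropped before splitting and that the two results are named first and last:

```r
# Reconstructed sketch of first_and_last (the body is an assumption)
first_and_last <- function(name) {
  name <- gsub(" ", "", name)                  # drop spaces
  chars <- strsplit(name, split = "")[[1]]     # split into single letters
  c(first = min(chars), last = max(chars))     # alphabetical extremes
}

# The transcript reports "e" and "Y" for this call. min() and max() on
# strings follow the locale's collation order, so the exact letters can
# differ in other locales (for example the C locale).
first_and_last("New York")
```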

8. Cities: sapply()

We can now apply this function to every city name in cities. Instead of a vector, we now obtain a matrix with 2 rows and 6 columns. Can you see how the output is generated? Each column holds the two values returned for one city. Notice that, once again, sapply automatically assigns meaningful names to the columns and rows.
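Here is a sketch of that call. Both the contents of cities and the body of first_and_last are assumptions, since the slides are not part of the transcript:

```r
# Assumed contents of the cities vector
cities <- c("New York", "Paris", "London", "Tokyo",
            "Rio de Janeiro", "Cape Town")

# Reconstructed sketch of first_and_last (the body is an assumption)
first_and_last <- function(name) {
  name <- gsub(" ", "", name)
  chars <- strsplit(name, split = "")[[1]]
  c(first = min(chars), last = max(chars))
}

# Every call returns a vector of length 2, so sapply binds the
# results column by column into a 2 x 6 character matrix
first_last_matrix <- sapply(cities, first_and_last)

dim(first_last_matrix)       # 2 6
rownames(first_last_matrix)  # "first" "last"
colnames(first_last_matrix)  # the six city names
```

The row names come from the names of the vector the function returns, and the column names come from the input vector.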

9. Unable to simplify?

Both of my previous examples show the power of the sapply function to simplify the output of lapply, but what if this simplification is not possible? There are cases in which the function you want to apply does not always return a vector of the same length. For these cases, simplification to a vector or a matrix just doesn't make sense. How does sapply respond to that? I've defined a function, unique_letters, that returns a vector of all the distinct letters used inside a character string. If we try this function on the character string "London", we get a vector containing the unique letters in "London": "L", "o", "n" and "d".
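As with first_and_last, the body of unique_letters is not shown, so the following is a reconstruction under the assumption that spaces are removed before the string is split:

```r
# Reconstructed sketch of unique_letters (the body is an assumption)
unique_letters <- function(name) {
  name <- gsub(" ", "", name)                  # drop spaces
  chars <- strsplit(name, split = "")[[1]]     # split into single letters
  unique(chars)                                # keep first occurrences only
}

unique_letters("London")  # "L" "o" "n" "d"
```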

10. Unable to simplify?

Let us first see how lapply behaves when we use unique_letters on the cities variable. As expected, we get a list containing vectors of single letters. We also see that the vectors have varying lengths, so trying to simplify this list could lead to pretty strange results. Let's see how sapply handles this. The result is the same as that of the lapply function: we get a list of vectors, because R couldn't think of a meaningful way of simplifying the list of vectors. It was only able to give some meaningful names to the results. The fact that sapply simplifies when possible is quite handy, but it can also lead to problems. When writing a program, you might expect the result of a call to sapply() to be a vector when in fact it's still a list, because simplification didn't work out!
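The behaviour described above can be checked with a short sketch. The contents of cities and the body of unique_letters are assumptions:

```r
# Assumed contents of the cities vector
cities <- c("New York", "Paris", "London", "Tokyo",
            "Rio de Janeiro", "Cape Town")

# Reconstructed sketch of unique_letters (the body is an assumption)
unique_letters <- function(name) {
  name <- gsub(" ", "", name)
  chars <- strsplit(name, split = "")[[1]]
  unique(chars)
}

from_lapply <- lapply(cities, unique_letters)
from_sapply <- sapply(cities, unique_letters)

# The results have different lengths, so sapply cannot simplify
lengths(from_lapply)

# sapply falls back to a list; only the names were added
is.list(from_sapply)                            # TRUE
identical(unname(from_sapply), from_lapply)     # TRUE
```

A defensive check such as is.list() on the result is one way to notice that simplification did not happen.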

11. Let's practice!

To solve this, one can also use the R function vapply, which we'll discuss in our next video. Before you learn about it, head over to the interactive exercises to see how your apply skills are progressing!