1. For loop
The for loop is somewhat different from the while loop. Have a look at this 'recipe'.
2. for loop
This can be read as: for each var (a variable) in seq (a sequence), execute the expressions. Make sense? Let's see how this actually works with an example. Suppose you have a vector, cities, containing the names of a number of cities.
We can simply print the cities vector to the console.
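As a minimal sketch, the cities vector could be created and printed like this. Only "New York", "Paris", "London" and the total of six elements are mentioned in this video; the last three city names are assumptions added for illustration.

```r
# Only the first three names and the length (6) come from the video;
# the last three names are assumed for illustration.
cities <- c("New York", "Paris", "London",
            "Tokyo", "Rio de Janeiro", "Cape Town")

# Printing the whole vector shows all elements at once
print(cities)
```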
But suppose we want to have a different printout for every element in the vector. We can accomplish this using a for loop.
3. for loop
Let's start from the recipe and convert it to a functional for loop step by step.
4. for loop
Inside the parentheses, we write 'city in cities', meaning that we want to execute the code in the expression block for every city in the cities vector.
5. for loop
For starters, we'll replace the expression with a simple print statement.
How does R handle this code? At the start of the loop, R evaluates the seq element, which is cities in our case. It sees that this is a vector containing 6 elements. Next, R stores the first element of this sequence in the variable city, so city now equals "New York".
6. for loop
Then, the expression, print(city), is executed, printing out "New York" to the console.
7. for loop
After the execution, R stores the second element of the cities vector, "Paris", in city and re-runs the code. This process repeats until all cities in the cities vector have been iterated over.
8. for loop
The final result looks like this: for each city, a separate printout was done.
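The loop built up in the previous steps can be sketched as follows. The last three city names are assumed for illustration; only the first three and the vector length of 6 come from the video.

```r
# Last three names are assumptions, not taken from the video
cities <- c("New York", "Paris", "London",
            "Tokyo", "Rio de Janeiro", "Cape Town")

# On each iteration, city holds the next element of cities
for (city in cities) {
  print(city)
}
# [1] "New York"
# [1] "Paris"
# [1] "London"
# [1] "Tokyo"
# [1] "Rio de Janeiro"
# [1] "Cape Town"
```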
9. for loop over list
The for loop does not only work on vectors: it also works with lists, for example. Suppose that cities is a list instead of a vector:
The exact same for loop as we've been using before can be used for lists, and the result is exactly the same.
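A small sketch of the list version, again assuming the last three city names for illustration:

```r
# Same cities, now stored in a list instead of a vector
# (last three names are assumptions, not taken from the video)
cities <- list("New York", "Paris", "London",
               "Tokyo", "Rio de Janeiro", "Cape Town")

# The loop itself is unchanged: city is each list element in turn
for (city in cities) {
  print(city)
}
```

Each element is handed to the loop body already extracted from the list, so print(city) shows a plain character string just as it does for the vector.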
So there's no need to worry about the difference between subsetting vectors and lists, because the for loop does this for us. I would encourage you to try the for loop with different data structures as well, such as matrices and data frames. I won't go into detail on these in this video.
Instead, I want to talk about two control statements for loops. The first one is break, and the second one is next.
10. break statement
The break statement is one that you already know: just like in the while loop, break in a for loop simply stops the execution of the code and abandons the for loop altogether. Suppose we want to leave the for loop as soon as we encounter a city whose name consists of 6 characters. We can use the nchar function, which stands for number of characters, inside an if statement for this:
How will R deal with this code? Well, for the first city in the cities vector, "New York", the nchar condition is FALSE, so "New York" still gets printed to the console. The same happens for "Paris". But in the third iteration, when city is equal to "London", the nchar condition is TRUE, causing the for loop to break. Since the break statement comes before the print command, the character string "London" is not printed to the console.
11. break statement
If we run the code, we see that indeed, only "New York" and "Paris" get printed to the console, after which the for loop is abandoned.
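The break example described here can be sketched like this. The last three city names are assumed for illustration; they are never reached, because the loop stops at "London".

```r
# Last three names are assumptions, not taken from the video
cities <- c("New York", "Paris", "London",
            "Tokyo", "Rio de Janeiro", "Cape Town")

for (city in cities) {
  # Leave the loop entirely at the first 6-character name
  if (nchar(city) == 6) {
    break
  }
  print(city)
}
# [1] "New York"
# [1] "Paris"
```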
12. next statement
The next statement also alters the flow of your for loop, but does so in a slightly different way. Let's see what happens if we replace the break statement with the next statement and execute the entire loop again.
All city names except for "London" get printed to the console. How could this happen? The next statement skips the remainder of the code inside the for loop and proceeds to the next iteration. So as soon as next is encountered, the print(city) part is not processed and the loop continues with the following city. Of course, it is perfectly possible to use both break and next in the same for loop.
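A sketch of the same loop with next in place of break. The last three city names are assumptions, chosen so that none of them has exactly 6 characters.

```r
# Last three names are assumptions, not taken from the video
cities <- c("New York", "Paris", "London",
            "Tokyo", "Rio de Janeiro", "Cape Town")

for (city in cities) {
  # Skip the rest of this iteration for 6-character names
  if (nchar(city) == 6) {
    next
  }
  print(city)
}
# [1] "New York"
# [1] "Paris"
# [1] "Tokyo"
# [1] "Rio de Janeiro"
# [1] "Cape Town"
```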
13. for loop: v2
Before you can have some more looping fun in the exercises, I want to talk about another way we can loop over different data structures. Let's return to the basic for loop that prints the city names that are stored in a vector.
Suppose that instead of simply printing out the city's name, we also want to give information on the city's position inside the vector. We can't do this with the current construct, because it doesn't give us access to the so-called looping index. This index is a counter that R uses behind the scenes to know which element to select on every iteration. In the first iteration, the looping index is 1, and the first element of the cities vector is selected.
But what if we want to use this looping index ourselves? There's no way for us to access it. Fortunately, we can easily solve this. Instead of iterating over the cities, we can manually create a looping index ourselves. Let's start with changing the looping details.
14. for loop: v2
Now, we let i progress from 1 to the length of the cities vector, which is 6, in steps of 1. Remember that 1 colon 6 is a compact way of coding a vector containing the elements 1, 2, 3, 4, 5 and 6.
By using a manual looping index, we lose our city variable, so we have to change the contents of the for loop as well.
15. for loop: v2
We now do the subsetting of the vector explicitly, using square brackets. The result is exactly the same as before. This might seem a bit more work, but we now gain access to the index as well.
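The index-based version can be sketched as follows, with the last three city names assumed for illustration:

```r
# Last three names are assumptions, not taken from the video
cities <- c("New York", "Paris", "London",
            "Tokyo", "Rio de Janeiro", "Cape Town")

# 1:length(cities) is 1:6 here, so i takes the values 1, 2, ..., 6
for (i in 1:length(cities)) {
  # Select the i-th city explicitly with square brackets
  print(cities[i])
}
```

As a side note that goes beyond the video: seq_along(cities) produces the same index sequence and also behaves sensibly for an empty vector, where 1:length(cities) would count down from 1 to 0.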
16. for loop: v2
Adding some more information is easier now.
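One way to add the position to the printout is sketched below. The wording of the message and the last three city names are assumptions for illustration.

```r
# Last three names are assumptions, not taken from the video
cities <- c("New York", "Paris", "London",
            "Tokyo", "Rio de Janeiro", "Cape Town")

for (i in 1:length(cities)) {
  # paste() joins its arguments with spaces; the message text is illustrative
  print(paste(cities[i], "is on position", i, "in the cities vector."))
}
# First line printed:
# [1] "New York is on position 1 in the cities vector."
```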
17. for loop: wrap-up
I can imagine that you're wondering, "Which one of the two is best?" It depends. The first one, the city in cities version, is typically more concise and easier to read, but does not give access to all looping information. The version with the explicit looping index takes more thought to write, but gives you all the information you need.
18. Let's practice!