
Comprehensions

1. Comprehensions

2. Comprehensions are loops

One of the most common tasks you will perform when cleaning or summarizing data is to iterate through a list, apply some function or perform some calculation on each element, and then return the results as a new list.

3. List comprehension

For example, suppose you have a list of numbers and want to create a new list containing the square of each element. With a for loop, you would first create an empty list, then iterate through each element of the original list and append each calculated value to the new list. We will now rewrite this for loop in a more concise way using a list comprehension in Python. You open a square bracket and write the body of your for loop first, that is, x (the temporary variable) raised to the power of 2. Then you write the for clause that iterates over data, that is, for x in data, and close the square bracket. Notice that, unlike in a regular for loop, there is no colon at the end of the for clause. This produces exactly the same result as before.
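The two approaches described above can be sketched side by side. This is a minimal illustration; the contents of data and the names squares_loop and squares_comp are assumptions chosen for the example, not values taken from the slides.

```python
# Sample input; the actual values on the slide are not known (assumption).
data = [1, 2, 3, 4, 5]

# For-loop version: start with an empty list and append each square.
squares_loop = []
for x in data:
    squares_loop.append(x ** 2)

# List comprehension: the loop body comes first, then the for clause.
# Note there is no colon after "for x in data".
squares_comp = [x ** 2 for x in data]

print(squares_loop)  # [1, 4, 9, 16, 25]
print(squares_comp)  # [1, 4, 9, 16, 25]
```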

4. Dictionary comprehension

You can also create dictionary comprehensions. These are very similar to list comprehensions; the difference is that the result is a dictionary instead of a list. Because you are creating a dictionary, you need to supply a key-value pair for each element. You begin a dictionary comprehension with a curly bracket and then specify the key-value pair in the form key:value. In x:x**2, x is the key and x**2 is the value. You then write the for clause that iterates over data, that is, for x in data, and close the curly bracket.
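As a short sketch of the dictionary comprehension just described, again assuming a small sample list for data (the name squares_by_value is also made up for this example):

```python
# Sample input; the actual values on the slide are not known (assumption).
data = [1, 2, 3, 4, 5]

# Each element x becomes a key, and its square becomes the matching value.
squares_by_value = {x: x ** 2 for x in data}

print(squares_by_value)  # {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
```

Looking up a key then returns its square, for example squares_by_value[3] gives 9.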

5. Alternatives to `for` loop

In addition to writing for loops that iterate over data, both R and Python provide functions that do this for you in a more concise way. In R, you use the *apply variants, such as apply(), lapply(), and sapply(), to loop efficiently over matrices, lists, and data frames. In Python, you can use the map() function and the apply method to do this. We will cover the map() function now and talk about the apply method in the next chapter.

6. Map

The sq() function here takes a parameter x and returns the square of x. To square each element in data, you can write a for loop as shown here, or you can use the map() function instead. The map() function takes the function as its first argument, passed by name without parentheses, and a list of values (or any other iterable) as its second argument. The specified function is then applied to the elements of the second argument one at a time. This behavior is similar to the sapply() and lapply() functions in R. The output of map() is a map object, not a list. If you want to see the results from applying map(), you need to convert the map object into a list using the list() function.
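The comparison between the for loop and map() might look like the sketch below. The body of sq() follows the description above; the contents of data and the variable names are assumptions made for this example.

```python
def sq(x):
    """Return the square of x."""
    return x ** 2

# Sample input; the actual values on the slide are not known (assumption).
data = [1, 2, 3, 4, 5]

# For-loop version: call sq() on each element and collect the results.
squares_loop = []
for x in data:
    squares_loop.append(sq(x))

# map() version: pass the function itself (sq, not sq()) and the data.
mapped = map(sq, data)

# map() returns a map object, so convert it to a list to see the values.
squares_map = list(mapped)

print(squares_loop)  # [1, 4, 9, 16, 25]
print(squares_map)   # [1, 4, 9, 16, 25]
```

Printing the map object directly, before calling list() on it, would show only something like a map object reference rather than the squared values.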

7. Let's practice!

Now it's your turn to practice this!