
Filtering

1. Filtering

When analyzing DataFrames, we often want to exclude rows from our analysis.

2. Example

Perhaps after collecting our run data, we want to focus only on runs we did on a Monday. After the weekend, we are more likely to be well-rested, so these should be our best runs. This desire to look at specific parts of the data is where filtering comes in.

3. The filter function

To filter a DataFrame, we can use the "filter" function. The first argument is the filter itself, a function that is applied to each row of the DataFrame; the second argument is the DataFrame. To construct the filter, we write row followed by an arrow composed of the minus and greater-than symbols. We can access a value in the row using row followed by a dot and the column name. In this case, we access the value in the day column. We write a comparison that checks whether the value in each row is equal to the string "Monday" using the equals-equals operator. If this comparison returns true, the row is kept. If it returns false, the row is left out of the new DataFrame.
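The slide code is not part of this transcript, so the following is only a minimal sketch of the call described above, assuming the DataFrames.jl package. The name df_run and all of the values are hypothetical; only the column names day, distance and raining come from the narration.

```julia
using DataFrames

# Hypothetical run data; the values are invented for illustration.
df_run = DataFrame(
    day = ["Monday", "Tuesday", "Monday", "Friday"],
    distance = [5000, 2500, 3000, 8000],
    raining = [false, true, true, false],
)

# First argument: the filter, written as row -> comparison.
# Second argument: the DataFrame to filter.
df_monday = filter(row -> row.day == "Monday", df_run)

# Only the rows where day equals "Monday" remain.
println(df_monday)
```

The original df_run is left unchanged; filter returns a new DataFrame.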

4. The filter function

When we print the result, we see that only the rows where the day equals "Monday" are kept. In this example, we compared strings to decide whether to keep the row, but we can filter a DataFrame in many different ways.

5. Filtering on numerical columns

In this example, the filter checks whether the value in the distance column of the row is less than or equal to three thousand. When we filter using this function, we select only our shorter runs.
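As a rough sketch of this numerical filter, again assuming DataFrames.jl and using a hypothetical df_run with invented values, the comparison might look like this:

```julia
using DataFrames

# Hypothetical run data; distances are invented for illustration.
df_run = DataFrame(
    day = ["Monday", "Tuesday", "Monday", "Friday"],
    distance = [5000, 2500, 3000, 8000],
)

# Keep rows whose distance is less than or equal to three thousand.
df_short = filter(row -> row.distance <= 3000, df_run)

println(df_short)
```

Because the comparison is less than or equal to, a run of exactly 3000 is kept.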

6. Filtering on boolean columns

Perhaps we want to assess whether the rain conditions affected our runs. We could use this filter to select only the rows where the value in the raining column is true. In this case, we do not need to use a comparison; the value in the raining column is already true or false.
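A minimal sketch of a filter on a boolean column, assuming DataFrames.jl and a hypothetical df_run with invented values, could be written as follows. The filter returns the column value directly instead of a comparison.

```julia
using DataFrames

# Hypothetical run data; the raining values are invented for illustration.
df_run = DataFrame(
    day = ["Monday", "Tuesday", "Monday", "Friday"],
    distance = [5000, 2500, 3000, 8000],
    raining = [false, true, true, false],
)

# row.raining is already true or false, so no comparison is needed.
df_rain = filter(row -> row.raining, df_run)

println(df_rain)
```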

7. Filtering on all comparisons

We can use any comparison to construct these filters using any column of the DataFrame. Here is a summary of all the comparisons we met in chapter three and how we might use them in a filter.
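The summary slide itself is not in this transcript. As an assumption, the sketch below uses Julia's standard comparison operators on a hypothetical df_run with invented values, to suggest how each one might appear in a filter.

```julia
using DataFrames

# Hypothetical run data; the values are invented for illustration.
df_run = DataFrame(
    day = ["Monday", "Tuesday", "Monday", "Friday"],
    distance = [5000, 2500, 3000, 8000],
)

# Equal to and not equal to
on_monday  = filter(row -> row.day == "Monday", df_run)
not_monday = filter(row -> row.day != "Monday", df_run)

# Less than, and less than or equal to
under_3k   = filter(row -> row.distance < 3000, df_run)
up_to_3k   = filter(row -> row.distance <= 3000, df_run)

# Greater than, and greater than or equal to
over_5k    = filter(row -> row.distance > 5000, df_run)
at_least_5k = filter(row -> row.distance >= 5000, df_run)
```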

8. Further analysis

Once we have filtered the DataFrame, we can perform further analysis on it as before. This example tells us how far we've run in the rain.
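One way this further analysis might look is sketched below. It assumes DataFrames.jl, a hypothetical df_run with invented values, and that "how far" means the sum of the distance column over the rainy runs.

```julia
using DataFrames

# Hypothetical run data; the values are invented for illustration.
df_run = DataFrame(
    day = ["Monday", "Tuesday", "Monday", "Friday"],
    distance = [5000, 2500, 3000, 8000],
    raining = [false, true, true, false],
)

# Step 1: keep only the runs where it was raining.
df_rain = filter(row -> row.raining, df_run)

# Step 2: add up the distance column of the filtered DataFrame.
total_rain_distance = sum(df_rain.distance)

println(total_rain_distance)  # prints 5500 for this invented data
```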

9. Let's practice!

Let's practice some filtering in the final exercises.
