Filtering rows and creating columns

1. Filtering rows and creating columns

Often, you may want to filter your data for a specific observation or set of observations.

2. Filtering in spreadsheets

In spreadsheets, you are most likely familiar with the filtering functionality, where a drop-down menu allows you to reduce your data by ticking a box. How do we recreate this functionality in Python?

3. Accessing a single column

Recall our fruit DataFrame, with name, color, and price columns.

4. Accessing a single column

If we wanted to access just the name column, we would put brackets next to fruit, and place 'name' in quotation marks within those brackets. The result is a pandas Series object, which you can think of as just the contents of your column.
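As a minimal sketch of column access: the transcript does not show the contents of the fruit DataFrame, so the rows below are assumed values chosen only for illustration.

```python
import pandas as pd

# Hypothetical fruit data; the actual values in the course are not shown here.
fruit = pd.DataFrame({
    "name": ["Apple", "Cantaloupe", "Banana", "Dragonfruit"],
    "color": ["red", "orange", "yellow", "pink"],
    "price": [0.50, 3.00, 0.25, 5.00],
})

# Square brackets with the quoted column name return a pandas Series.
names = fruit["name"]
print(type(names))
print(names)
```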

5. Comparison operators

We can then use comparison operators, like "equal to" or "not equal to", to get logical True/False values for each entry in that column.

6. Comparisons

For example, on the left is our name column, and on the right are logical, or Boolean, True/False values that correspond to where the name column is equal to "Apple". Here, only the first entry is True, since name is equal to Apple only in the first row. Always remember that string comparisons in Python are case-sensitive, so the capital A in Apple matters.
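The comparison might look roughly like this. The fruit rows are assumed values for illustration; only the column names come from the course.

```python
import pandas as pd

# Hypothetical fruit data, with Apple in the first row.
fruit = pd.DataFrame({
    "name": ["Apple", "Cantaloupe", "Banana", "Dragonfruit"],
    "color": ["red", "orange", "yellow", "pink"],
    "price": [0.50, 3.00, 0.25, 5.00],
})

# Comparing a column to a value gives one True/False entry per row.
is_apple = fruit["name"] == "Apple"
print(is_apple)

# A lowercase "apple" matches nothing, because the comparison is case-sensitive.
is_lowercase_apple = fruit["name"] == "apple"
print(is_lowercase_apple.any())
```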

7. Filtering

To filter, we first reference our DataFrame, fruit, then, inside a set of brackets, we place our comparison. The result is a DataFrame that only contains rows where the comparison is True, in this case, where name is equal to Apple.
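A short sketch of this filter, again using assumed fruit rows since the real data is not shown in the transcript:

```python
import pandas as pd

# Hypothetical fruit data for illustration.
fruit = pd.DataFrame({
    "name": ["Apple", "Cantaloupe", "Banana", "Dragonfruit"],
    "color": ["red", "orange", "yellow", "pink"],
    "price": [0.50, 3.00, 0.25, 5.00],
})

# Placing the comparison inside the brackets keeps only the rows where it is True.
apples = fruit[fruit["name"] == "Apple"]
print(apples)
```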

8. Filtering

Or here, where we change our comparison to be where the price column is greater than one dollar. The result is a DataFrame where all entries have a price greater than one dollar. Notice how when we filter, the index does not remain sequential or start at 0.
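The price filter can be sketched the same way. The rows are assumed values, ordered so that the surviving index labels are neither sequential nor starting at 0.

```python
import pandas as pd

# Hypothetical fruit data for illustration.
fruit = pd.DataFrame({
    "name": ["Apple", "Cantaloupe", "Banana", "Dragonfruit"],
    "color": ["red", "orange", "yellow", "pink"],
    "price": [0.50, 3.00, 0.25, 5.00],
})

# Keep only the rows whose price is greater than one dollar.
pricey = fruit[fruit["price"] > 1]
print(pricey)

# The rows keep their original index labels from the unfiltered DataFrame.
print(pricey.index.tolist())
```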

9. Filtering

In the exercises, you might see code like this, where the reset_index method is tacked on to the end of the filter. Note how the index is now sequential and starts at 0.
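One possible form of that code is sketched below. The transcript does not show the exact call, so passing drop=True is an assumption here; it discards the old index labels instead of keeping them as a new column. The fruit rows are also assumed values.

```python
import pandas as pd

# Hypothetical fruit data for illustration.
fruit = pd.DataFrame({
    "name": ["Apple", "Cantaloupe", "Banana", "Dragonfruit"],
    "color": ["red", "orange", "yellow", "pink"],
    "price": [0.50, 3.00, 0.25, 5.00],
})

# Filter, then renumber the remaining rows from 0.
# drop=True (an assumption) throws away the old index labels.
pricey = fruit[fruit["price"] > 1].reset_index(drop=True)
print(pricey)
```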

10. Basic filtering pattern

Think of this basic pattern as "show me my DataFrame where this column is equal to that value".

11. Basic filtering pattern

Here is a look at what filtering in Python looks like, and what its equivalent is in a spreadsheet. Both achieve the same result: the Apple row of our fruit data. Note that it is possible to filter on more than one condition at once, but that is beyond the scope of this course.

12. Creating a new column

Shifting gears, what if we wanted to create a new column? What if we bought two of each fruit? How could we make a cost column?

13. Creating a new column

In a spreadsheet, the process would look something like this. Take each price cell and multiply by 2. Then drag the formula all the way down to the bottom of the data.

14. Mathematical operators

Fortunately, we have the same mathematical operators at our disposal in Python. So if we buy 2 of each fruit, we still multiply by 2, using the asterisk.
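As a small illustration, multiplying a column by 2 applies the operation to every row at once, much like dragging a spreadsheet formula down. The prices are assumed values.

```python
import pandas as pd

# Hypothetical fruit data for illustration.
fruit = pd.DataFrame({
    "name": ["Apple", "Cantaloupe", "Banana", "Dragonfruit"],
    "price": [0.50, 3.00, 0.25, 5.00],
})

# The asterisk multiplies every entry in the price column by 2.
doubled = fruit["price"] * 2
print(doubled)
```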

15. Creating a new column

To add the cost column to our DataFrame, we simply define a cost column in fruit and designate its value as 2 times the price column.
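A sketch of that assignment, with assumed fruit rows:

```python
import pandas as pd

# Hypothetical fruit data for illustration.
fruit = pd.DataFrame({
    "name": ["Apple", "Cantaloupe", "Banana", "Dragonfruit"],
    "color": ["red", "orange", "yellow", "pink"],
    "price": [0.50, 3.00, 0.25, 5.00],
})

# Assigning to a column name that does not exist yet creates the column.
fruit["cost"] = 2 * fruit["price"]
print(fruit)
```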

16. Creating a new column

What if our DataFrame had a quantity column that contained the quantity of each fruit purchased?

17. Creating a new column

In a spreadsheet, we would multiply our price column by the quantity column.

18. Creating a new column

With our DataFrame, it's actually not too different. Here is the result. In our code, on the left of the equals sign, we've defined our new column, cost, and on the right of the equals sign, we've multiplied the price column by the quantity column.
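The code described here might look like the following sketch. The price and quantity values are assumptions made up for illustration.

```python
import pandas as pd

# Hypothetical fruit data with an assumed quantity column.
fruit = pd.DataFrame({
    "name": ["Apple", "Cantaloupe", "Banana", "Dragonfruit"],
    "price": [0.50, 3.00, 0.25, 5.00],
    "quantity": [3, 1, 6, 2],
})

# Left of the equals sign: the new column. Right: price times quantity, row by row.
fruit["cost"] = fruit["price"] * fruit["quantity"]
print(fruit)
```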

19. Your turn!

Now it's your turn to manipulate the movie theater sales data. Good luck!