1. Working with multiple sheets
Until now, we have only been working with one tab of a spreadsheet.
2. Previously
Recall importing the pandas package, which gives us the functionality to read in and interact with our Excel data in Python via the pd-dot-read-excel function and the resulting pandas DataFrame object.
3. Previously
However, pd-dot-read-excel actually only reads in the first tab of your workbook. So,
4. Two tabs
what if the the fruits' price and color information lived in separate tabs of our workbook? pd-dot-read-excel would not work. It would only read in the first tab here, the price tab.
5. pd.ExcelFile()
Enter pd-dot-excel-file. Capital E, capital F. Now instead of using pd-dot-read-excel to import the first tab of our spreadsheet, pd-dot-excel-file will read in the entire workbook. Here, we assign the file fruit-underscore-tabs-dot-xlsx to the variable fruit-underscore-workbook.
6. pd.ExcelFile()
But when we print fruit-underscore-workbook to the console, you can see we don't have any DataFrames quite yet. Instead we have an ExcelFile object, which in essence is a representation of the entire workbook. The ExcelFile object contains information about each sheet, as well as methods for converting each sheet into a DataFrame. Which brings us to a new, important component of Python objects: Attributes.
7. Attributes
Attributes are like the features of an object. Similar to methods, attributes live within objects, and can be accessed with the dot. Contrary to methods, though, attributes require no parentheses. Take this example:
8. .sheet_names attribute
we read in our entire workbook with the pd-dot-excel-file function, stored as fruit-underscore-workbook. Then
9. .sheet_names attribute
we access our tab, or sheet, names with the sheet-underscore-names attribute of fruit-underscore-workbook, using the dot, and store this as fruit-underscore-sheet-underscore-names. When we print this variable
10. .sheet_names attribute
to the console, the result is the names of each sheet, or tab, as we would expect to see, price and color.
11. Attributes vs. methods
Here's a side-by-side of attributes and methods. We access both from within objects using the dot. With attributes, like the dot-sheet-underscore-names attribute, there are no parentheses. Lastly, attributes generally tell us something about an object. With methods, like dot-parse, which we'll learn about in a minute, we use parentheses, and often times pass an argument to the method, such as the sheet we wish to be converted to a DataFrame. Generally, methods perform some action for us. Note that both
12. Attributes vs. methods
attributes and methods are always attached to an object, and specific to an object type. For instance, ExcelFile objects do not have the same methods and attributes as DataFrame objects.
13. Attributes vs. methods vs. functions
And while functions may appear similar to methods, they generally stand alone or are accessed from a package like pandas. This subtle difference will become more apparent in time as you work with Python more and more.
14. .parse() method
With that, let's revisit our code, and use the dot-parse method on fruit-underscore-workbook to turn the price tab of our workbook into a DataFrame. We do this by placing our tab name, price, in quotes, within the parentheses of the dot-parse method
15. .parse() method
and assigning the result to fruit-underscore-prices.
16. Recap
Let's recap. pd-dot-excel-file is a function that reads our entire Excel workbook, dot-sheet-underscore-names is an attribute that tells us the name of each sheet in the workbook, and dot-parse is a method that turns individual workbook sheets into DataFrames.
17. Your Turn!
Now it's your turn to use these functions, methods, and attributes to work with multiple sheets in Python.