Automate the loading and combining of data from multiple Excel worksheets
You are now ready to automate the import process of listing information from all three exchanges in the Excel file listings.xlsx
by implementing a for loop. Let's look at what you'll do:
- Retrieve the sheet names of a
pd.ExcelFile()
object using itssheet_names
attribute. - Create an empty list.
- Write a for loop that iterates through these sheet names to read the data from the corresponding sheet name in the Excel file into a variable. Add a reference column, if desired. Append the contents of this variable to the list with each iteration.
- Concatenate the DataFrames in the list.
As always, refer to the previous exercises in this chapter or the pandas
documentation if you need any help. pandas
has been imported as pd
.
This exercise is part of the course
Importing and Managing Financial Data in Python
Exercise instructions
- Create the
pd.ExcelFile()
object using the filelistings.xlsx
and assign to the variablexls
. - Retrieve the sheet names from the
.sheet_names
attribute ofxls
and assign toexchanges
. - Create an empty list and assign to the variable
listings
. - Iterate over
exchanges
using a for loop withexchange
as iterator variable. In each iteration:- Use
pd.read_excel()
withxls
as the the data source,exchange
as thesheet_name
argument, and'n/a'
asna_values
to address missing values. Assign the result tolisting
. - Create a new column in
listing
called'Exchange'
with the valueexchange
(the iterator variable). - Append the resulting
listing
DataFrame tolistings
.
- Use
- Use
pd.concat()
to concatenate the contents oflistings
and assign tolisting_data
. - Inspect the contents of
listing_data
using.info()
.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create the pd.ExcelFile() object
xls = ____
# Extract the sheet names from xls
exchanges = ____.____
# Create an empty list: listings
# Import the data
for exchange in exchanges:
listing = pd.____(____, sheet_name=____, na_values='n/a')
listing['Exchange'] = ____
listings.____(____)
# Concatenate the listings: listing_data
listing_data = pd.____(____)
# Inspect the results
listing_data.info()