Automate the loading and combining of data from multiple Excel worksheets
You are now ready to automate the import process of listing information from all three exchanges in the Excel file listings.xlsx by implementing a for loop. Let's look at what you'll do:
- Retrieve the sheet names of a
pd.ExcelFile()object using itssheet_namesattribute. - Create an empty list.
- Write a for loop that iterates through these sheet names to read the data from the corresponding sheet name in the Excel file into a variable. Add a reference column, if desired. Append the contents of this variable to the list with each iteration.
- Concatenate the DataFrames in the list.
As always, refer to the previous exercises in this chapter or the pandas documentation if you need any help. pandas has been imported as pd.
Cet exercice fait partie du cours
Importing and Managing Financial Data in Python
Instructions
- Create the
pd.ExcelFile()object using the filelistings.xlsxand assign to the variablexls. - Retrieve the sheet names from the
.sheet_namesattribute ofxlsand assign toexchanges. - Create an empty list and assign to the variable
listings. - Iterate over
exchangesusing a for loop withexchangeas iterator variable. In each iteration:- Use
pd.read_excel()withxlsas the the data source,exchangeas thesheet_nameargument, and'n/a'asna_valuesto address missing values. Assign the result tolisting. - Create a new column in
listingcalled'Exchange'with the valueexchange(the iterator variable). - Append the resulting
listingDataFrame tolistings.
- Use
- Use
pd.concat()to concatenate the contents oflistingsand assign tolisting_data. - Inspect the contents of
listing_datausing.info().
Exercice interactif pratique
Essayez cet exercice en complétant cet exemple de code.
# Create the pd.ExcelFile() object
xls = ____
# Extract the sheet names from xls
exchanges = ____.____
# Create an empty list: listings
# Import the data
for exchange in exchanges:
listing = pd.____(____, sheet_name=____, na_values='n/a')
listing['Exchange'] = ____
listings.____(____)
# Concatenate the listings: listing_data
listing_data = pd.____(____)
# Inspect the results
listing_data.info()