Automate the loading and combining of data from multiple Excel worksheets
You are now ready to automate the import process of listing information from all three exchanges in the Excel file listings.xlsx by implementing a for loop. Let's look at what you'll do:
- Retrieve the sheet names of a
pd.ExcelFile()object using itssheet_namesattribute. - Create an empty list.
- Write a for loop that iterates through these sheet names to read the data from the corresponding sheet name in the Excel file into a variable. Add a reference column, if desired. Append the contents of this variable to the list with each iteration.
- Concatenate the DataFrames in the list.
As always, refer to the previous exercises in this chapter or the pandas documentation if you need any help. pandas has been imported as pd.
Este exercício faz parte do curso
Importing and Managing Financial Data in Python
Instruções do exercício
- Create the
pd.ExcelFile()object using the filelistings.xlsxand assign to the variablexls. - Retrieve the sheet names from the
.sheet_namesattribute ofxlsand assign toexchanges. - Create an empty list and assign to the variable
listings. - Iterate over
exchangesusing a for loop withexchangeas iterator variable. In each iteration:- Use
pd.read_excel()withxlsas the the data source,exchangeas thesheet_nameargument, and'n/a'asna_valuesto address missing values. Assign the result tolisting. - Create a new column in
listingcalled'Exchange'with the valueexchange(the iterator variable). - Append the resulting
listingDataFrame tolistings.
- Use
- Use
pd.concat()to concatenate the contents oflistingsand assign tolisting_data. - Inspect the contents of
listing_datausing.info().
Exercício interativo prático
Experimente este exercício completando este código de exemplo.
# Create the pd.ExcelFile() object
xls = ____
# Extract the sheet names from xls
exchanges = ____.____
# Create an empty list: listings
# Import the data
for exchange in exchanges:
listing = pd.____(____, sheet_name=____, na_values='n/a')
listing['Exchange'] = ____
listings.____(____)
# Concatenate the listings: listing_data
listing_data = pd.____(____)
# Inspect the results
listing_data.info()