Exercise

Automate the loading and combining of data from multiple Excel worksheets

You are now ready to automate the import process of listing information from all three exchanges in the Excel file listings.xlsx by implementing a for loop. Let's look at what you'll do:

  1. Retrieve the sheet names of a pd.ExcelFile() object using its sheet_names attribute.
  2. Create an empty list.
  3. Write a for loop that iterates through these sheet names to read the data from the corresponding sheet name in the Excel file into a variable. Add a reference column, if desired. Append the contents of this variable to the list with each iteration.
  4. Concatenate the DataFrames in the list.

As always, refer to the previous exercises in this chapter or the pandas documentation if you need any help. pandas has been imported as pd.

Instructions

100 XP
  • Create the pd.ExcelFile() object using the file listings.xlsx and assign to the variable xls.
  • Retrieve the sheet names from the .sheet_names attribute of xls and assign to exchanges.
  • Create an empty list and assign to the variable listings.
  • Iterate over exchanges using a for loop with exchange as iterator variable. In each iteration:
    • Use pd.read_excel() with xls as the the data source, exchange as the sheet_name argument, and 'n/a' as na_values to address missing values. Assign the result to listing.
    • Create a new column in listing called 'Exchange' with the value exchange (the iterator variable).
    • Append the resulting listing DataFrame to listings.
  • Use pd.concat() to concatenate the contents of listings and assign to listing_data.
  • Inspect the contents of listing_data using .info().