
Load a portion of a spreadsheet

Spreadsheets meant to be read by people often contain multiple tables; for example, a small business might keep an inventory workbook with tables for different product types on a single sheet. Even tabular data may have header rows of metadata, like the New Developer Survey data here. While the metadata is useful, we don't want it in a dataframe. You'll use read_excel()'s skiprows keyword to get just the data. You'll also create a string to pass to usecols to get only column AD and columns AW through BA, which cover future job goals.

pandas has been imported as pd.
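
To make the column string more concrete, here is a minimal sketch of how a string of Excel letters such as "AD, AW:BA" maps to column positions. The helpers excel_col_to_index and expand_col_string are hypothetical names written for illustration; they are not part of pandas, which does its own parsing of the usecols string. The sketch assumes the documented read_excel() behavior that letter ranges include both endpoints.

```python
def excel_col_to_index(letters):
    """Convert an Excel column label such as 'AD' to a zero-based index.

    Hypothetical helper for illustration only; not a pandas function.
    """
    index = 0
    for ch in letters.strip().upper():
        # Column labels work like base-26 numbers with digits A=1 ... Z=26
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def expand_col_string(col_string):
    """Expand a string like 'AD, AW:BA' into a list of zero-based indices."""
    indices = []
    for part in col_string.split(","):
        if ":" in part:
            start, end = part.split(":")
            first = excel_col_to_index(start)
            last = excel_col_to_index(end)
            # Assumption: ranges include both the first and last column
            indices.extend(range(first, last + 1))
        else:
            indices.append(excel_col_to_index(part))
    return indices


selected = expand_col_string("AD, AW:BA")
print(selected)  # [29, 48, 49, 50, 51, 52]
```

Under these assumptions the string selects six columns: AD on its own, then the five columns from AW through BA.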

This exercise is part of the course

Streamlined Data Ingestion with pandas


Exercise instructions

  • Create a single string, col_string, specifying that pandas should load column AD and the range AW through BA.
  • Load fcc_survey_headers.xlsx, setting skiprows and usecols to skip the first two rows of metadata and get only the columns in col_string.
  • View the selected column names in the resulting dataframe.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Create string of lettered columns to load
col_string = ____

# Load data with skiprows and usecols set
survey_responses = ____("fcc_survey_headers.xlsx", 
                        ____, 
                        ____)

# View the names of the columns selected
print(survey_responses.columns)