Load a portion of a spreadsheet
Spreadsheets meant to be read by people often have multiple tables, e.g., a small business might keep an inventory workbook with tables for different product types on a single sheet. Even tabular data may have header rows of metadata, like the New Developer Survey data here. While the metadata is useful, we don't want it in a dataframe. You'll use read_excel()'s skiprows keyword to get just the data. You'll also create a string to pass to usecols to get only columns AD and AW through BA, about future job goals.
pandas has been imported as pd.
Este exercício faz parte do curso
Streamlined Data Ingestion with pandas
Instruções do exercício
- Create a single string,
col_string, specifying thatpandasshould load columnADand the rangeAWthroughBA. - Load
fcc_survey_headers.xlsx', settingskiprowsandusecolsto skip the first two rows of metadata and get only the columns incol_string. - View the selected column names in the resulting dataframe.
Exercício interativo prático
Experimente este exercício completando este código de exemplo.
# Create string of lettered columns to load
col_string = ____
# Load data with skiprows and usecols set
survey_responses = ____("fcc_survey_headers.xlsx",
____,
____)
# View the names of the columns selected
print(survey_responses.columns)