Transforming sales data with pandas
Before insights can be extracted from a dataset, column types may need to be altered to properly leverage the data. This is especially common with temporal data types, which can be stored in several different ways.
For this example, pandas has been imported as pd and is ready for you to use.
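As a minimal sketch of the kind of type conversion described above, the snippet below parses timestamp strings into a datetime column with pd.to_datetime. The sample rows are hypothetical and only assumed to match the "%m/%d/%y %H:%M" layout used later in the exercise; the real sales_data.csv is not shown here.

```python
import pandas as pd

# Hypothetical sample rows (assumption): timestamps stored as plain strings
df = pd.DataFrame({"Order Date": ["04/19/19 08:46", "04/07/19 22:30"]})

# Parse the strings into datetime values using an explicit format
df["Order Date"] = pd.to_datetime(df["Order Date"], format="%m/%d/%y %H:%M")

# Once the column is a datetime type, the .dt accessor becomes available
order_hours = df["Order Date"].dt.hour
print(df.dtypes)
print(order_hours.tolist())
```

Passing an explicit format avoids ambiguity between month-first and day-first dates, and raises an error if a value does not match the expected layout.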
This exercise is part of the course ETL and ELT in Python.
Exercise instructions
- Update the transform() function to convert data in the "Order Date" column to type datetime.
- Filter the DataFrame to only contain rows with "Price Each" less than ten dollars.
- Print the data types of each column in the DataFrame.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
raw_sales_data = extract("sales_data.csv")

def transform(raw_data):
    # Convert the "Order Date" column to type datetime
    raw_data["Order Date"] = pd.____(____, format="%m/%d/%y %H:%M")

    # Only keep items under ten dollars
    clean_data = raw_data.loc[____, :]
    return clean_data

clean_sales_data = transform(raw_sales_data)

# Check the data types of each column
print(____)
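One possible way to complete the steps above is sketched here. It is not necessarily the course's reference solution. Because the extract() helper and sales_data.csv are not shown, a small hypothetical DataFrame stands in for the extracted data, and the .copy() call is an added choice so the caller's DataFrame is not modified.

```python
import pandas as pd

def transform(raw_data):
    # Work on a copy so the input DataFrame is left untouched (added choice)
    raw_data = raw_data.copy()

    # Convert the "Order Date" column to type datetime
    raw_data["Order Date"] = pd.to_datetime(
        raw_data["Order Date"], format="%m/%d/%y %H:%M"
    )

    # Only keep items under ten dollars, using a boolean mask with .loc
    clean_data = raw_data.loc[raw_data["Price Each"] < 10, :]
    return clean_data

# Hypothetical stand-in for extract("sales_data.csv") (assumption)
raw_sales_data = pd.DataFrame({
    "Order Date": ["04/19/19 08:46", "04/07/19 22:30", "04/12/19 14:38"],
    "Price Each": [11.95, 2.99, 3.84],
})

clean_sales_data = transform(raw_sales_data)

# Check the data types of each column
print(clean_sales_data.dtypes)
```

The expression raw_data["Price Each"] < 10 produces a boolean Series, and .loc keeps only the rows where it is True. The dtypes attribute then reports one type per column, which is a quick check that the datetime conversion took effect.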