Loading sales data to a CSV file
Loading data is an essential component of any data pipeline. It ensures that data consumers and downstream processes have reliable access to the data you extracted and transformed earlier in the pipeline. In this exercise, you'll practice loading transformed sales data to a CSV file using pandas, which has been imported as pd. The raw data has already been extracted and is available in the DataFrame raw_sales_data.
This exercise is part of the course ETL and ELT in Python.
Exercise instructions
- Filter the raw_sales_data DataFrame to keep only items with a price less than 25 dollars.
- Update the load() function to write the transformed sales data to a file named "transformed_sales_data.csv", making sure not to include the index column.
- Call the load() function on the cleaned DataFrame.
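To see why the second instruction asks for the index to be left out, it can help to compare the two outputs side by side. The sketch below uses a small made-up DataFrame (the product names and prices are illustrative, not the course data) and calls to_csv() without a path, which returns the CSV text as a string instead of writing a file.

```python
import pandas as pd

# Hypothetical two-row sample; not the actual course dataset
df = pd.DataFrame({
    "Product": ["USB-C Cable", "AAA Batteries"],
    "Price Each": [11.95, 2.99],
})

# With no path argument, to_csv() returns the CSV content as a string
with_index = df.to_csv()
without_index = df.to_csv(index=False)

print(with_index)     # each row starts with the row label (0, 1, ...)
print(without_index)  # only the named columns are written
```

With the default settings, the header begins with an empty field for the unnamed index, which tends to reappear as an "Unnamed: 0" column when the file is read back. Passing index=False avoids that extra column.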
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
def transform(raw_data):
    # Find the items priced at less than 25 dollars
    return raw_data.loc[raw_data["Price Each"] ____ ____, ["Order ID", "Product", "Price Each", "Order Date"]]

def load(clean_data):
    # Write the data to a CSV file without the index column
    ____.____("transformed_sales_data.csv", index=____)

clean_sales_data = transform(raw_sales_data)

# Call the load function on the cleaned DataFrame
____(____)
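One possible way to complete the scaffold is sketched here, for checking your own attempt. It is not presented as the course's official solution. The raw_sales_data frame below is a made-up stand-in (the exercise environment supplies the real one), and the optional path parameter on load() is a hypothetical addition that makes the function easier to try outside the exercise.

```python
import pandas as pd

# Hypothetical stand-in for raw_sales_data; the exercise provides the real DataFrame
raw_sales_data = pd.DataFrame({
    "Order ID": [1, 2, 3],
    "Product": ["USB-C Cable", "Monitor", "AAA Batteries"],
    "Quantity Ordered": [1, 1, 2],
    "Price Each": [11.95, 149.99, 2.99],
    "Order Date": ["2019-01-01", "2019-01-02", "2019-01-03"],
})


def transform(raw_data):
    # Keep rows priced under 25 dollars and only the four columns of interest
    return raw_data.loc[
        raw_data["Price Each"] < 25,
        ["Order ID", "Product", "Price Each", "Order Date"],
    ]


def load(clean_data, path="transformed_sales_data.csv"):
    # path is an assumed extra parameter; index=False omits the row index
    clean_data.to_csv(path, index=False)


clean_sales_data = transform(raw_sales_data)

# Call the load function on the cleaned DataFrame
load(clean_sales_data)

# Optional sanity check: read the file back and inspect it
print(pd.read_csv("transformed_sales_data.csv"))
```

Reading the file back after loading is a cheap way to confirm that the expected columns were written and that no stray index column crept in.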