Persisting data to files

Loading data to a final destination is one of the most important steps of a data pipeline. In this exercise, you'll use the transform() function shown below to transform product sales data before loading it to a .csv file. This will give downstream data consumers a better view into total sales across a range of products.

For this exercise, the sales data has already been extracted and transformed, and is stored in the clean_sales_data DataFrame. The pandas package has been imported as pd, and the os library is also ready to use!

This exercise is part of the course

ETL and ELT in Python

Exercise instructions

Update the load() function to write data to the provided path, without headers or an index column.
Check to make sure the file exists at the desired file path after it is written.
Call the function to load the transformed data to persistent storage.
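
To see what "without headers or an index column" means in practice, the sketch below compares the default output of to_csv() with the output when header=False and index=False are passed. The sample DataFrame and its column names are hypothetical stand-ins, since the contents of clean_sales_data are not shown in this exercise.

```python
import pandas as pd

# Hypothetical sample data; the real clean_sales_data is not shown in the exercise
sample = pd.DataFrame({"product": ["apple", "pear"], "total_sales": [10.5, 7.0]})

# With no path argument, to_csv() returns the CSV text as a string
with_defaults = sample.to_csv()

# header=False drops the column-name row, index=False drops the row-label column
without_header_or_index = sample.to_csv(header=False, index=False)

print(with_defaults)
print(without_header_or_index)
```

With the defaults, the first row holds the column names and every row starts with its index label. With both options set to False, only the data values are written.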

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

def load(clean_data, file_path):
    # Write the data to a file
    clean_data.to_csv(file_path, ____, ____)

    # Check to make sure the file exists
    file_exists = os.____.____(____)
    if not file_exists:
        raise Exception(f"File does NOT exist at path {file_path}")

# Load the transformed data to the provided file path
____(clean_sales_data, "transformed_sales_data.csv")
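
For reference, here is one possible way to complete the template above. It is a sketch, not necessarily the course's reference solution. It assumes the blanks are meant to be filled with the header and index arguments of to_csv() and with os.path.exists(). The sample DataFrame and the temporary output directory are hypothetical additions so the example can run on its own.

```python
import os
import tempfile

import pandas as pd


def load(clean_data, file_path):
    # Write the data to a file, omitting the header row and the index column
    clean_data.to_csv(file_path, header=False, index=False)

    # Check to make sure the file exists
    file_exists = os.path.exists(file_path)
    if not file_exists:
        raise Exception(f"File does NOT exist at path {file_path}")


# Hypothetical stand-in for the clean_sales_data DataFrame from the exercise
clean_sales_data = pd.DataFrame(
    {"product": ["apple", "pear"], "total_sales": [10.5, 7.0]}
)

# Write to a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    output_path = os.path.join(tmp_dir, "transformed_sales_data.csv")
    load(clean_sales_data, output_path)

    # Read the file back to confirm what was written
    with open(output_path) as f:
        written_lines = f.read().splitlines()

print(written_lines)
```

In the exercise itself, the call is simply made with the "transformed_sales_data.csv" path from the template. The temporary directory is used here only to keep the example self-contained.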
