Validating data transformations
Great work so far! Manually spot-checking transformations is a great first step to ensuring that you're maintaining data quality throughout a pipeline. pandas offers several built-in functions to help you with just that!
To help get you started with this exercise, pandas has been imported as pd.
This exercise is part of the course
ETL and ELT in Python
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
def extract(file_path):
# Ingest the data to a DataFrame
raw_data = pd.____(____)
# Return the DataFrame
return raw_data
raw_sales_data = extract("sales_data.parquet")