Validating data transformations
Great work so far! Manually spot-checking transformations is a great first step to ensuring that you're maintaining data quality throughout a pipeline. pandas offers several built-in functions to help you with just that!
To help get you started with this exercise, pandas has been imported as pd.
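As a rough illustration of what spot-checking can look like, here is a minimal sketch. The DataFrame, its column names (order_id, quantity, price), and the filter rule are all made up for this example and are not the exercise's data. It compares row counts before and after a filter, then looks at the smallest remaining values to confirm the filter did what was intended.

```python
import pandas as pd

# Hypothetical stand-in for an extracted sales table (not the exercise data)
raw = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5],
    "quantity": [1, 3, 2, 1, 5],
    "price": [9.99, 4.50, 20.00, 15.25, 3.75],
})

# Example transformation (an assumption): keep orders with quantity above 1
clean = raw.loc[raw["quantity"] > 1, ["order_id", "quantity", "price"]]

# Spot check 1: compare the number of rows before and after the filter
rows_before, rows_after = raw.shape[0], clean.shape[0]
print(f"Rows before: {rows_before}, rows after: {rows_after}")

# Spot check 2: eyeball the first few transformed rows
print(clean.head())

# Spot check 3: the smallest remaining quantities should all exceed 1
smallest = clean.nsmallest(2, "quantity")
min_quantity = clean["quantity"].min()
print(smallest)
```

Checks like these won't catch every problem, but they are quick to run and tend to surface the obvious ones, such as a filter that removed nothing or removed everything.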
This exercise is part of the course ETL and ELT in Python.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
def extract(file_path):
    # Ingest the data to a DataFrame
    raw_data = pd.____(____)

    # Return the DataFrame
    return raw_data

raw_sales_data = extract("sales_data.parquet")