Running an ETL Pipeline
Ready to run your first ETL pipeline? Let's get to it!
Here, the functions extract(), transform(), and load() have been defined for you. To run this ETL pipeline, you're going to execute each of these functions in turn. If you're curious, take a peek at what the extract() function looks like.
import pandas as pd

def extract(file_name):
    # Read the CSV file into a pandas DataFrame
    print(f"Extracting data from {file_name}")
    return pd.read_csv(file_name)
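The transform() and load() functions are not shown in this exercise. As a rough sketch of what they might look like, the example below assumes a hypothetical cleaning rule (dropping rows with missing values) and an in-memory SQLite database as the target. Neither assumption comes from the course, and the sample DataFrame is made up for illustration.

```python
import sqlite3

import pandas as pd

# Assumption: the target is a SQLite database (in-memory here)
connection = sqlite3.connect(":memory:")

def transform(data_frame):
    # Hypothetical cleaning rule: drop rows with missing values
    print("Transforming data")
    return data_frame.dropna().reset_index(drop=True)

def load(data_frame, target_table):
    # Write the DataFrame to a SQL table, replacing it if it already exists
    print(f"Loading data into {target_table}")
    data_frame.to_sql(target_table, connection, if_exists="replace", index=False)

# Made-up sample data with one incomplete row
sample = pd.DataFrame({"id": [1, 2, 3], "price": [9.5, None, 12.0]})
cleaned = transform(data_frame=sample)
load(data_frame=cleaned, target_table="cleaned_data")
```

Keeping each step in its own function means the cleaning rule or the target database can change without touching the other two steps.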
This exercise is part of the course ETL and ELT in Python.
Exercise instructions
- Use the extract() function to extract data from the raw_data.csv file.
- Transform the extracted_data DataFrame using the transform() function.
- Finally, load the transformed_data DataFrame to the cleaned_data SQL table.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Extract data from the raw_data.csv file
extracted_data = ____(file_name="raw_data.csv")
# Transform the extracted_data
transformed_data = transform(data_frame=____)
# Load the transformed_data to the cleaned_data SQL table
____(data_frame=transformed_data, target_table="cleaned_data")
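Once the three steps have run, one way to confirm the load worked is to read the table back. The sketch below is self-contained and only illustrative: the stand-in functions, the in-memory SQLite connection, the duplicate-dropping rule, and the sample rows are all assumptions, not the course's actual setup.

```python
import os
import sqlite3
import tempfile

import pandas as pd

# Stand-ins so the sketch runs on its own; the course's versions may differ
connection = sqlite3.connect(":memory:")  # assumed SQLite target

def extract(file_name):
    return pd.read_csv(file_name)

def transform(data_frame):
    # Hypothetical cleaning rule: remove duplicate rows
    return data_frame.drop_duplicates()

def load(data_frame, target_table):
    data_frame.to_sql(target_table, connection, if_exists="replace", index=False)

# Write a tiny, made-up raw_data.csv to a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "raw_data.csv")
    raw = pd.DataFrame({"id": [1, 1, 2], "city": ["Oslo", "Oslo", "Lima"]})
    raw.to_csv(csv_path, index=False)

    # Run the three steps in order: extract, then transform, then load
    extracted_data = extract(file_name=csv_path)
    transformed_data = transform(data_frame=extracted_data)
    load(data_frame=transformed_data, target_table="cleaned_data")

# Read the table back to check what was loaded
loaded = pd.read_sql("SELECT * FROM cleaned_data", connection)
print(loaded)
```

Each step consumes the output of the previous one, so the order of the calls matters: transform() needs the extracted DataFrame, and load() needs the transformed one.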