Data pipeline architecture patterns

When building data pipelines, it's best to keep the file where functions are defined separate from the file where they are run.

In this exercise, you'll practice importing the components of a pipeline into memory and then using those functions to run the pipeline end-to-end. The project has the following layout, where pipeline_utils stores the extract(), transform(), and load() functions that will be used to run the pipeline.

> ls
 etl_pipeline.py
 pipeline_utils.py

This exercise is part of the course ETL and ELT in Python.

Exercise instructions

Import the extract, transform, and load functions from the pipeline_utils module.
Use the imported functions to run the data pipeline end-to-end.

Hands-on interactive exercise

Complete this exercise by finishing the sample code below.

# Import the extract, transform, and load functions from pipeline_utils
____

# Run the pipeline end to end by extracting, transforming and loading the data
raw_tax_data = ____("raw_tax_data.csv")
clean_tax_data = ____(raw_tax_data)
____(clean_tax_data, "clean_tax_data.parquet")
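
One possible way to fill in the blanks is sketched below. The exercise does not show the contents of pipeline_utils, so this sketch writes a hypothetical stand-in module to a temporary directory to make the example runnable on its own. The function bodies, the sample rows, and the column names taxpayer_id and tax_owed are all assumptions made for illustration. The stand-in load() writes CSV rather than Parquet, because writing Parquet would typically need pandas with a Parquet engine, which is not assumed here. Only the last few lines correspond to what etl_pipeline.py would contain.

```python
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for pipeline_utils.py. The real module is not shown
# in the exercise and probably relies on pandas instead of the csv module.
PIPELINE_UTILS_SOURCE = '''
import csv


def extract(file_path):
    """Read a CSV file into a list of row dictionaries."""
    with open(file_path, newline="") as f:
        return list(csv.DictReader(f))


def transform(rows):
    """Keep only the rows that have a value in every column."""
    return [
        row for row in rows
        if all(value not in ("", None) for value in row.values())
    ]


def load(rows, file_path):
    """Write the rows to a CSV file (stand-in for the Parquet step)."""
    fieldnames = list(rows[0].keys()) if rows else []
    with open(file_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
'''

# Recreate the two-file project layout in a temporary directory
work_dir = Path(tempfile.mkdtemp())
(work_dir / "pipeline_utils.py").write_text(PIPELINE_UTILS_SOURCE)

# Made-up sample rows for illustration only, not the exercise's real data
raw_path = work_dir / "raw_tax_data.csv"
raw_path.write_text("taxpayer_id,tax_owed\n1,250.00\n2,\n")

# Make the stand-in module importable, as if it sat next to etl_pipeline.py
sys.path.insert(0, str(work_dir))

# --- etl_pipeline.py: the part the exercise asks for ---
# Import the extract, transform, and load functions from pipeline_utils
from pipeline_utils import extract, transform, load

# Run the pipeline end to end by extracting, transforming and loading the data
clean_path = work_dir / "clean_tax_data.csv"
raw_tax_data = extract(raw_path)
clean_tax_data = transform(raw_tax_data)
load(clean_tax_data, clean_path)

print(clean_path.read_text())
```

Because etl_pipeline.py only imports and calls the three functions, their implementations can change inside pipeline_utils (for example, switching the output format) without touching the script that runs the pipeline.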
