Get ready to discover how data is collected, processed, and moved using data pipelines. You will explore the qualities of the best data pipelines, and prepare to design and build your own.

Introduction to ETL and ELT Pipelines

Running an ETL Pipeline

ELT in Action

ETL and ELT Pipelines

Building ETL and ELT Pipelines

Building an ETL Pipeline

The "T" in ELT

Extracting, Transforming, and Loading Student Scores Data

Introduction to Data Pipelines
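
The two patterns named in this chapter differ mainly in where the transformation happens. The sketch below is illustrative only: an in-memory SQLite database stands in for a data warehouse, and the `RAW_SCORES` rows, table names, and function names are assumptions made up for this example rather than course material.

```python
import sqlite3

# Hypothetical source rows: (student, raw score out of 50).
RAW_SCORES = [("ana", 40), ("ben", 25), ("cy", 50)]


def run_etl(conn, rows):
    """ETL: transform in Python first, then load only the finished table."""
    transformed = [(name, score * 100 / 50) for name, score in rows]
    conn.execute("CREATE TABLE scores_etl (student TEXT, pct REAL)")
    conn.executemany("INSERT INTO scores_etl VALUES (?, ?)", transformed)


def run_elt(conn, rows):
    """ELT: load the raw rows as-is, then transform inside the database."""
    conn.execute("CREATE TABLE raw_scores (student TEXT, score INTEGER)")
    conn.executemany("INSERT INTO raw_scores VALUES (?, ?)", rows)
    conn.execute(
        "CREATE TABLE scores_elt AS "
        "SELECT student, score * 100.0 / 50 AS pct FROM raw_scores"
    )


conn = sqlite3.connect(":memory:")
run_etl(conn, RAW_SCORES)
run_elt(conn, RAW_SCORES)

# Both routes end with the same percentages; only the place of the "T" differs.
etl_result = conn.execute("SELECT * FROM scores_etl ORDER BY student").fetchall()
elt_result = conn.execute("SELECT * FROM scores_elt ORDER BY student").fetchall()
```

In the ELT variant the raw table stays in the database, so the transformation can be rerun or changed later without extracting the data again.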

Dive into leveraging pandas to extract, transform, and load data as you build your first data pipelines. Learn how to make your ETL logic reusable, and apply logging and exception handling to your pipelines.

Extracting data from structured sources

Extracting data from parquet files

Pulling data from SQL databases

Building functions to extract data

Transforming data with pandas

Filtering pandas DataFrames

Transforming sales data with pandas

Validating data transformations

Persisting data with pandas

Loading sales data to a CSV file

Customizing a CSV file

Persisting data to files

Monitoring a data pipeline

Logging within a data pipeline

Handling exceptions when loading data

Monitoring and alerting within a data pipeline

Building ETL Pipelines
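
The topics in this chapter (reusable extract, transform, and load functions, plus logging and exception handling) can be combined roughly as follows. This is a minimal sketch under stated assumptions: the sales columns, the `RAW_CSV` sample, the logger name, and the function names are hypothetical, and in-memory buffers stand in for real files.

```python
import io
import logging

import pandas as pd

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("sales_pipeline")

# Hypothetical sales data; a real pipeline would read a file or a table.
RAW_CSV = "order_id,quantity,price\n1,2,10.0\n2,0,5.0\n3,1,7.5\n"


def extract(source):
    """Read CSV data into a DataFrame."""
    return pd.read_csv(source)


def transform(df):
    """Keep rows with a positive quantity and add a total column."""
    kept = df.loc[df["quantity"] > 0].copy()
    kept["total"] = kept["quantity"] * kept["price"]
    return kept


def load(df, target):
    """Write the DataFrame to CSV without the index column."""
    df.to_csv(target, index=False)


def run_pipeline(source, target):
    """Run all three steps, logging progress and failures."""
    try:
        raw = extract(source)
        logger.info("Extracted %d rows", len(raw))
        clean = transform(raw)
        logger.info("Kept %d rows after filtering", len(clean))
        load(clean, target)
        logger.info("Load complete")
        return clean
    except (KeyError, pd.errors.ParserError, OSError) as err:
        # Log and re-raise so a scheduler or alerting tool sees the failure.
        logger.error("Pipeline failed: %s", err)
        raise


output = io.StringIO()
result = run_pipeline(io.StringIO(RAW_CSV), output)
```

Keeping each step in its own function makes the steps easy to reuse and to test in isolation. Re-raising after logging is one possible design choice; a pipeline could instead return a status value.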

Supercharge your workflow with advanced data pipelining techniques, such as working with non-tabular data and persisting DataFrames to SQL databases. Discover tooling to tackle advanced transformations with pandas, and uncover best practices for working with complex data.

Extracting non-tabular data

Ingesting JSON data with pandas

Reading JSON data into memory

Transforming non-tabular data

Iterating over dictionaries

Parsing data from dictionaries

Transforming JSON data

Transforming and cleaning DataFrames

Advanced data transformation with pandas

Filling missing values with pandas

Grouping data with pandas

Applying advanced transformations to DataFrames

Loading data to a SQL database with pandas

Loading data to a Postgres database

Validating data loaded to a Postgres Database

Advanced ETL Techniques
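
As a rough illustration of the techniques listed in this chapter, the sketch below parses nested JSON into rows, fills a missing value, groups the data, and persists the result with `DataFrame.to_sql`. The JSON layout and field names are invented for the example, and SQLite is used as a stand-in for the Postgres database the lessons mention, so that the code runs without a database server.

```python
import json
import sqlite3

import pandas as pd

# Hypothetical nested JSON keyed by school id; field names are made up.
RAW_JSON = """
{
  "s1": {"city": "Queens", "scores": {"math": 510, "reading": null}},
  "s2": {"city": "Queens", "scores": {"math": 470, "reading": 450}},
  "s3": {"city": "Bronx",  "scores": {"math": 430, "reading": 410}}
}
"""


def flatten(raw):
    """Turn the nested dictionary into a list of flat rows."""
    rows = []
    for school_id, info in raw.items():
        scores = info.get("scores", {})
        rows.append(
            {
                "school_id": school_id,
                "city": info.get("city"),
                "math": scores.get("math"),
                "reading": scores.get("reading"),
            }
        )
    return rows


df = pd.DataFrame(flatten(json.loads(RAW_JSON)))

# Make sure the column is numeric; missing values become NaN.
df["reading"] = pd.to_numeric(df["reading"])

# Fill the missing reading score with the column mean, then aggregate by city.
df["reading"] = df["reading"].fillna(df["reading"].mean())
by_city = df.groupby("city", as_index=False)[["math", "reading"]].mean()

# Persist the result, then read it back to validate the load.
conn = sqlite3.connect(":memory:")
by_city.to_sql("city_scores", conn, index=False, if_exists="replace")
loaded = pd.read_sql("SELECT * FROM city_scores ORDER BY city", conn)
```

Reading the table back after the write is a simple way to confirm that row counts and values survived the load.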

In this final chapter, you’ll create frameworks to validate and test data pipelines before shipping them into production. After you’ve tested your pipeline, you’ll explore techniques to run your data pipeline end-to-end, all while allowing for visibility into pipeline performance.

Manually testing a data pipeline

Testing data pipelines

Validating a data pipeline at "checkpoints"

Testing a data pipeline end-to-end

Unit-testing a data pipeline

Validating a data pipeline with assert

Writing unit tests with pytest

Creating fixtures with pytest

Unit testing a data pipeline with fixtures

Running a data pipeline in production

Orchestration and ETL tools

Data pipeline architecture patterns

Running a data pipeline end-to-end

Congratulations!

Deploying and Maintaining a Data Pipeline
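
The unit-testing lessons in this chapter name `pytest` and fixtures. A minimal sketch of that approach might look like the following; the `transform` function, the passing threshold of 60, and the sample data are all assumptions made up for illustration.

```python
import pandas as pd
import pytest


def transform(df):
    """Drop rows with a missing score and flag passing students."""
    clean = df.dropna(subset=["score"]).copy()
    clean["passed"] = clean["score"] >= 60
    return clean


def build_raw_scores():
    """Small hypothetical input, kept separate so it is reusable outside pytest."""
    return pd.DataFrame(
        {"student": ["ana", "ben", "cy"], "score": [75.0, None, 40.0]}
    )


@pytest.fixture
def raw_scores():
    # pytest passes this DataFrame to any test that names `raw_scores`
    # as a parameter.
    return build_raw_scores()


def test_transform_drops_missing_scores(raw_scores):
    clean = transform(raw_scores)
    assert len(clean) == 2
    assert clean["score"].notna().all()


def test_transform_flags_passing_students(raw_scores):
    clean = transform(raw_scores)
    assert clean["passed"].tolist() == [True, False]


def test_transform_rejects_missing_column():
    # A frame without the expected column should fail loudly.
    with pytest.raises(KeyError):
        transform(pd.DataFrame({"student": ["ana"]}))
```

Saved under a hypothetical name such as `test_pipeline.py`, these tests would run with `pytest test_pipeline.py`. The fixture gives each test a fresh copy of the input, so one test cannot affect another by mutating shared data.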

scores.csv

schools_modified.csv

amazon_sales_cleaned_sql.csv

tax_rate_cleaned.csv

<p><h2>Empowering Analytics with Data Pipelines</h2>
Data pipelines are at the foundation of every strong data platform. Building these pipelines is an essential skill for data engineers, who provide incredible value to a business ready to step into a data-driven future. This introductory course will help you hone the skills to build effective, performant, and reliable data pipelines.</p>

<p><h2>Building and Maintaining ETL Solutions</h2>
Throughout this course, you’ll dive into the complete process of building a data pipeline. You’ll build skills with Python libraries such as <code>pandas</code> and <code>json</code> to extract data from structured and unstructured sources before it’s transformed and persisted for downstream use. Along the way, you’ll develop confidence with tools and techniques such as architecture diagrams, unit tests, and monitoring that will help set your data pipelines apart from the rest. As you progress, you’ll put your newfound skills to the test with hands-on exercises.</p>

<p><h2>Supercharge Data Workflows</h2>
After completing this course, you’ll be ready to design, develop, and use data pipelines to supercharge your data workflow in your job, new career, or personal project.</p>


Data Warehousing Concepts

Streamlined Data Ingestion with pandas

Learn to build effective, performant, and reliable data pipelines using Extract, Transform, and Load principles.

ETL and ELT in Python

Data Engineer in Python

Machine Learning Engineer

Chapter 1: Introduction to Data Pipelines

Chapter 2: Building ETL Pipelines

Chapter 3: Advanced ETL Techniques

Chapter 4: Deploying and Maintaining a Data Pipeline
