Backfilling historical data
Late-arriving data is a fact of life. Your daily_sales_load Dag has been running fine, but the order data for April 20-22 only landed in raw_orders today, leaving three holes in daily_summary. The fix is a backfill. The Dag is already active, so you'll see a scheduled run for today alongside the backfill runs you create.
- Run `airflow backfill create --dag-id daily_sales_load --from-date 2026-04-20 --to-date 2026-04-22 --max-active-runs 1` in the terminal.
- Wait about 30 seconds for the backfill runs to complete.
- Run `airflow dags list-runs daily_sales_load` to see all runs.
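As a rough sanity check before running the backfill, the sketch below lists the dates the range covers and assembles the same command as an argument list. It assumes the Dag runs once per day, so one backfill run is expected per missing date. The helper names `backfill_dates` and `backfill_command` are hypothetical and not part of Airflow.

```python
from datetime import date, timedelta


def backfill_dates(start: date, end: date) -> list[date]:
    """Return every calendar day from start to end, inclusive."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def backfill_command(dag_id: str, start: date, end: date, max_active_runs: int = 1) -> list[str]:
    """Build the argument list for `airflow backfill create` using the flags from the step above."""
    return [
        "airflow", "backfill", "create",
        "--dag-id", dag_id,
        "--from-date", start.isoformat(),
        "--to-date", end.isoformat(),
        "--max-active-runs", str(max_active_runs),
    ]


start, end = date(2026, 4, 20), date(2026, 4, 22)
dates = backfill_dates(start, end)  # assumption: one run per day for a daily Dag
command = backfill_command("daily_sales_load", start, end)

print(len(dates), "daily runs expected:", [d.isoformat() for d in dates])
print(" ".join(command))
```

The argument list could be passed to `subprocess.run(command, check=True)` in an environment where the `airflow` CLI is installed. Here it is only printed, so the sketch runs anywhere.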
Look at the run_id column in the output. What prefix do the backfill runs start with?
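If the list of runs is long, a small helper can tally the prefixes for you. This is a minimal sketch that assumes run IDs follow the common `<type>__<timestamp>` shape. The sample IDs are made up for illustration and should be replaced with the values from your own run_id column.

```python
from collections import Counter


def run_type_prefix(run_id: str) -> str:
    """Return the text before the first double underscore, or the whole ID if there is none."""
    return run_id.split("__", 1)[0]


# Hypothetical sample IDs for illustration only; paste in your own run_id values instead.
sample_run_ids = [
    "scheduled__2026-04-23T00:00:00+00:00",
    "manual__2026-04-23T09:15:00+00:00",
    "scheduled__2026-04-24T00:00:00+00:00",
]

prefix_counts = Counter(run_type_prefix(run_id) for run_id in sample_run_ids)
print(dict(prefix_counts))
```

Run against the real output, the tally makes it easy to separate the runs you created from the scheduled run for today.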
This exercise is part of the course Building Data Pipelines with Airflow.