Running SQL workloads
1. Running SQL workloads
In this chapter, we'll put everything together by building production-ready SQL pipelines. Let's start with Airflow's SQL operators.2. Why SQL in Airflow?
SQL workloads are the most common Airflow use case. We can use them to aggregate sales, join records, or load warehouse tables. Instead of pulling data into Python, transforming it, and pushing it back, let the database do the heavy lifting. Airflow orchestrates when queries run, in what order, and on which connection. The database handles the actual computation.3. SQLExecuteQueryOperator
The SQLExecuteQueryOperator is Airflow's universal SQL runner. We import it from the common SQL provider package. It works with any database that has a compatible Airflow provider: PostgreSQL, Snowflake, BigQuery, DuckDB, and dozens more. The operator takes two key parameters: sql, with the query to execute, and conn_id, with the name of the database connection.4. Connections
Connections store database credentials outside your Dag code. Each connection has an ID, a type, and the details needed to connect: host, port, login, and password. Connections can be created in different ways: through the Airflow UI, the CLI, the API, or even as environment variables. In your code, just reference the conn_id string. The operator looks up the connection at runtime and manages the session. Credentials never appear in your Dag files.5. Building a SQL pipeline
Let's walk through the code for our first SQL pipeline. We create a SQLExecuteQueryOperator with the task_id aggregate_daily_sales, pointing to our DuckDB connection. The inline SQL is an INSERT INTO SELECT that aggregates raw orders by date into a daily_summary table, counting orders and summing revenue. We can then read the results back with a second operator, and a @task function prints them so we can verify the output.6. External SQL files
Inline SQL works for short queries, but as queries grow, mixing SQL with Python makes code reviews harder. The standard practice is to store SQL in separate files. On the right, you can see a dedicated SQL folder under include that holds all our queries, including aggregate_sales. On the left, the pipeline simply references that filename. To make this work, set template_searchpath on the dag decorator to the SQL folder path, then set sql to just the filename. Airflow finds the file, renders any Jinja templates inside, and executes the result. This separates business logic from orchestration logic, keeps the dags folder lightweight, and makes SQL reviewable and version-controlled.7. Jinja templates in SQL files
Inside SQL files, we use the same Jinja templates we already know. Double curly braces ds render the logical date in the YYYY-MM-DD format. Here we DELETE existing rows for a date first, then INSERT fresh ones. This is the delete-then-insert pattern from Chapter 2, now applied to SQL files. If the pipeline re-runs for the same date, the delete clears old results and the insert writes them fresh. This ensures the same input produces the same output every time.8. params vs parameters
SQLExecuteQueryOperator has two similarly named options that work very differently. On the left, params uses Jinja rendering. The value gets interpolated directly into the SQL string before execution. If user_input contains malicious SQL, it runs as part of your query. On the right, parameters use database-level binding. Values are passed to the database driver separately, which handles escaping automatically. For untrusted input, always use parameters.9. From local to production with Astro
As your SQL pipelines mature, you need a path from development to production. Astronomer's Astro platform is a managed Airflow environment built for exactly that. It comes with production features like auto-scaling, disaster recovery, remote execution, and high availability. The Astro CLI lets you spin up a complete local Airflow environment with one command, test your Dags, and ship them to Astro just as easily.10. Let's practice!
Let's write some SQL pipelines.Create Your Free Account
or
By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.