Get startedGet started for free

Branching

1. Branching

Nice work so far - we're nearing the end of our introduction to Airflow. Let's take a look at our last new concept, branching.

2. Branching

Branching provides the ability for conditional logic within Airflow. Basically, this means that tasks can be selectively executed or skipped depending on the result of an Operator. By default, we're using the BranchPythonOperator. There are other branching operators available and as with everything else in Airflow, you can write your own if needed. This is however outside the scope of this course. We import the BranchPythonOperator from airflow dot operators dot python. The BranchPythonOperator works by running a function (the python underscore callable as with the normal PythonOperator). The function returns the name (or names) of the task_ids to run. This is best seen with an example.

3. Branching example

For our branching example, let's assume we've already defined our DAG and imported all the necessary libraries. Our first task is to create the function used with the python_callable for the BranchPythonOperator. You'll note that the asterisk asterisk kwargs argument is the only component passed in. Remember that this is a reference to a keyword dictionary passed into the function. In the function we first access the ds underscore nodash key from the kwargs dictionary. If you remember from our previous lesson, this is the template variable used to return the date in YYYYMMDD format. We take this value, convert it to an integer, and then run a check if modulus 2 equals 0. Basically, we're checking if a number is fully divisible by 2. If so, it's even, otherwise, it's odd. As such, we return either even underscore day underscore task, or odd underscore day underscore task.

4. Branching example

The next part of our code creates the BranchPythonOperator. This is like the normal PythonOperator, except we pass in the provide underscore context argument and set it to True. This is the component that tells Airflow to provide access to the runtime variables and macros to the function. This is what gets referenced via the kwargs dictionary object in the function definition. Now we don't show the code here, but let's assume we've created two tasks for even days, and two tasks for odd numbered days. We need to set the dependencies using the bitshift operators. First, we configure the dependency order for start task, branch task, then even day task and even day task2. Now we need to set the dependency order for the odd day tasks. As we've already defined the dependency for the start and branch tasks, we can set odd day task to follow the branch task, and the odd day task2 to follow that. You may be wondering why you'd set these dependencies if one set is not going to run. If you didn't set these task dependencies, all the tasks would run as normal, regardless of what the branch operator returned.

5. Branching graph view

Let's look at the DAG in the graph view of the Airflow UI. You'll notice that we have a start task upstream of the branch task. The branch task then shows two paths, one to the odd day tasks, and the other to the even day tasks.

6. Branching even days

Let's look first at what happens if we run on an even numbered day. The start task executes as normal, then the branch task checks the ds_nodash value and determines this is an even day. It returns the value even underscore day underscore task, which is then executed by Airflow followed by the even day task2. Note that the odd day tasks are marked differently, which refers to them being skipped.

7. Branching odd days

For completeness, let's look at the output from a run on an odd day. The process is the same, except that the branch task selects odd day task instead and the even branch is marked skipped.

8. Let's practice!

We're nearly to the end of our course - let's practice working with branches now.