1. Additional operators
Now that we've used the BashOperator and worked with tasks, let's take a look at some more common operators available within Airflow.
2. PythonOperator
The PythonOperator is similar to the BashOperator, except that it runs a Python function or callable method.
Much like the BashOperator, it requires a task_id, a dag entry, and most importantly a python_callable argument set to the function in question (the function object itself, not a call to it).
You can also pass arguments or keyword style arguments into the Python callable as needed.
Our first example shows a simple printme function that writes a message to the task logs. We must first import the PythonOperator from the airflow.operators.python module. Afterwards, we create our function printme, which will write a quick log message when run. Once defined, we create the PythonOperator instance called python_task and add the necessary arguments.
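The example just described might look roughly like the sketch below. The DAG id, start date, task_id value, and log message are placeholders chosen for illustration, not values from the course. The Airflow imports are guarded only so that the function can be tried on a machine without Airflow installed; a real DAG file would normally import them directly.

```python
from datetime import datetime


def printme():
    # Anything the callable prints ends up in the task log.
    print("This goes in the logs!")


try:
    from airflow import DAG
    from airflow.operators.python import PythonOperator
except ImportError:
    # Airflow is not installed here, so there is no task to build.
    python_task = None
else:
    # Hypothetical DAG; the id and start date are placeholders.
    example_dag = DAG(dag_id="printme_example", start_date=datetime(2024, 1, 1))

    python_task = PythonOperator(
        task_id="simple_print",
        python_callable=printme,  # the function itself, not printme()
        dag=example_dag,
    )
```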
3. Arguments
The PythonOperator supports adding arguments to a given task. These arguments are handed on to the Python function assigned to python_callable when the task runs.
The PythonOperator supports both positional arguments (op_args) and keyword style arguments (op_kwargs) as options to the task. For this course, we'll focus on using keyword arguments only for the sake of clarity.
To implement keyword arguments with the PythonOperator, we define an argument on the task called op_kwargs. This is a dictionary consisting of the named arguments for the intended Python function.
4. op_kwargs example
Let's create a new function called sleep, which takes a length_of_time argument. It uses this argument to call the time.sleep function.
Once defined, we create a new task called sleep_task, with the task_id, dag, and python_callable arguments added as before. This time we'll add our op_kwargs dictionary with the length_of_time key and the value of 5.
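As a rough sketch of that task definition: the DAG id, start date, and task_id value below are assumptions for illustration. As before, the Airflow imports are guarded only so the function can be run without Airflow installed.

```python
import time
from datetime import datetime


def sleep(length_of_time):
    # Pause for the requested number of seconds.
    time.sleep(length_of_time)


try:
    from airflow import DAG
    from airflow.operators.python import PythonOperator
except ImportError:
    # Airflow is not installed here, so there is no task to build.
    sleep_task = None
else:
    # Hypothetical DAG; the id and start date are placeholders.
    example_dag = DAG(dag_id="sleep_example", start_date=datetime(2024, 1, 1))

    sleep_task = PythonOperator(
        task_id="sleep",
        python_callable=sleep,
        # The key matches the parameter name of sleep().
        op_kwargs={"length_of_time": 5},
        dag=example_dag,
    )
```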
Note that the dictionary key must match the name of the function argument. If the dictionary contains an unexpected key, it will be passed to the Python function and typically cause an unexpected keyword argument error.
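To see why the key name matters, here is a plain Python illustration of keyword unpacking with a matching and a non-matching key. It calls the function directly rather than through Airflow, so it only shows the underlying Python behavior; how the operator itself treats extra keys may differ between Airflow versions. The duration key is a made-up example of a wrong name.

```python
import time


def sleep(length_of_time):
    # Pause for the requested number of seconds.
    time.sleep(length_of_time)


good_kwargs = {"length_of_time": 0}  # key matches the parameter name
bad_kwargs = {"duration": 0}         # hypothetical key the function does not accept

# Equivalent to sleep(length_of_time=0); runs normally.
sleep(**good_kwargs)

error_message = None
try:
    # Equivalent to sleep(duration=0); Python rejects the unknown keyword.
    sleep(**bad_kwargs)
except TypeError as err:
    error_message = str(err)
    print(error_message)
```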
5. EmailOperator
There are many other operators available within the Airflow ecosystem. The primary operators are in the airflow.operators or airflow.contrib.operators libraries (in Airflow 2 and later, much of what lived under airflow.contrib has moved into separate provider packages).
Another useful operator is the EmailOperator, which as expected sends an email from within an Airflow task. It can contain the typical components of an email, including HTML content and attachments.
Note that the Airflow system must be configured with the email server details to successfully send a message.
6. EmailOperator example
A quick example for sending an email would be sending a generated sales report upon completion of a workflow.
We first must import the EmailOperator object from airflow.operators.email.
We can then create our EmailOperator instance with the task_id, the to, subject, and html_content fields, and a list of any files to attach. Note that in this case we assume the file latest_sales.xlsx was previously generated; later in the course we'll see how to verify that first.
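A sketch of what that EmailOperator task might look like follows. The recipient address, subject line, message body, DAG id, and task_id value are all placeholders, and the example assumes the email server settings are already configured in Airflow. The imports are guarded only so the snippet can be loaded without Airflow installed.

```python
from datetime import datetime

# Hypothetical recipient, used only for illustration.
recipient = "manager@example.com"
# File name from the example; assumed to have been generated earlier.
report_path = "latest_sales.xlsx"

try:
    from airflow import DAG
    from airflow.operators.email import EmailOperator
except ImportError:
    # Airflow is not installed here, so there is no task to build.
    email_task = None
else:
    # Hypothetical DAG; the id and start date are placeholders.
    example_dag = DAG(dag_id="email_example", start_date=datetime(2024, 1, 1))

    email_task = EmailOperator(
        task_id="email_sales_report",
        to=recipient,
        subject="Automated Sales Report",
        html_content="Attached is the latest sales report",
        files=[report_path],  # list of file paths to attach
        dag=example_dag,
    )
```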
7. Let's practice!
We've looked at a couple of new Airflow operators - let's practice using them in the exercises ahead.