Creating a production pipeline #1
You've learned a lot about how Airflow works - now it's time to implement your workflow as a production pipeline consisting of many objects, including sensors and operators. Your boss is interested in seeing this workflow become automated and able to provide SLA reporting, as it provides some extra leverage for closing a deal the sales staff is working on. The sales prospect has indicated that once they see updates in an automated fashion, they're willing to sign up for the indicated data service.
From what you've learned about the process, you know that there is sales data that will be uploaded to the system. Once the data is uploaded, a new file should be created to kick off the full processing, but something isn't working correctly.
Refer to the source code of the DAG to determine if anything extra needs to be added.
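To help picture the moving pieces, here is a minimal sketch of what such a DAG could look like, assuming Airflow 2.x-style imports. The dag_id, file path, and downstream task shown here are hypothetical placeholders; the actual pipeline.py in the exercise defines its own values.

from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.sensors.filesystem import FileSensor

# Hypothetical dag_id, schedule, and file path for illustration only.
with DAG(
    dag_id='sales_pipeline_example',
    start_date=datetime(2023, 1, 1),
    schedule_interval=None,
) as dag:

    # Wait for the trigger file that signals the sales data upload has finished.
    sense_file = FileSensor(
        task_id='sense_file',
        filepath='/home/repl/workspace/startprocess.txt',
        poke_interval=45,
    )

    # Placeholder for the real processing step.
    process_data = BashOperator(
        task_id='process_data',
        bash_command='echo "processing sales data"',
    )

    sense_file >> process_data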
Exercise instructions
- Update the DAG in pipeline.py to import the needed operators.
- Run the sense_file task from the command line and look for any errors. Use the command airflow tasks test <dag_id> <task_id> <date> with the appropriate arguments (see the example after this list). For the last argument, use a -1 instead of a specific date.
- Determine why the sense_file task does not complete and remedy this using the editor. Make sure to scroll through the terminal output to find any ERROR messages highlighted in red.
- Re-test the sense_file task and verify the problem is fixed.
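For reference, the test command takes the dag_id, task_id, and a date (here, -1) as positional arguments. With the hypothetical dag_id used in the sketch above, the call would be:

airflow tasks test sales_pipeline_example sense_file -1

A FileSensor keeps poking until the file at its filepath exists, so a common reason for a sensor test that never completes is that the file it watches has not been created yet.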