
Creating a production pipeline #1

You've learned a lot about how Airflow works. Now it's time to implement your workflow as a production pipeline built from many objects, including sensors and operators. Your boss wants this workflow automated and able to provide SLA reporting, since it gives the sales staff extra leverage in closing a deal they're working on: the prospect has indicated that once they see updates arriving in an automated fashion, they're willing to sign up for the data service.

From what you've learned about the process, you know that sales data will be uploaded to the system. Once the upload finishes, a new file should be created to kick off the full processing run, but something isn't working correctly.

Refer to the source code of the DAG to determine if anything extra needs to be added.

This exercise is part of the course Introduction to Apache Airflow in Python.

Exercise instructions

  • Update the DAG in pipeline.py to import the needed operators (a sketch of what this might look like follows this list).
  • Run the sense_file task from the command line and look for any errors. Use the command airflow tasks test <dag_id> <task_id> <date> with the appropriate arguments; for the last argument, pass -1 instead of a specific date.
  • Determine why the sense_file task does not complete and remedy this using the editor. Make sure to scroll through the terminal output to find any ERROR messages highlighted in red.
  • Re-test the sense_file task and verify the problem is fixed.
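
Below is a minimal sketch of what pipeline.py might look like once the missing imports are in place. The DAG id, file path, schedule, and bash command are placeholders for illustration, not the exercise's actual values; the point is that FileSensor and any operators the DAG uses must be imported before the tasks are defined.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator
    # In older Airflow 1.x releases the sensor is imported from
    # airflow.contrib.sensors.file_sensor instead.
    from airflow.sensors.filesystem import FileSensor

    with DAG(
        dag_id="etl_update",                # placeholder DAG id
        start_date=datetime(2023, 1, 1),
        schedule_interval="@daily",
    ) as dag:

        # Wait for the trigger file that signals the sales data upload has finished.
        sense_file = FileSensor(
            task_id="sense_file",
            filepath="startprocess.txt",    # placeholder path
            poke_interval=45,
        )

        # Kick off the full processing run once the trigger file exists.
        process_data = BashOperator(
            task_id="process_data",
            bash_command="echo 'processing sales data'",  # placeholder command
        )

        sense_file >> process_data

Once the imports are added, re-run airflow tasks test <dag_id> sense_file -1 from the terminal. If the sensor still doesn't complete, check that the file it is waiting for actually exists at the path the DAG specifies.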
