Troubleshooting workflow
1. Troubleshooting workflow
Omar: Hey there. My name is Omar Ismail, a solutions developer at Google Cloud. In this module, we'll learn how to troubleshoot and debug Dataflow pipelines. We will begin by looking at the general troubleshooting workflow for Dataflow. The general troubleshooting workflow involves two steps: first, checking for errors in the job, and second, looking for anomalies in the Job Metrics tab.

Let us begin by looking at how to check for errors in a job. The most common first step is to open the Dataflow Jobs page and check the status of the job in question. If the job is in the failed state, we can click into the job and dive deeper into the root cause. It is also important to note that not all problematic jobs will be in a failed state. A problematic job may still be in the running state.

In the Job Graph view page, the most common place to look for errors is the error notification above the job graph. If the job failed, you will also see the individual step in the job graph that failed. More detailed error messages can be found by expanding the Log section, as shown below. Click the open icon to view the full logs in Cloud Logging. Cloud Logging provides a simple UI to filter and search the logs for the job.

The next part of this section involves looking for anomalies in the job using the Job Metrics tab. Data freshness and system latency are good indicators of the performance of a streaming Dataflow job. Increasing data freshness indicates that the pipeline workers are unable to keep up with the data being ingested into the pipeline. Increasing system latency indicates that a certain work item within the pipeline is taking a long time to be processed.

For all Dataflow jobs, the CPU utilization graph is a good indicator of the parallelism in a job and can also indicate whether a job is CPU-bound.
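One rough way to read the CPU utilization graph programmatically is sketched below. This is only an illustration, not part of Dataflow: the function names, the per-worker input dictionary, and the 0.7 and 0.1 thresholds are all assumptions chosen for the example. It flags limited parallelism when a small number of workers are busy while most of the others sit near idle, and flags a possibly CPU-bound job when every worker is busy.

```python
from typing import Dict

# Hypothetical thresholds for this sketch; tune them for a real job.
BUSY_THRESHOLD = 0.7   # utilization at or above this counts as "busy"
IDLE_THRESHOLD = 0.1   # utilization at or below this counts as "near idle"


def looks_parallelism_limited(cpu_by_worker: Dict[str, float]) -> bool:
    """Return True when at least one worker is busy and the near-idle
    workers outnumber the busy ones."""
    busy = [u for u in cpu_by_worker.values() if u >= BUSY_THRESHOLD]
    idle = [u for u in cpu_by_worker.values() if u <= IDLE_THRESHOLD]
    return 1 <= len(busy) < len(idle)


def looks_cpu_bound(cpu_by_worker: Dict[str, float]) -> bool:
    """Return True when there is at least one worker and all are busy."""
    return bool(cpu_by_worker) and all(
        u >= BUSY_THRESHOLD for u in cpu_by_worker.values()
    )


if __name__ == "__main__":
    # Made-up utilization samples (fractions between 0 and 1).
    skewed = {"worker-1": 0.95, "worker-2": 0.03, "worker-3": 0.02}
    saturated = {"worker-1": 0.90, "worker-2": 0.88, "worker-3": 0.93}
    print(looks_parallelism_limited(skewed), looks_cpu_bound(skewed))
    print(looks_parallelism_limited(saturated), looks_cpu_bound(saturated))
```

The two checks are kept separate because they point to different follow-ups: skewed utilization suggests looking at how work is distributed, while uniformly high utilization suggests the job needs more compute.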
The latter half of the CPU graph shown here is a good example of limited parallelism in a pipeline, where only one or a few workers have high CPU utilization while the others are close to zero.
2. Let's practice!
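For the error-checking step described earlier, the log search in Cloud Logging can be narrowed with a filter. The sketch below only builds the filter text. The helper name and the job ID in the usage example are hypothetical. The filter assumes that Dataflow job logs are recorded under the `dataflow_step` resource type with a `job_id` label, so verify those names against your own project's logs before relying on them.

```python
# Severity names accepted by the Cloud Logging query language.
_SEVERITIES = (
    "DEFAULT", "DEBUG", "INFO", "NOTICE", "WARNING",
    "ERROR", "CRITICAL", "ALERT", "EMERGENCY",
)


def build_dataflow_log_filter(job_id: str, min_severity: str = "ERROR") -> str:
    """Build a Cloud Logging filter for one Dataflow job (hypothetical helper).

    Matches entries for the given job at or above the given severity.
    """
    severity = min_severity.upper()
    if severity not in _SEVERITIES:
        raise ValueError(f"unknown severity: {min_severity!r}")
    if not job_id or '"' in job_id:
        raise ValueError("job_id must be non-empty and contain no quotes")
    return " AND ".join([
        'resource.type="dataflow_step"',
        f'resource.labels.job_id="{job_id}"',
        f"severity>={severity}",
    ])


if __name__ == "__main__":
    # Placeholder job ID, for illustration only.
    print(build_dataflow_log_filter("my-job-id"))
```

The resulting text can be pasted into the query box of the Logs Explorer, or passed as the filter argument to `gcloud logging read`.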