Course Introduction

1. Course Introduction

Hi, and welcome to the first installment of the Serverless Data Processing with Dataflow series, Dataflow Foundations. My name is Mehran Nazir, and I am a product manager with Dataflow.

The Serverless Data Processing with Dataflow course series builds on the concepts covered in the Data Engineering specialization. We introduced core Dataflow principles when exploring how to build batch data pipelines on Google Cloud. We also covered basic streaming concepts like windowing, triggers, and watermarks while learning how to build resilient streaming systems using Dataflow. This course series expands on those concepts with three additional courses: Foundations, which will cover the fundamentals of the Apache Beam and Dataflow model; Developing Pipelines, which will provide a comprehensive review of the Apache Beam SDK; and Operations, which will equip learners with the tools to run their Dataflow pipelines at scale. In this course, we will do a deep dive on Foundations.

Let’s review the outline for the Dataflow Foundations course. First, we will do a quick refresh on the Apache Beam programming model and Google’s Dataflow managed service. Next, we will learn about the Beam Portability Framework, which allows users to write pipelines in their preferred programming language and run them on their desired execution engine. In the next module, we will learn about Dataflow’s premium backends that separate compute and storage for maximum performance. We will then explore how IAM, quotas, and permissions work together to enable Dataflow pipelines. Finally, we will review the main security features that are available with Dataflow and how to implement them. To conclude, we will summarize the main concepts covered in the Foundations course.

2. Let's practice!
