Dataflow Streaming Engine
1. Dataflow Streaming Engine
person: In this video, we look at the Dataflow Streaming Engine. Just like the Shuffle component in batch, Streaming Engine offloads window state storage from the persistent disks attached to worker VMs to a back-end service. It also implements an efficient shuffle for streaming cases. Luckily, no code changes are required. Worker nodes continue running your user code, which implements the data transforms, and transparently communicate with Streaming Engine to source state. With Streaming Engine, you will see a reduction in the CPU, memory, and persistent disk storage consumed on the worker VMs. Streaming Engine works best with smaller worker machine types like n1-standard-2, and does not require persistent disks beyond a small worker boot disk. This leads to lower resource and quota consumption. With Streaming Engine, your pipeline will also be more responsive to variations in incoming data volume. Finally, you will have improved supportability, since you don't need to redeploy your pipelines to apply service updates. To activate Dataflow Streaming Engine, see the official Dataflow documentation.
2. Let's practice!
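As a rough illustration of the activation step mentioned in the transcript, here is a minimal sketch that assembles the command-line flags a Python pipeline might pass to Dataflow. It assumes the Apache Beam Python SDK flag names (`--enable_streaming_engine`, `--worker_machine_type`, `--streaming`); the helper `streaming_engine_args` and the project and region values are hypothetical and only for illustration. The official documentation remains the authority on which flags your SDK version needs.

```python
def streaming_engine_args(project, region, machine_type="n1-standard-2"):
    """Build a list of pipeline flags that request Streaming Engine.

    Assumption: flag names follow the Apache Beam Python SDK. This helper
    is hypothetical; it only builds strings and does not launch a job.
    """
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        "--streaming",                            # run as a streaming pipeline
        "--enable_streaming_engine",              # offload state and shuffle to the service
        f"--worker_machine_type={machine_type}",  # smaller workers, as the transcript suggests
    ]


if __name__ == "__main__":
    # "my-project" and "us-central1" are placeholder values.
    print(" ".join(streaming_engine_args("my-project", "us-central1")))
```

These flags would typically be appended to the command that launches your pipeline script. The pipeline code itself stays unchanged, which matches the point above that no code changes are required.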