Get startedGet started for free

Timer API

1. Timer API

person: In addition to the state API, you can also use timers in your stateful transformations. The combination of both state and timers enables you to get rich and complex stateful transformations. Let's see how state and timers work together. Timers are used in combination with state variables to ensure that the state is clear at regular intervals of time. As with the case of windows and triggers, we can define timers either in event time or processing time. Event time timers depend on the watermark value. Processing time timers depend on the clock of the workers, and not on any timestamp included in the messages that are being processed. With a timer, we can solve the problem of a state being kept indefinitely. When the DoFn is processing the last messages of the last bundle, it is likely that the last buffer will not reach the threshold value set in the code. Also, if messages are coming in slowly, it may take a long time for the buffer to fill up. In both situations, you probably will want to produce some results rather than wait for a long time until more messages show up. Timers are useful for that. Let's revisit our example, but now let's add a timer to avoid having the state wait indefinitely. This example is in Python. The logic for the state is the same as in the previous examples, but now the DoFn has also an event time timer. When the watermark processes the value of allowed lateness, the timer expires. Then the state will be processed and clear, even if the buffer has not reached the limit size yet. By using a timer, you ensure that your state is not kept indefinitely and that the DoFn will finish even if no new messages are coming. Without the timer, the DoFn will have to be waiting indefinitely until the size of the buffer reaches the limit. Note that you need to have expiry method and annotated with on timer. This is the method that is called when the timer expires. This is the same example, but in Java. You need to use the timer ID annotation to create a timer, and the on timer annotation for the method that is called when the timer expires. The logic is the same as in the Python example. The timer expiry method ensures that the bundles are processed, even if the count does not reach the minimum count size or if the messages are coming in very slowly. In summary, this is that DoFn implemented with the examples. In addition to the two state variables, the DoFn has now a timer, which is called when the watermark goes over a certain value of allowed lateness. This is the typical pattern combining a state variable and timers. Remember that you have two options for timers: event time and processing time. You should use event time timers when you want to do the callback based on the watermark and the timestamps of your data, that is, when data is a state. Event time timers will be influenced by the rate of progress of the input data. Processing time timers will expire regardless of the progress of your data. The timer will trigger at regular intervals. Either event or processing time timers can be used for the example shown in the previous slides. You need to decide whether you want to fire always at regular intervals or depending on the pace of progress of your data and use the timers accordingly.

2. Let's practice!

Create Your Free Account

or

By continuing, you accept our Terms of Use, our Privacy Policy and that your data is stored in the USA.