1. Intro to real-time streaming
Welcome to the third chapter of our discussion on streaming systems. In this lesson, we'll cover the idea of real-time data processes and how it relates to streaming systems. Let's start by defining what these terms mean.
2. What is 'real-time'?
When most people hear the term real-time, they tend to think of something as happening immediately. The actual meaning depends a bit on the context in which it's being used, and the process it's being applied to.
Typically, real-time refers to a defined response timeframe, and that timeframe depends on the type of process involved.
The response timeframe is treated as a kind of guarantee, meaning that the system is designed to return results within the specified timeframe.
This timeframe could be anything depending on the process. It could be
one day,
one hour,
one minute, or even shorter.
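To make the idea of a response timeframe as a guarantee more concrete, here is a minimal sketch in Python. The function name run_with_deadline, the sample task, and the 60-second timeframe are all hypothetical choices for illustration; the point is only that "real-time" here means "finished within the agreed timeframe", whatever that timeframe is.

```python
import time


def run_with_deadline(task, deadline_seconds):
    """Run task() and report whether it finished within the agreed timeframe.

    This only measures and reports; it does not interrupt a slow task.
    """
    start = time.monotonic()
    result = task()
    elapsed = time.monotonic() - start
    return result, elapsed, elapsed <= deadline_seconds


def quick_task():
    # Stand-in for some real unit of work (hypothetical).
    return sum(range(1000))


# A one-minute response timeframe, chosen purely for illustration.
result, elapsed, met_deadline = run_with_deadline(quick_task, deadline_seconds=60.0)
print(f"result={result}, met deadline: {met_deadline}")
```

The same check works whether the timeframe is one day, one hour, or one minute; only the deadline_seconds value changes.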
3. Real world example
Let's consider a real-world example to solidify some of these ideas. As in our earlier discussions, let's take a look at a typical post office.
Post offices tend to offer multiple classes of service, such as first-class mail, priority, or express mail. The offerings vary based on locality, but typically are defined by how quickly the item is expected to arrive.
The faster the service, the less capacity there often is to deliver items within that timeframe. For example, suppose you had 100 slots for next-day service and 500 for 5-day service. What would happen if you received 250 items for next-day service?
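The next-day question can be worked through with a quick sketch. The slot and item counts come from the example above; the function name allocate is hypothetical, and what happens to the items that don't fit (delayed, rejected, or moved to a slower class) is left open here because it depends on the service.

```python
def allocate(received, capacity):
    """Split incoming items into those that fit the capacity and those that don't."""
    accepted = min(received, capacity)
    overflow = received - accepted
    return accepted, overflow


# 100 next-day slots, 250 next-day items received.
accepted, overflow = allocate(received=250, capacity=100)
print(f"accepted: {accepted}, over capacity: {overflow}")
```

With these numbers, 100 items fit within the next-day guarantee and 150 do not, so the guarantee can't be kept for every item unless capacity is added.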
In addition to capacity limits at faster delivery timeframes, the costs also tend to go up as delivery time decreases. This could be due to air vs ground travel, courier based delivery, and so forth.
It's important to note that the service selection is up to the sender and is going to be based on their requirements for speed of delivery, weight, and so forth.
4. Relationship to streaming?
You may be wondering how the idea behind real-time relates to streaming data processes.
Much like the post office in our example, streaming data processes are constrained by our available resources.
These resources define how quickly our data can be transported,
how quickly it can be processed,
and how quickly it can be delivered to the end users.
The resources available to us also help determine how much the process costs.
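One way to picture how transport, processing, and delivery combine is to add them up and compare the total to the target timeframe. This is a simplified sketch: the class name StageLatency and all of the numbers are made up for illustration, and it assumes the three stages run one after another.

```python
from dataclasses import dataclass


@dataclass
class StageLatency:
    """Time spent in each stage, in seconds (illustrative values only)."""
    transport_s: float
    processing_s: float
    delivery_s: float

    def total(self) -> float:
        # Assumes the stages happen sequentially, so their times add up.
        return self.transport_s + self.processing_s + self.delivery_s


pipeline = StageLatency(transport_s=2.0, processing_s=5.0, delivery_s=1.0)
target_s = 10.0  # hypothetical response timeframe

meets_target = pipeline.total() <= target_s
print(f"end-to-end: {pipeline.total()}s, meets target: {meets_target}")
```

If any one stage slows down, the total can exceed the target even when the other stages are fast, which is why all three need to be considered together.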
5. Resources define implementation
The most important similarity is that these limitations help define any implementation of streaming processes.
We must consider the speed of transport,
latency in processing,
delivery timeframes,
and any data storage costs when selecting our implementation.
Of course, the cost of the process is often paramount, but it again depends on the requirements (i.e., why pay more for faster service if the extra speed doesn't matter?).
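The "why pay more" question can be sketched as picking the cheapest option that still meets the required timeframe. The tier names, latencies, and relative costs below are invented for illustration and don't refer to any particular product; the function name cheapest_tier is likewise hypothetical.

```python
# Hypothetical service tiers: faster response costs more.
tiers = [
    {"name": "daily", "latency_s": 86_400, "cost": 1},
    {"name": "hourly", "latency_s": 3_600, "cost": 5},
    {"name": "per-minute", "latency_s": 60, "cost": 20},
]


def cheapest_tier(tiers, required_latency_s):
    """Return the lowest-cost tier that responds within the required timeframe.

    Returns None if no tier is fast enough.
    """
    eligible = [t for t in tiers if t["latency_s"] <= required_latency_s]
    if not eligible:
        return None
    return min(eligible, key=lambda t: t["cost"])


# Results are needed within two hours, so the hourly tier is enough.
choice = cheapest_tier(tiers, required_latency_s=7_200)
print(choice["name"])
```

Here the per-minute tier would also meet the requirement, but it costs more without adding anything the requirement asks for.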
6. Let's practice!
We've had a quick introduction to the concepts of real-time and how it relates to streaming data. Let's test your understanding in the exercises ahead.