1. Horizontally scaling streaming systems
Nice work with the previous exercises! Let's take a look at another option available to improve our system's performance: horizontal scaling.
2. Horizontal scaling refresher
Before diving into streaming systems, let's remind ourselves about horizontal scaling.
While vertical scaling typically means scaling up, horizontal scaling refers to scaling out.
This typically means adding processing capability by adding more systems, rather than making the systems you have faster or better.
This type of scaling works best in embarrassingly parallel situations, meaning tasks that can easily be split among workers. One such example is resizing a group of images that don't depend on one another.
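As a rough illustration (not part of the course code), here is one way such a task could be split among workers in Python. It assumes the Pillow library is installed, and the image paths are purely hypothetical:

```python
from concurrent.futures import ProcessPoolExecutor
from PIL import Image  # assumes the Pillow library is installed


def resize_image(path, size=(128, 128)):
    """Resize a single image and save a thumbnail copy."""
    img = Image.open(path)
    img.thumbnail(size)
    out_path = path.replace(".jpg", "_thumb.jpg")
    img.save(out_path)
    return out_path


if __name__ == "__main__":
    # Hypothetical list of independent images - each resize needs no
    # information from the others, so the work splits cleanly across workers.
    paths = ["photo_1.jpg", "photo_2.jpg", "photo_3.jpg", "photo_4.jpg"]
    with ProcessPoolExecutor(max_workers=4) as pool:
        for thumb in pool.map(resize_image, paths):
            print("wrote", thumb)
```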
3. Horizontal scaling with streaming
Unlike purely batch- or queue-based systems, streaming data processing typically has minimal delays within the pipeline.
This can make transferring data between workers tricky. If we needed to coordinate communication between systems with multiple processing components at the same level, this would likely delay our pipeline. Any time we need to coordinate systems, there is potential state handling that must occur, which adds delays we're trying to avoid. Think back to our previous lessons on guarantees and SLAs.
This means it's usually best to process a full stream within a single pipeline (remember, with streaming systems things work best when we can avoid extra delays).
As such, a great strategy to horizontally scale a streaming process is to create copies of the pipeline. Let's consider how this would work.
4. Pipeline copies
A simplified view of a horizontally scaled streaming system is that as events occur, they initially enter a pipeline or process. Note that an event here means whatever data task we're interested in.
All tasks related to that process are then self-contained within the pipeline until they are completed.
In order to scale our processing, we add more pipelines. Think of adding checkout lanes at a store as a way to improve processing capacity.
There is also nothing to prevent us from vertically scaling within a pipeline. We can still improve that pipeline's processing speed: the faster it completes a task, the sooner it's available for another event.
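To make the idea concrete, here is a minimal sketch of pipeline copies using Python's multiprocessing module. The stage logic is just placeholder code, not the course's implementation; the point is that each copy handles its events end to end, and we scale out by raising the number of copies:

```python
import multiprocessing as mp


def run_pipeline(worker_id, events):
    """One self-contained pipeline copy: every stage runs in this process."""
    while True:
        event = events.get()
        if event is None:  # sentinel: no more events to process
            break
        validated = {"id": event, "ok": True}            # placeholder validate stage
        transformed = {**validated, "value": event * 2}  # placeholder transform stage
        print(f"pipeline {worker_id} finished event {transformed}")  # placeholder output stage


if __name__ == "__main__":
    events = mp.Queue()
    num_copies = 3  # scale out by raising this number
    pipelines = [mp.Process(target=run_pipeline, args=(i, events))
                 for i in range(num_copies)]
    for p in pipelines:
        p.start()
    for event in range(10):
        events.put(event)
    for _ in pipelines:
        events.put(None)  # one sentinel per pipeline copy
    for p in pipelines:
        p.join()
```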
5. Additional considerations
In addition to making copies of our pipelines, there are other components that may be required to make our whole process work.
This can include a load balancer or director component. Think of this as something that determines where to send incoming events, requests, or customers based on some algorithm. It could be a simple round-robin process (one task sent to each worker in turn, like a card dealer), or something more sophisticated, such as routing to the least-busy node.
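As a small illustration, here is what a round-robin director could look like in Python. The pipelines are stood in for by plain lists and the names are hypothetical; a least-busy variant might instead pick the pipeline with the fewest pending events:

```python
import itertools


class RoundRobinDirector:
    """Send each incoming event to the next pipeline in order, like a card dealer."""

    def __init__(self, pipelines):
        self._cycle = itertools.cycle(pipelines)

    def dispatch(self, event):
        pipeline = next(self._cycle)
        pipeline.append(event)  # stand-in for handing the event to that pipeline
        return pipeline


if __name__ == "__main__":
    # Three hypothetical pipelines, represented here as plain lists of events.
    pipelines = [[], [], []]
    director = RoundRobinDirector(pipelines)
    for event in range(7):
        director.dispatch(event)
    print(pipelines)  # [[0, 3, 6], [1, 4], [2, 5]]
```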
We will also eventually hit bottlenecks in our processes regardless of what we do. Consider a logging situation where the data must be written out to disk, especially with limited disk I/O. For assorted technical reasons, we may not be able to handle more than a certain number of events at that point. This limitation may have a workaround, or it may be an inherent limit. The details don't matter as much for our study as being aware that such limits likely exist.
One common method of dealing with bottlenecks is to shorten our streaming pipeline. If we can remove the need to process data immediately and instead hand it off to a batch or queued component, we can often work around some of these limits.
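For example, one simplified way to combine the stream with a batched component is to accept each event immediately but write to disk only in batches. The class and file names below are hypothetical, just to illustrate the idea:

```python
class BufferedLogWriter:
    """Accept events immediately, but write them to disk in batches to ease disk I/O."""

    def __init__(self, path, batch_size=100):
        self.path = path
        self.batch_size = batch_size
        self.buffer = []

    def handle(self, event):
        # The streaming side ends here: accepting the event is fast.
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        # The slow disk write happens in batches, outside the per-event path.
        if not self.buffer:
            return
        with open(self.path, "a") as f:
            f.write("\n".join(self.buffer) + "\n")
        self.buffer.clear()


if __name__ == "__main__":
    writer = BufferedLogWriter("events.log", batch_size=5)
    for i in range(12):
        writer.handle(f"event {i}")
    writer.flush()  # write whatever is left in the buffer
```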
6. Let's practice!
We've learned about horizontally scaling our streaming systems. Now let's practice what you've learned.