Examples of the modern data architecture
1. Examples of the modern data architecture
Now that we understand the basics of modern data architectures, it's a great time to review some well-known architectures that can be used to implement modern data solutions.
2. Lambda architecture
Let's start by understanding what Lambda architecture is and how it fits into modern data architectures. Lambda is a data processing architecture designed to handle large-scale, real-time data. It combines batch and stream processing to provide a comprehensive and reliable solution. Lambda consists of three main layers: the batch layer, the speed layer, and the serving layer.
3. Lambda architecture layers: Batch
The batch layer manages the master dataset, which is immutable and append-only: new records are added as data keeps coming, and existing records are never changed. To provide value to the company, this layer has to precompute a set of views that give the raw data some meaning. For instance, if our data is related to bank accounts, the master dataset will contain the whole list of transactions, and a precomputed view with all the transactions already applied will give us precise information about the balance at a given point in time. It's important to know that this layer aims for perfect accuracy.
4. Lambda architecture layers: Speed
The batch approach sounds good so far. However, batch jobs usually run on daily schedules or, at best, hourly or every few minutes. Thus, there will be a gap between the data known by the batch layer and the current reality. The speed layer aims to close that gap. This layer will not be as accurate as the batch layer, as its goal is to let the business use the most recent data. Eventually, the batch layer will overwrite any imprecisions introduced by this layer.
5. Lambda architecture layers: Serving
Finally, there's the serving layer. This layer merges the batch and real-time views to provide a unified view of the data, supporting querying and analysis without the delay of traditional batch-only approaches.
6. Complexity of Lambda architecture
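To make the three layers concrete, here is a minimal Python sketch built on the bank-account example. All names (`Transaction`, `batch_view`, `speed_view`, `serve_balance`, `last_batch_run`) are hypothetical, and an in-memory list stands in for real storage and streaming systems. For simplicity the speed view here is exact, whereas a real speed layer may trade some accuracy for latency.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: records in the master dataset are immutable
class Transaction:
    account: str
    amount_cents: int  # positive = deposit, negative = withdrawal
    ts: int            # simplified logical timestamp


def batch_view(master, cutoff):
    """Batch layer: recompute balances from all transactions up to the cutoff."""
    balances = defaultdict(int)
    for tx in master:
        if tx.ts <= cutoff:
            balances[tx.account] += tx.amount_cents
    return dict(balances)


def speed_view(events, cutoff):
    """Speed layer: balance changes from events the batch run has not seen yet."""
    deltas = defaultdict(int)
    for tx in events:
        if tx.ts > cutoff:
            deltas[tx.account] += tx.amount_cents
    return dict(deltas)


def serve_balance(account, batch, speed):
    """Serving layer: merge the batch view with the real-time view."""
    return batch.get(account, 0) + speed.get(account, 0)


# Append-only master dataset (assumed sample data).
master = [
    Transaction("acc-1", 10_000, ts=1),
    Transaction("acc-1", -2_500, ts=2),
    Transaction("acc-2", 4_000, ts=3),
    Transaction("acc-1", 700, ts=4),
]

last_batch_run = 2  # the batch job has only processed events up to ts=2
batch = batch_view(master, last_batch_run)
speed = speed_view(master, last_batch_run)

print(serve_balance("acc-1", batch, speed))  # 7500 from batch + 700 from speed = 8200
print(serve_balance("acc-2", batch, speed))  # only known to the speed layer so far: 4000
```

Notice that the balance logic is written twice, once in `batch_view` and once in `speed_view`. In a real system those two copies would typically live in different technology stacks.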
Even though Lambda is a great alternative in modern data solutions, dealing with two layers that process the same data can be really complicated. It adds complexity around the systems to be maintained, the code bases, and the algorithms, because we need to make them work in at least two stacks to support both the batch and streaming layers.
7. Kappa architecture
This is where Kappa architecture comes into play. It is a simplified version of the Lambda architecture that eliminates the need for a separate batch layer. However, it's not as simple as saying we do not need a batch layer anymore. Kappa deals with it by treating all the data as a single data stream, so we can ingest and process all our data with the same streaming stack. Still, if we need to perform batch analysis or backfilling, we will depend on reprocessing the whole source of events again.
8. Lambda vs. Kappa
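Before comparing the two, here is a minimal Python sketch of the Kappa idea, under the assumption that a plain list can stand in for a durable, replayable event log. The names `process`, `replay`, and `event_log` are hypothetical. The point it illustrates is that a single piece of stream logic handles both live events and backfills, and that a backfill means reading the log again from the start.

```python
from collections import defaultdict


def process(event, balances):
    """The single piece of stream logic, used for live events and replays alike."""
    account, amount_cents = event
    balances[account] += amount_cents


def replay(log, from_offset=0):
    """Backfill: rebuild the view by re-reading the log from the given offset."""
    balances = defaultdict(int)
    for event in log[from_offset:]:
        process(event, balances)
    return balances


# Assumed sample data: an append-only log of (account, amount in cents) events.
event_log = [("acc-1", 10_000), ("acc-1", -2_500), ("acc-2", 4_000)]

# Build the view once, then keep it current as new events arrive.
live = replay(event_log)
new_event = ("acc-1", 700)
event_log.append(new_event)
process(new_event, live)

# Reprocessing the whole log with the same code yields the same view.
rebuilt = replay(event_log)
print(dict(rebuilt))  # {'acc-1': 8200, 'acc-2': 4000}
```

There is only one code path to maintain here, but the cost of a backfill grows with the length of the log, which is the trade-off to keep in mind.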
Let's summarize the key differences between both architectures. Lambda uses two processing layers: batch and stream. Kappa, on the other hand, eliminates the need for a separate batch layer by treating all data as a single data stream. Lambda is still useful, as it gives us more flexibility when ingesting the data and can produce more complex outputs via the batch layer. However, the overall complexity of maintaining the system is greater than with Kappa.
9. Let's practice!
Now, let's review more about these two architectures!