1. Introduction to Apache Kafka
Hello, and welcome to this introductory course on Apache Kafka. We have a lot to cover, so let's get started!
2. About me
My name is Mike. I'm a data engineering consultant with over 18 years of experience in the data space. I've worked with many tools over my career, such as Apache Spark, Airflow, and of course Kafka.
3. What is Apache Kafka?
Our first question is: what is Apache Kafka?
Kafka is an open-source, distributed, event streaming platform. We'll discuss what each of these terms means shortly.
Kafka is designed to handle large quantities of data and to scale to fit a variety of situations.
4. Event streaming in Kafka
As mentioned, Kafka is an event streaming platform. The meaning of event streaming varies based on context. In Kafka, it means obtaining information from various sources, reliably storing that data, and then providing it to any users or systems needing access.
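To make those three steps concrete, here is a minimal sketch in Python. It is a toy, in-memory model and not the Kafka API: the `EventLog` class, its method names, the source names, and the order payloads are all hypothetical, chosen only to illustrate obtaining events from sources, storing them in order, and providing them to any reader.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class EventLog:
    """Toy in-memory event store (hypothetical; not the Kafka API)."""

    events: list = field(default_factory=list)

    def append(self, source: str, payload: Any) -> int:
        """Store an event and return its position in the log."""
        self.events.append({"source": source, "payload": payload})
        return len(self.events) - 1

    def read_from(self, position: int) -> list:
        """Return every stored event at or after the given position."""
        return self.events[position:]


log = EventLog()

# Step 1: obtain information from various sources.
# Step 2: store it (here, simply appended to a list in arrival order).
log.append("web-shop", {"order_id": 1, "status": "new"})
log.append("warehouse", {"order_id": 1, "status": "packed"})

# Step 3: provide it to any reader that needs access.
dashboard_view = log.read_from(0)  # reads everything
audit_view = log.read_from(1)      # starts from the second event
```

Reading does not remove anything from the log in this sketch, so the two readers can each look at the same stored events without affecting one another.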
5. Common uses
Kafka has many uses and can fit into many different scenarios. One is ecommerce, where Kafka allows different systems to monitor sales data at speeds up to real time.
Kafka works well for order tracking, maintaining the status of each order as it moves through stages such as new, processing and packing, and shipment and delivery.
Building on order tracking, Kafka is often used by ride-share or food delivery services, adding geographical location information and providing live status information for all drivers, riders, and customers.
Kafka works well for sensor data networks, such as closely monitoring temperature data in a warehouse or various information from the safety systems in a car (or even a roller coaster).
There are certainly many more options. For example, Kafka has found many uses in the cybersecurity realm, including spam tracking, patch monitoring, and more.
6. Kafka components
Let's discuss the main components of Kafka, as we will be working primarily with them throughout this course.
7. Kafka components
The first component in Kafka is the topic. A topic is a collection of messages of a common type stored within Kafka. We'll discuss how topics work in more detail later, but for now, think of a topic as a notepad where Kafka stores the events or messages it receives. There can be any number of topics within a Kafka system.
8. Kafka components
The next component is the Kafka producer, which writes events to various topics. A producer can write to a single topic or to multiple topics.
9. Kafka components
The last major component is the Kafka consumer, which reads information from topics. There can be any number of consumers.
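As a rough illustration of how these three components relate, here is a small Python sketch. It is a toy, in-memory model rather than a real Kafka client: the `TopicStore`, `Producer`, and `Consumer` classes, the `send` and `poll` method names, and the topic names are all hypothetical. It also assumes that a consumer remembers how far it has read, which is a simplification made for this example.

```python
from collections import defaultdict


class TopicStore:
    """Toy stand-in for a Kafka system holding any number of topics."""

    def __init__(self):
        # topic name -> ordered list of messages (the "notepad")
        self.topics = defaultdict(list)


class Producer:
    """Writes messages to one or more topics."""

    def __init__(self, store):
        self.store = store

    def send(self, topic, message):
        self.store.topics[topic].append(message)


class Consumer:
    """Reads messages from a single topic, remembering its position."""

    def __init__(self, store, topic):
        self.store = store
        self.topic = topic
        self.position = 0  # index of the next unread message

    def poll(self):
        messages = self.store.topics[self.topic][self.position:]
        self.position += len(messages)
        return messages


store = TopicStore()

# One producer writing to multiple topics.
producer = Producer(store)
producer.send("orders", "order 1 created")
producer.send("shipments", "order 1 shipped")
producer.send("orders", "order 2 created")

# A consumer reading only the "orders" topic.
orders_consumer = Consumer(store, "orders")
first_poll = orders_consumer.poll()   # both order messages, in order
second_poll = orders_consumer.poll()  # nothing new since the last read
```

The producer and the consumer never talk to each other directly in this sketch. They only share the topic, which is the main idea to take away.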
10. Kafka components
Here we can see one producer writing to topic 1 and the other producer writing to topic 2.
11. Kafka components
We then see two consumers reading from topic 1 and another consumer reading from topic 2.
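The layout just described can also be written down as plain data. The Python sketch below is only a toy model and not Kafka code: the names "producer A", "consumer X", and so on are hypothetical labels, and it assumes that each consumer reads its topic independently, so both consumers of topic 1 see the same messages.

```python
# Each topic is modeled as an ordered list of messages.
topics = {"topic 1": [], "topic 2": []}

# Which topic each producer writes to (hypothetical labels).
producers = {"producer A": "topic 1", "producer B": "topic 2"}

# Which topic each consumer reads from (hypothetical labels).
consumers = {
    "consumer X": "topic 1",
    "consumer Y": "topic 1",
    "consumer Z": "topic 2",
}

# Every producer writes one event to its topic.
for producer, topic in producers.items():
    topics[topic].append(f"event from {producer}")

# Every consumer reads a copy of everything in its topic.
received = {
    consumer: list(topics[topic]) for consumer, topic in consumers.items()
}
```

In this model, `received["consumer X"]` and `received["consumer Y"]` hold the same event from producer A, while `received["consumer Z"]` holds only the event from producer B.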
12. Let's practice!
We'll cover more about these components later in the course, but let's practice what we've learned in the exercises ahead.