Kafka was originally built by LinkedIn and later made open source under the Apache Software Foundation.
At a high level, the flow looks like this:
Producers -----> Brokers (Topics/Partitions) -----> Consumer Groups (Consumers)
Producers are client applications that produce (write) data to the system.
Brokers are daemons (background processes) that run on hardware or virtual machines.
Topics are logical collections of messages (e.g., orders, customers, clicks).
A topic can have one or more partitions. These partitions are distributed across the brokers.
Example Scenarios:
Scenario A: 1 Topic (Orders), 2 Brokers, 2 Partitions.
Broker 1: Orders-Partition-0
Broker 2: Orders-Partition-1
Scenario B: 1 Topic, 3 Brokers, 2 Partitions.
Broker 1: Orders-Partition-0
Broker 2: Orders-Partition-1
Broker 3: no partition of this topic
Scenario C: 1 Topic, 1 Broker, 2 Partitions.
Broker 1: Orders-Partition-0 and Orders-Partition-1
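To make the scenarios concrete, here is a minimal sketch of creating the Orders topic with two partitions. It assumes the third-party kafka-python client and a broker reachable at localhost:9092; neither is prescribed by the text above.

```python
# Sketch: create an "orders" topic with 2 partitions, as in the scenarios above.
# Assumes the kafka-python package and a broker at localhost:9092.
from kafka.admin import KafkaAdminClient, NewTopic

admin = KafkaAdminClient(bootstrap_servers="localhost:9092")
admin.create_topics([NewTopic(name="orders", num_partitions=2, replication_factor=1)])
admin.close()
```

How those two partitions are spread across the available brokers is then up to the cluster, as Scenarios A through C illustrate.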
Important Note: Every message is written to a specific partition of a TOPIC. Each message gets an Offset ID that is unique within its partition. Kafka guarantees ordering only within a partition, not across the entire topic.
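As an illustration of that note, the sketch below (same assumptions: kafka-python, a local broker, and a made-up key) sends two messages and prints the partition and offset the broker assigned to each. With the default partitioner, messages that share a key hash to the same partition, which is how related messages keep their relative order.

```python
# Sketch: write two messages and inspect which partition and offset they landed on.
# Assumes kafka-python and a broker at localhost:9092; topic and key names are made up.
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="localhost:9092")

for payload in [b"order #1", b"order #2"]:
    # Same key -> same partition with the default partitioner, so order is preserved.
    future = producer.send("orders", key=b"customer-42", value=payload)
    metadata = future.get(timeout=10)           # block until the broker responds
    print(metadata.partition, metadata.offset)  # offset is unique within that partition

producer.flush()
producer.close()
```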
Consumers are the applications reading data from Kafka.
The Scaling Rule:
If a consumer group has more consumers than the topic has partitions, the extra consumers sit idle.
Offset Management:
Consumers track the last message they read. If a consumer crashes, the group rebalances. A new consumer picks up the partition and resumes from the last committed Offset (the unique ID mentioned earlier).
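A rough consumer-side sketch of that flow, under the same assumptions (kafka-python, a local broker) and with a hypothetical group name: committing offsets is what lets a replacement consumer resume exactly where the failed one stopped.

```python
# Sketch: a consumer in the "order-processors" group with manual offset commits.
# Assumes kafka-python and a broker at localhost:9092; the group name is made up.
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "orders",
    group_id="order-processors",       # consumers sharing this id split the partitions
    bootstrap_servers="localhost:9092",
    enable_auto_commit=False,          # commit explicitly, only after processing succeeds
    auto_offset_reset="earliest",      # where to start if the group has no committed offset
)

for message in consumer:
    print(message.partition, message.offset, message.value)
    consumer.commit()                  # after a rebalance, the new owner resumes from here
```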
Understanding the "happy path" is great, but interviews often focus on failure.
Consumer Failure:
Straightforward. The new consumer looks up the last committed offset and continues reading.
Broker Failure:
Since messages are stored on disk inside the broker, what happens if the broker dies? Do we lose data?
No. This is where the Replication Factor comes in.
Kafka replicates partitions across different brokers.
Orders Topic, 2 Brokers, Replication Factor of 2:
Broker 1: Partition 0 (Leader), Partition 1 (Replica)
Broker 2: Partition 1 (Leader), Partition 0 (Replica)
If Broker 2 crashes, Broker 1 typically takes over as the Leader for Partition 1.
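A sketch of this layout, under the same assumptions as before plus a cluster of at least two brokers: the topic is created with a replication factor of 2, and describing it reports each partition's leader broker, its replica set, and its in-sync replicas.

```python
# Sketch: create a replicated topic and inspect its partition leadership.
# Assumes kafka-python and a cluster with at least 2 brokers behind localhost:9092.
from kafka.admin import KafkaAdminClient, NewTopic

admin = KafkaAdminClient(bootstrap_servers="localhost:9092")
admin.create_topics([NewTopic(name="orders", num_partitions=2, replication_factor=2)])

# Each partition entry lists its leader broker, replica brokers, and in-sync replicas.
print(admin.describe_topics(["orders"]))
admin.close()
```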
When does replication happen?
When a message is written to the Leader partition, the follower replicas copy it from the Leader almost immediately.
The producer sends a message to the leader partition. The leader writes it, the followers replicate it, and an acknowledgment (ACK) is sent back to the producer. You can configure how strict this is:
acks=0 (Fire and Forget): Producer sends data and doesn't wait for a response. Fastest, but risk of data loss.
acks=1 (Leader Ack): Producer waits for the Leader to confirm it wrote the message.
acks=all (Strongest): Producer waits for the Leader AND all in-sync replicas to confirm. Safest, but highest latency.
A common misconception is that Kafka deletes messages individually (e.g., as soon as a new message arrives, an old one is deleted). In reality, Kafka keeps messages until a retention limit based on time or total size is reached, regardless of whether they have been consumed.
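Retention is configured per topic. A minimal sketch of setting a time-based retention limit at creation time, with the same assumptions as before and an arbitrary 7-day value:

```python
# Sketch: give a topic a time-based retention policy at creation time.
# Assumes kafka-python and a broker at localhost:9092; 7 days is an arbitrary example.
from kafka.admin import KafkaAdminClient, NewTopic

admin = KafkaAdminClient(bootstrap_servers="localhost:9092")
admin.create_topics([
    NewTopic(
        name="orders",
        num_partitions=2,
        replication_factor=1,
        topic_configs={"retention.ms": str(7 * 24 * 60 * 60 * 1000)},  # drop data older than ~7 days
    )
])
admin.close()
```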
How do followers stay in sync with the leader?
Each follower continuously sends a FetchRequest to the leader; it does two things: it pulls any new messages the follower is missing, and it tells the leader (via the offset it asks for) how far the follower has already caught up. The leader uses that information to track which replicas are in sync, and therefore when it is safe to acknowledge a write made with the strictest setting (acks=all).
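On the producer side, this durability trade-off is a single configuration value. A minimal sketch with the same assumptions as the earlier producer examples:

```python
# Sketch: a producer that waits for the leader and all in-sync replicas to confirm each write.
# Assumes kafka-python and a broker at localhost:9092.
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    acks="all",   # use 0 for fire-and-forget, 1 for leader-only acknowledgment
    retries=3,    # retry transient failures instead of silently dropping the message
)

producer.send("orders", b"order #3").get(timeout=10)  # raises if the write is not acknowledged
producer.close()
```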