
Apache Kafka Knowledge Base

A comprehensive guide to mastering Apache Kafka, from core concepts to production-grade patterns, with Java/Spring Boot examples and interview prep.

What is Apache Kafka?

Apache Kafka is a distributed event streaming platform designed for high-throughput, fault-tolerant, and scalable real-time data pipelines and streaming applications.

Originally developed at LinkedIn and open-sourced in 2011, Kafka is now maintained by the Apache Software Foundation and is the backbone of event-driven architectures at thousands of companies worldwide.


Why Kafka?

Feature          Description
High Throughput  Up to millions of messages/sec per cluster, depending on hardware and message size
Low Latency      As low as single-digit milliseconds end to end
Durability       Persisted to disk, replicated across brokers
Scalability      Horizontally scalable via partitions
Fault Tolerance  Leader election, ISR (in-sync replica) replication
Replayability    Consumers can re-read past messages

How to Use This Knowledge Base

Core Concepts → Start here if you're new to Kafka
Producer → Deep dive into producing messages
Consumer → Deep dive into consuming messages
Advanced Topics → Streams, Connect, exactly-once semantics (EOS), ordering
Interview Prep → Curated Q&A to ace Kafka interviews

Quick-Start with Spring Boot

Add the dependency:

<dependency>
    <groupId>org.springframework.kafka</groupId>
    <artifactId>spring-kafka</artifactId>
</dependency>

Minimal application.yml:

spring:
  kafka:
    bootstrap-servers: localhost:9092
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.apache.kafka.common.serialization.StringSerializer
    consumer:
      group-id: my-group
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      auto-offset-reset: earliest

Send a message:

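import lombok.RequiredArgsConstructor;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
@RequiredArgsConstructor // Lombok generates the constructor that injects kafkaTemplate
public class OrderService {

    private final KafkaTemplate<String, String> kafkaTemplate;

    // Uses orderId as the record key so all events for one order land on the same partition.
    public void publishOrder(String orderId, String payload) {
        kafkaTemplate.send("orders", orderId, payload);
    }
}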

Consume a message:

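import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.support.KafkaHeaders;
import org.springframework.messaging.handler.annotation.Header;
import org.springframework.stereotype.Component;

@Component
public class OrderConsumer {

    // groupId here overrides spring.kafka.consumer.group-id from application.yml.
    // KafkaHeaders.RECEIVED_PARTITION is the spring-kafka 3.x name (RECEIVED_PARTITION_ID in 2.x).
    @KafkaListener(topics = "orders", groupId = "order-group")
    public void consume(String message, @Header(KafkaHeaders.RECEIVED_PARTITION) int partition) {
        System.out.printf("Received from partition %d: %s%n", partition, message);
    }
}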

Prerequisites

  • Java 17+
  • Docker (for local Kafka via docker-compose)
  • Basic understanding of publish-subscribe messaging

Get started

Head to Core Concepts → Kafka Overview to begin your journey.


Interview Questions

Q: When should Kafka be chosen over a traditional message queue?

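A: Choose Kafka when you need high-throughput event streams, several independent consumers of the same data, replay, or long retention; a classic queue is often simpler for point-to-point workflows where each message is handled once and then discarded.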
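
The replay point is easier to see in code. The following is a toy, in-memory sketch of an append-only log for a single partition. It is not Kafka's storage engine, and `ReplayableLog` and its methods are hypothetical names chosen for illustration. The idea it shows is that reading does not remove records, so a consumer can go back to an earlier offset and read again.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy append-only log for one partition; an illustration, not Kafka's storage engine. */
final class ReplayableLog {
    private final List<String> records = new ArrayList<>();

    /** Appends a record and returns the offset it was stored at. */
    long append(String value) {
        records.add(value);
        return records.size() - 1;
    }

    /** Returns every record from the given offset onward; reading removes nothing. */
    List<String> readFrom(long offset) {
        // Clamp the offset so out-of-range requests yield an empty or full view.
        int start = (int) Math.max(0, Math.min(offset, records.size()));
        return List.copyOf(records.subList(start, records.size()));
    }
}
```

A consumer that has already read up to offset 2 can call `readFrom(0)` again and receive the same records, which a queue that deletes messages on acknowledgement cannot offer. In Kafka itself, how far back a consumer can rewind is bounded by the topic's retention settings.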

Q: What is the most important production trade-off in Kafka design?

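A: Balancing durability and latency via replication factor, acks, and batching settings. Waiting for more replicas to acknowledge a write makes data loss less likely but delays the acknowledgement, and larger batches raise throughput at the cost of per-message delay.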
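
As a rough illustration of that trade-off, the sketch below builds two producer configurations using standard Kafka producer property names (`acks`, `enable.idempotence`, `linger.ms`, `batch.size`). The class name `ProducerProfiles` and the specific values are assumptions picked for contrast, not tuning recommendations. Replication factor and `min.insync.replicas` are set on the topic or broker and so do not appear here.

```java
import java.util.Properties;

/** Two illustrative producer profiles; the values are examples, not recommendations. */
final class ProducerProfiles {

    /** Favors durability: waits for all in-sync replicas and tolerates a short batching delay. */
    static Properties durabilityFirst() {
        Properties props = new Properties();
        props.setProperty("acks", "all");                // every in-sync replica must acknowledge
        props.setProperty("enable.idempotence", "true"); // retries do not create duplicates
        props.setProperty("linger.ms", "20");            // wait up to 20 ms to fill a batch
        props.setProperty("batch.size", "65536");        // larger batches, fewer requests
        return props;
    }

    /** Favors latency: leader-only acknowledgement and no deliberate batching delay. */
    static Properties latencyFirst() {
        Properties props = new Properties();
        props.setProperty("acks", "1");                   // only the partition leader acknowledges
        props.setProperty("enable.idempotence", "false"); // idempotence requires acks=all
        props.setProperty("linger.ms", "0");              // send as soon as possible
        props.setProperty("batch.size", "16384");
        return props;
    }
}
```

Either `Properties` object could be passed to a `KafkaProducer` constructor after adding the bootstrap servers and serializers; in a Spring Boot application the same keys map onto `spring.kafka.producer.*` settings.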

Q: How do you avoid hot partitions?

A: Use balanced partition keys and validate key cardinality against traffic distribution.
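
One way to validate key distribution before going to production is to replay a sample of real keys through a partitioning function and look at the spread. The sketch below does this with a stand-in hash (`String.hashCode`); Kafka's default partitioner hashes the serialized key bytes with murmur2, so the exact partition numbers will differ, but skew caused by low key cardinality shows up the same way. `KeySkewCheck` and its methods are hypothetical helpers.

```java
import java.util.List;

/** Estimates how evenly a sample of keys would spread across partitions. */
final class KeySkewCheck {

    /** Counts how many records each partition would receive for the given keys. */
    static int[] countByPartition(List<String> keys, int numPartitions) {
        int[] counts = new int[numPartitions];
        for (String key : keys) {
            // Stand-in hash; Kafka's default partitioner uses murmur2 over the key bytes.
            int partition = Math.floorMod(key.hashCode(), numPartitions);
            counts[partition]++;
        }
        return counts;
    }

    /** Ratio of the busiest partition to the mean load; 1.0 means perfectly even. */
    static double skew(int[] counts) {
        int max = 0;
        long total = 0;
        for (int c : counts) {
            max = Math.max(max, c);
            total += c;
        }
        return total == 0 ? 1.0 : max / ((double) total / counts.length);
    }
}
```

If every record in the sample carries the same key, all of them land on one partition and the skew equals the partition count. A ratio well above 1.0 on realistic traffic suggests the key needs more cardinality, for example a composite of tenant and order ID.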

Q: Why does consumer group design matter for scaling?

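A: Within a consumer group, each partition is read by at most one consumer, so parallelism is capped by the partition count. Consumers beyond that count sit idle, while too few partitions or consumers for the incoming rate causes lag.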
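
The cap is easy to demonstrate with a simplified assignment routine. The sketch below hands out partitions round-robin; it is not one of Kafka's actual assignors, and `GroupAssignmentSketch` is a hypothetical name. It assumes consumer names are distinct.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified round-robin assignment of partitions to the consumers in one group. */
final class GroupAssignmentSketch {

    /** Spreads partitions 0..numPartitions-1 over the consumers in order. */
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        if (consumers.isEmpty()) {
            return assignment;
        }
        for (int p = 0; p < numPartitions; p++) {
            // Each partition goes to exactly one consumer in the group.
            String owner = consumers.get(p % consumers.size());
            assignment.get(owner).add(p);
        }
        return assignment;
    }

    /** Number of consumers that received no partitions and therefore do no work. */
    static long idleCount(Map<String, List<Integer>> assignment) {
        return assignment.values().stream().filter(List::isEmpty).count();
    }
}
```

With three partitions and five consumers, two consumers end up with empty assignments. Adding consumers only helps up to the partition count, which is why partition count is usually sized with the expected peak consumer parallelism in mind.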

Q: What reliability controls should be discussed in a senior interview answer?

A: Idempotent producers, retries with backoff, dead-letter handling, and observability of lag and rebalance behavior.
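
To make "retries with backoff" and "dead-letter handling" concrete, here is a minimal plain-Java sketch. `RetryingProcessor` is a hypothetical class, the dead-letter store is just an in-memory list standing in for a dead-letter topic, and the doubling backoff schedule is an assumption. In a Spring Kafka application this role is normally filled by the framework's error handler (for example `DefaultErrorHandler` with a `DeadLetterPublishingRecoverer`) rather than hand-written code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Retries a record handler with exponential backoff, then parks the record. */
final class RetryingProcessor {
    private final List<String> deadLetters = new ArrayList<>();
    private final int maxAttempts;
    private final long baseBackoffMs;

    RetryingProcessor(int maxAttempts, long baseBackoffMs) {
        this.maxAttempts = maxAttempts;
        this.baseBackoffMs = baseBackoffMs;
    }

    /** Runs the handler; returns true on success, false if the record was dead-lettered. */
    boolean process(String record, Consumer<String> handler) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                handler.accept(record);
                return true;
            } catch (RuntimeException e) {
                if (attempt == maxAttempts) {
                    break; // no point sleeping after the final attempt
                }
                sleep(baseBackoffMs << (attempt - 1)); // 1x, 2x, 4x, ... the base delay
            }
        }
        deadLetters.add(record); // retries exhausted: keep the record for later inspection
        return false;
    }

    /** Snapshot of the records that exhausted their retries. */
    List<String> deadLetters() {
        return List.copyOf(deadLetters);
    }

    private static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
    }
}
```

Because a retried handler may run more than once for the same record, the handler itself should be idempotent, which is the consumer-side counterpart of the idempotent producer mentioned in the answer.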

Q: How do you explain eventual consistency with Kafka to product stakeholders?

A: Events are processed asynchronously with bounded delay; systems converge to correctness while gaining resilience and scale.