39 docs tagged with "kafka"

Change Data Capture (CDC)

A comprehensive guide to Change Data Capture (CDC): how it works, how it compares with the alternatives, implementation patterns with Debezium and Spring, and deep dives for senior engineers.

Consumer Groups

A **consumer group** is a set of consumers that collectively consume a topic's partitions. Each partition is assigned to exactly one consumer within the group.
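The "exactly one consumer per partition" rule can be illustrated with a small sketch. This is a simplified round-robin assignment, not Kafka's actual assignor implementation; the function name `assign_partitions` and the consumer IDs are hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Round-robin sketch: every partition gets exactly one owner in the group."""
    if not consumers:
        raise ValueError("a consumer group needs at least one member")
    members = sorted(consumers)
    # Start every member with an empty list so idle consumers stay visible.
    assignment = {member: [] for member in members}
    for i, partition in enumerate(sorted(partitions)):
        owner = members[i % len(members)]
        assignment[owner].append(partition)
    return assignment


# Six partitions shared by four consumers (hypothetical IDs).
group = assign_partitions(range(6), ["c1", "c2", "c3", "c4"])

# More consumers than partitions: the extra consumer owns nothing.
small = assign_partitions(range(2), ["a", "b", "c"])
```

In this sketch a consumer may own several partitions, but no partition ever has two owners, which is why adding consumers beyond the partition count leaves the extras idle.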

Consumer Lag

Consumer lag measures how far a consumer group has fallen behind the latest messages in a topic. It is the most critical health metric for any Kafka-based application.
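As a rough illustration, lag is usually computed per partition as the log end offset minus the group's committed offset. The sketch below assumes the offsets have already been fetched into plain dictionaries; the function name `consumer_lag` and the sample numbers are made up, and treating a missing committed offset as 0 is a simplifying assumption (in practice it depends on the consumer's offset reset policy).

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Return per-partition lag: log end offset minus committed offset."""
    lag = {}
    for partition, end_offset in log_end_offsets.items():
        # Assumption: a partition with no committed offset counts from 0.
        committed = committed_offsets.get(partition, 0)
        # Clamp at zero in case the two snapshots were taken at different times.
        lag[partition] = max(end_offset - committed, 0)
    return lag


# Hypothetical snapshot for a three-partition topic.
end_offsets = {0: 1500, 1: 1200, 2: 900}
committed = {0: 1450, 1: 1200}

lag = consumer_lag(end_offsets, committed)
total_lag = sum(lag.values())
```

A steadily growing total suggests the group consumes more slowly than producers write; a single partition with outsized lag often points at a hot key or a stuck consumer.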

Dead Letter Queue (DLQ) Pattern

A comprehensive guide to the Dead Letter Queue (DLQ) pattern — covering poison pill handling, retry strategies, alternatives comparison, AWS SQS / Kafka / RabbitMQ implementations, and production deep dives for senior engineers.

Deduplication in Distributed Messaging — Kafka, Kafka Streams, RabbitMQ, SQS, Redis

A comprehensive guide to preventing duplicate message processing across Kafka, Kafka Streams, RabbitMQ, SQS, and Redis — covering EOS internals, idempotent consumers, and production deduplication patterns for senior engineers.

Deployment Configuration & Infrastructure Verification

A comprehensive guide to managing and verifying application configurations, environment variables priority, HashiCorp Vault secrets, Kafka topics, ACLs, and schema registry compatibility at deploy time.

Hash Key Partitions

Kafka uses a hash of the message key to determine partition assignment. Understanding this mechanism is essential for ordering guarantees, avoiding hot partitions, and designing correct partition keys.
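A minimal sketch of the idea, assuming a hash-then-modulo scheme. Kafka's Java client hashes the serialized key with murmur2; the example below swaps in `zlib.crc32` purely to stay dependency-free, so the partition numbers it produces will not match a real cluster. The function name `partition_for` and the keys are hypothetical.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition: hash(key) mod partition count."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # Stand-in hash: crc32 instead of Kafka's murmur2.
    return zlib.crc32(key) % num_partitions


# The same key always lands on the same partition for a fixed count.
first = partition_for(b"order-42", 6)
second = partition_for(b"order-42", 6)

# Changing the partition count can move a key, which is what breaks
# per-key ordering when a keyed topic is repartitioned.
mapping_6 = {k: partition_for(k, 6) for k in (b"a", b"b", b"c", b"d")}
mapping_8 = {k: partition_for(k, 8) for k in (b"a", b"b", b"c", b"d")}
```

Because the result depends on both the key and the partition count, a skewed key distribution produces hot partitions, and resizing a topic can silently remap existing keys.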

Idempotent Producer

Without idempotence, the standard retry flow can produce **duplicates**:

Interview Questions — Advanced Topics

**Q1: What are the three layers required for end-to-end exactly-once in Kafka?**

Interview Questions — Core Concepts

**Q1: Explain Kafka's architecture in 2 minutes.**

Interview Questions — Producer & Consumer

**Q1: Walk me through what happens when a producer calls `send()`.**

Kafka Architecture Overview

    Producers ──► [ Broker Cluster ] ──► Consumers
                     │     │     │
                     B1    B2    B3
                           │
                   ZooKeeper / KRaft

Kafka Broker — Complete Guide

A complete guide to Kafka brokers — what they are, how storage works, partition leadership, replication, ISR, KRaft vs ZooKeeper, log compaction, performance internals, and production monitoring. Beginner through senior depth.

Kafka Connect

**Kafka Connect** is a framework for **reliably moving data between Kafka and external systems** (databases, file systems, cloud services) without writing...

Kafka Consumer

A **consumer** reads messages from Kafka topics. Unlike traditional queues (push-based), Kafka consumers **pull** messages at their own pace. This gives...

Kafka Exactly-Once Semantics (EOS)

A complete guide to Kafka exactly-once semantics — delivery guarantees, idempotent producer, transactions, read_committed consumers, Kafka Streams EOS, zombie producer fencing, two-phase commit internals, and production patterns. Beginner through senior depth.

Kafka Knowledge Base

Apache Kafka is a **distributed event streaming platform** designed for high-throughput, fault-tolerant, and scalable real-time data pipelines and streaming.

Kafka Producer

A **producer** is a client application that publishes (writes) messages to Kafka topics. It is responsible for:

Kafka Streams — Complete Deep Dive

A comprehensive guide to Kafka Streams: from core concepts and internal architecture to stateful processing, failure recovery, and production system design patterns. Built for new learners and senior engineers alike.

Kafka Throughput Optimization

A deep-dive into techniques for improving Kafka throughput — covering compression, batching, partitions, consumer parallelism, tuning configs, and their trade-offs.

Kafka Topics

A **topic** is a named, durable stream of messages in Kafka. Think of it as a logical category or feed where producers write and consumers read.

KRaft vs ZooKeeper: Kafka Metadata Architecture

A comprehensive guide comparing Apache Kafka's legacy ZooKeeper architecture with the modern KRaft (Kafka Raft) metadata mode — covering internal mechanics, failure scenarios, migration strategies, and production deep dives for senior engineers.

Message Ordering with Partition Keys

Kafka guarantees **total ordering within a partition**. Messages written to the same partition are always consumed in the order they were appended to that partition's log.

Message Queues & Streaming

Guide to asynchronous messaging systems including Kafka, RabbitMQ, SQS, event sourcing, pub/sub patterns, consumer groups, ordering guarantees, and exactly-once semantics.

Monitoring & Operations

Consumer lag is the most important consumer metric:

Parallel Consumer Deep Dive

Deep dive into the Confluent Parallel Consumer model for decoupling thread concurrency from partition counts safely.

Partitions

A **partition** is an ordered, immutable sequence of records (a log) within a topic. Each partition lives on exactly one broker at a time (as leader) and...

Preventing Kafka Connect Rebalance Storms

A comprehensive guide to tuning Kafka Connect to prevent stop-the-world rebalance storms during routine patching and rolling restarts.

Processing and Ordering

Kafka guarantees ordering within a partition, but single-threaded processing limits throughput. This guide covers four patterns for achieving high throughput while preserving per-key ordering.

Producer Acknowledgements (acks)

The `acks` configuration controls **how many broker acknowledgements the producer requires before considering a send successful**. It directly trades off...

Producer Transactions

Idempotence protects against duplicates within a session, but it doesn't help when:

Raft Consensus Algorithm

A comprehensive guide to the Raft Consensus Algorithm — covering leader election, log replication, safety guarantees, and how it is implemented in Apache Kafka's KRaft metadata mode.

Real-Time Updates

Patterns for delivering real-time data to clients including WebSockets, Server-Sent Events, long polling, short polling, and push notification architectures.

Replication, ISR & Fault Tolerance

The **replication factor** defines how many copies of each partition exist across the cluster.

Scaling Partitions

Partitions are the unit of parallelism in Kafka. Scaling them is critical for throughput but can break ordering for keyed topics. This guide covers the mechanics, risks, and migration strategies.

Scaling Writes

Deep-dive into high write throughput techniques — sharding, partitioning, WAL internals, LSM trees, async pipelines, batching, backpressure, idempotency, and distributed transactions — with production Java/Spring code and failure mode analysis.

Schema Registry

**Schema Registry** is a centralized repository for managing and validating schemas for Kafka messages. It ensures that producers and consumers agree on the...

Transactional Outbox Pattern

A complete guide to the Transactional Outbox Pattern — from the Dual-Write problem for beginners to CDC vs polling internals, at-least-once guarantees, ordering semantics, and production monitoring for senior engineers.

Walmart Java Developer Interview Experience & Questions [30 LPA+]

A detailed collection of real interview questions and answers from a Walmart Java Developer interview. Ideal for candidates with 3+ years of experience, covering DSA, Core Java, System Design, Spring Boot, and Kafka.