Apache Kafka is a distributed event streaming platform designed for high-throughput, low-latency, and durable log storage. Kafka acts as a commit log where producers append records and consumers read them in order.

Kafka is fundamentally a storage system optimized for sequential writes. Instead of deleting data after consumption, it retains logs for a configured time or size. This enables replay and decouples producers from consumers.

Scenario: In an e-commerce system, Kafka stores all order events. Downstream systems (billing, analytics, fraud detection) consume these events independently at their own pace.

Topics, Partitions & Offsets

  • Topics: Logical categories of messages.
  • Partitions: Logs within topics, ordered and immutable.
  • Offsets: Sequential IDs for each record within a partition.

Partitions are the unit of parallelism: within a consumer group, each partition is read by at most one consumer, so more partitions allow more concurrent consumers. Offsets are tracked per consumer group, enabling fault-tolerant parallel consumption.

Scenario: A payments topic with 12 partitions allows up to 12 consumers in a group to process events concurrently, scaling throughput roughly linearly; a 13th consumer would sit idle.
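
As a minimal sketch of the admin side (broker address and topic name are illustrative), creating such a topic with the Java AdminClient could look like this:

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Properties;

public class CreatePaymentsTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder address for your cluster.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // 12 partitions -> up to 12 consumers in one group can read in parallel.
            // Replication factor 3 is a common durability default.
            NewTopic payments = new NewTopic("payments", 12, (short) 3);
            admin.createTopics(List.of(payments)).all().get();
        }
    }
}
```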

Storage Internals

Kafka stores messages as log segments on disk:

  • Append-Only: Producers append records sequentially.
  • Log Segments: Files split by size/time (log.segment.bytes defaults to 1 GiB). Each partition's data is stored as .log files with accompanying .index and .timeindex files.
  • Index Files: Map offsets to physical byte positions within a segment.
  • Compaction: Retains the latest record per key, discarding older ones.
  • Tiered Storage (3.6+, KIP-405): Moves cold segments to object storage such as S3/GCS while hot data stays on local disks.

Sequential writes plus heavy use of the OS page cache let Kafka sustain throughput that many databases cannot match for append-heavy workloads. Compaction is critical for topics like user profiles, where only the latest value per key matters.

Scenario: A user settings topic uses log compaction, ensuring only the latest preference survives, keeping storage efficient.
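
A hedged sketch of declaring that compacted topic (names and partition counts are illustrative; cleanup.policy=compact is the setting that enables compaction):

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Map;
import java.util.Properties;

public class CreateCompactedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            NewTopic settings = new NewTopic("user-settings", 6, (short) 3)
                    // Keep only the latest record per key instead of deleting by age.
                    .configs(Map.of("cleanup.policy", "compact"));
            admin.createTopics(List.of(settings)).all().get();
        }
    }
}
```

Compaction is per key, so every record on such a topic must carry a key (here, a user ID): the key identifies which older records a new write supersedes.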

Producers

Producers send records to brokers:

  • Partitioner: Chooses the target partition (round-robin, key hashing, or custom).
  • Batching: Groups messages into larger requests for efficiency.
  • ACKs: Configurable durability (acks=0 fire-and-forget, acks=1 leader only, acks=all all in-sync replicas).

Choosing acks=all with replication factor > 1 (plus min.insync.replicas ≥ 2 on the topic) ensures durability. acks=0 maximizes throughput but risks silent data loss.

Scenario: A metrics pipeline uses acks=0 for ultra-low latency. A financial trading system uses acks=all with idempotence enabled for strict durability.
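
A sketch of the strict-durability configuration from the trading example (topic name, key, and broker address are placeholders):

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class DurableProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Wait for all in-sync replicas; pair with min.insync.replicas >= 2 on the topic.
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        // Broker-side deduplication of retried sends.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Same key -> same partition -> ordering preserved per account.
            producer.send(new ProducerRecord<>("trades", "account-42", "BUY 100 AAPL"));
            producer.flush();
        }
    }
}
```

A metrics pipeline would instead set acks=0 and drop the idempotence flag, trading durability for latency.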

Pitfalls

  • Hot partitions (from skewed keys) → broker imbalance.
  • Compression saves network but may saturate CPU.

Consumers

Consumers read sequentially:

  • Consumer Groups: Coordinate parallel processing.
  • Rebalancing: Redistributes partitions when group members change.
  • Offset Management: Committed to Kafka or external stores.
  • Consumer lag ≠ data loss; often a sign of downstream bottlenecks.
  • Large-scale deployments face thundering herd issues during rebalances.

Rebalancing pauses consumption temporarily. Kafka 2.4+ introduced incremental cooperative rebalancing (KIP-429) to reduce that disruption.

Scenario: In a fraud detection system, if a consumer crashes, its partitions are reassigned to healthy instances, ensuring no events are skipped.
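
Below is a minimal consumer-group sketch under assumed names (group fraud-detection, topic payments). It commits offsets only after processing, so a crashed instance's partitions resume from the last committed offset, and it opts into the cooperative assignor mentioned above:

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.CooperativeStickyAssignor;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class FraudConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "fraud-detection");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Commit explicitly, after processing, so events are not skipped on a crash.
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        // Incremental (cooperative) rebalancing: only reassigned partitions pause.
        props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
                CooperativeStickyAssignor.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("payments"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
                consumer.commitSync(); // at-least-once: commit only after processing
            }
        }
    }
}
```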

Replication & Fault Tolerance

Kafka ensures durability with replication:

  • Replication Factor: Number of copies per partition.
  • Leader & Followers: One broker is leader, others replicate.
  • ISR (In-Sync Replicas): Followers fully caught up with leader.
  • Unclean leader election favors availability but risks durability.
  • Re-replication floods network unless throttled.
  • Cross-region replication (MirrorMaker 2.0) is asynchronous: you accept eventual consistency rather than the high latency of synchronous cross-region writes.

Producers can require that writes be acknowledged only once committed to all in-sync replicas (acks=all). Leader elections occur when a leader fails; unclean leader election risks losing records the new leader never replicated.

Scenario: A 3-replica topic ensures durability. If one broker fails, another replica takes over without losing data.

Controller & Metadata Management

  • Controller Broker: Manages partition leaders and replicas.
  • Zookeeper (Legacy): Stored metadata and coordinated leader elections.
  • KRaft Mode (Kafka 2.8+): Replaces Zookeeper with an internal Raft quorum, simplifying operations; production-ready since 3.3, with Zookeeper support removed in Kafka 4.0.

Metadata is cached in brokers and updated via controller notifications. Modern deployments use KRaft for simplified operations.

Scenario: A cluster upgrade moves from Zookeeper to KRaft, reducing operational complexity by eliminating external coordination service.

Networking & Protocols

Kafka uses a custom TCP protocol:

  • Request/Response: Producers/consumers communicate with brokers.
  • Zero-Copy Transfers: Kafka uses sendfile() to stream data directly from disk to network.
  • Compression: Batch compression (gzip, snappy, lz4, zstd).
  • TLS/SASL: Encrypt traffic and authenticate clients; the extra CPU cost can be offset with hardware acceleration.

Zero-copy is a key reason Kafka achieves high throughput: data moves from the page cache to the socket without passing through user space. (Enabling TLS bypasses this path, since data must be encrypted before it reaches the network.)

Scenario: A video analytics pipeline compresses logs with lz4, balancing CPU usage and throughput.
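
As a sketch, the lz4 choice from the scenario is a single producer setting; the batching values here are illustrative, and matter because whole batches are what gets compressed:

```java
import org.apache.kafka.clients.producer.ProducerConfig;

import java.util.Properties;

public class CompressionConfig {
    // Fragment: merge with the serializer/acks settings from the producer example above.
    public static Properties producerProps() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Batches are compressed as a unit, so larger batches compress better.
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
        props.put(ProducerConfig.LINGER_MS_CONFIG, "20");     // wait up to 20 ms to fill a batch
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, "65536"); // 64 KB batches (illustrative)
        return props;
    }
}
```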

Transactions & Exactly-Once Semantics

Kafka provides exactly-once semantics (EOS):

  • Idempotent Producers: Prevent duplicate writes.
  • Transactions: Commit multiple writes atomically.
  • Consumer Isolation: Read only committed messages.

EOS requires coordination between producer IDs, the transaction coordinator, and consumer offsets. Within a Kafka read-process-write pipeline, it guarantees each record's effects are committed exactly once, even across retries.
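
A hedged sketch of such a read-process-write loop (topic names and the uppercase transform are stand-ins; the producer is assumed to be constructed with transactional.id set, and the consumer with isolation.level=read_committed and auto-commit disabled):

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;

import java.time.Duration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ExactlyOncePipeline {
    public static void run(KafkaConsumer<String, String> consumer,
                           KafkaProducer<String, String> producer) {
        producer.initTransactions();
        consumer.subscribe(List.of("orders"));

        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            if (records.isEmpty()) continue;

            producer.beginTransaction();
            Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
            for (ConsumerRecord<String, String> record : records) {
                producer.send(new ProducerRecord<>("orders-enriched",
                        record.key(), record.value().toUpperCase()));
                offsets.put(new TopicPartition(record.topic(), record.partition()),
                        new OffsetAndMetadata(record.offset() + 1));
            }
            // Input offsets commit atomically with the output records:
            // either both become visible, or neither does.
            producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata());
            producer.commitTransaction();
            // (Production code would catch exceptions and call abortTransaction().)
        }
    }
}
```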

Operational Challenges & Scaling Patterns

Capacity Planning

  • Partition sizing must balance throughput vs metadata overhead.
  • Retention policies determine storage costs (petabytes for week-long retention at high event rates).

Fault-Tolerance Edge Cases

  • Zombie Producers: Must be fenced (brokers reject writes from stale producer epochs).
  • Lagging Consumers: Cause storage blowups if retention mismatches demand.

Optimization Strategies

  • Tiered Storage: Reduces cost.
  • Cluster Sharding: Domain-level isolation.
  • Hybrid Topics: Combine compaction with time-based deletion (cleanup.policy=compact,delete); see the sketch below.
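
As a sketch of the hybrid case (topic name and retention value are illustrative), both policies can be combined at creation time:

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Map;
import java.util.Properties;

public class CreateHybridTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            NewTopic sessions = new NewTopic("sessions", 12, (short) 3)
                    .configs(Map.of(
                            "cleanup.policy", "compact,delete", // latest per key AND age-based deletion
                            "retention.ms", "604800000"));      // 7 days (illustrative)
            admin.createTopics(List.of(sessions)).all().get();
        }
    }
}
```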

Scenario: Uber shards clusters by business vertical (Mobility, Eats, Freight) and deploys autoscaling consumers tied to lag metrics.

Closing Thoughts

To a developer, Kafka looks like a queue. Operating it at scale means:

  • Designing partition & replication strategies at scale.
  • Managing trade-offs between latency, durability, and cost.
  • Running hundreds of brokers across regions reliably.
  • Debugging ISR shrinkage, controller failovers, and consumer lag during real outages.

Kafka is not just a pub/sub system. It is the nervous system of modern data architectures.
