And why trying to avoid duplicates is how systems really fail

πŸ˜– The Complaint Every Team Makes

"We're seeing duplicate messages."

This sentence usually comes with panic.

Engineers rush to:

  • Add deduplication flags
  • Add locks
  • Add distributed transactions
  • Add complexity

All in an attempt to fight something that is not a flaw.

🧠 The Missing Mental Model

Message delivery guarantees are not promises. They are trade-offs.

The Three Truths of Distributed Systems

You can choose:

  • At-most-once β†’ lose messages
  • At-least-once β†’ process duplicates
  • Exactly-once β†’ doesn't exist in reality

At-least-once exists because losing data is worse than duplicating it.

πŸ’₯ Why At-Least-Once Is the Default (On Purpose)

Systems like:

  • Kafka
  • RabbitMQ
  • SQS
  • Pulsar

All default to at-least-once.

Why?

Because when failure happens:

  • Networks drop
  • Consumers crash
  • ACKs get lost
  • Retries happen

The system cannot know if your code ran.

So it delivers again.

That's safety, not sloppiness.

🧨 The Dangerous Lie: "Duplicates Are Bad"

Duplicates feel wrong because:

  • They're visible
  • They break naΓ―ve logic
  • They feel inefficient

But lost messages:

  • Are invisible
  • Corrupt data silently
  • Can't be recovered

Teams fear duplicates because they're seen. They should fear loss β€” because it isn't.

⚠️ Spring Boot Example: The Naïve Consumer

@KafkaListener(topics = "orders")
public void handle(OrderEvent event) {
    // side effect runs before the offset is committed
    orderService.process(event);
}

Works perfectly.

Until:

  • Consumer crashes after processing
  • Offset isn't committed
  • Message is re-delivered

Now the order processes twice.

This is not Kafka misbehaving.

This is Kafka protecting you.

🧠 Why Exactly-Once Is a Myth

Even with:

  • Transactions
  • 2PC
  • Idempotent producers
  • EOS configs
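For reference, those knobs on Kafka usually boil down to settings like these (property names are Kafka's own; the values are illustrative):

```
# producer: idempotent, transactional writes
enable.idempotence=true
transactional.id=order-processor-1
# consumer: only read data from committed transactions
isolation.level=read_committed
```

And even with all of this enabled, the guarantee covers Kafka-to-Kafka read-process-write flows. A crash between Kafka and an external database reopens the duplicate window.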

Failures still exist:

  • JVM crashes
  • Network partitions
  • Partial commits

"Exactly once" usually means:

"Exactly once under ideal conditions."

Production is not ideal.

πŸ’£ Where Teams Go Wrong

They try to:

  • Suppress duplicates
  • Acknowledge early
  • Skip retries
  • Add global locks

All of these: πŸ‘‰ increase the chance of data loss

Which is far worse.

βœ… The Correct Way to Design for At-Least-Once

πŸ”Ή 1. Make Consumers Idempotent

if (processed(eventId)) return;
process(event);

Idempotency is the real guarantee.
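The guard above can be sketched in plain Java. Everything here is hypothetical naming, and the in-memory set stands in for a durable store (a database table with a unique constraint, or Redis SETNX) that a real system would need:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of an idempotent consumer.
class IdempotentHandler {
    // stand-in for a durable processed-IDs store
    private final Set<String> processed = ConcurrentHashMap.newKeySet();
    final AtomicInteger sideEffects = new AtomicInteger();

    void handle(String eventId) {
        // add() returns false if the id was already recorded,
        // so a redelivered event is acknowledged but not reprocessed
        if (!processed.add(eventId)) return;
        sideEffects.incrementAndGet(); // the real side effect goes here
    }
}
```

Deliver the same event twice and the side effect still runs once. That is the whole trick: the broker may repeat itself, but your system doesn't.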

πŸ”Ή 2. Use Natural Idempotency

  • Upserts
  • State transitions
  • Version checks

Let the domain absorb duplicates.
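A state transition is often idempotent for free. A hypothetical sketch: shipping an already-shipped order is a no-op, so the duplicate event dies in the guard clause rather than in infrastructure code:

```java
// Sketch: let the domain absorb duplicates via a state check.
enum Status { NEW, PAID, SHIPPED }

class Order {
    Status status = Status.NEW;
    int shipments = 0;

    void markShipped() {
        if (status == Status.SHIPPED) return; // duplicate event: already there
        status = Status.SHIPPED;
        shipments++; // side effect runs at most once
    }
}
```

No dedup table, no lock. The aggregate's own invariant does the work.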

πŸ”Ή 3. Commit After Side Effects

Never before.

If you crash: πŸ‘‰ replay is correct.
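The ordering can be shown with a toy consumer loop (a simplified stand-in for a broker client, not any real API): the offset advances only after the handler returns, so a crash mid-batch means the failed message replays instead of vanishing:

```java
import java.util.List;
import java.util.function.Consumer;

// Sketch: commit the offset only after the side effect succeeds.
// If the handler throws, the offset stays put and the message replays.
class AtLeastOnceLoop {
    int committed = 0;

    void poll(List<String> log, Consumer<String> handler) {
        for (int i = committed; i < log.size(); i++) {
            handler.accept(log.get(i)); // side effect first
            committed = i + 1;          // commit only after success
        }
    }
}
```

Note the flip side: a crash after the side effect but before the commit replays a message that was already handled. That window is exactly why at-least-once implies duplicates, and why step 1 exists.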

πŸ”Ή 4. Observe Duplicate Rates

Duplicates are signals β€” not bugs.

High duplicates = instability somewhere.
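Measuring this costs almost nothing: the idempotency guard already knows when it trips. A sketch with a plain counter (in production this would feed a Micrometer or Prometheus metric instead):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Sketch: count how often the idempotency guard fires.
// A rising duplicate rate points at rebalances, timeouts, or crashes upstream.
class DuplicateMeter {
    private final Set<String> seen = ConcurrentHashMap.newKeySet();
    final AtomicLong duplicates = new AtomicLong();

    void record(String eventId) {
        if (!seen.add(eventId)) duplicates.incrementAndGet();
    }
}
```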

πŸ”Ή 5. Embrace Replayability

Reprocessing is power.

At-least-once gives you rewind.

🧠 The Spring Boot Truth Nobody Teaches Early

Delivery guarantees don't exist to make developers comfortable.

They exist to make systems survivable.

At-least-once isn't imperfect.

It's honest.

πŸš€ Final Takeaway (Highly Shareable)

A system that fears duplicates will lose data. A system that accepts duplicates will survive failure.