And why trying to avoid duplicates is how systems really fail
The Complaint Every Team Makes
"We're seeing duplicate messages."
This sentence usually comes with panic.
Engineers rush to:
- Add deduplication flags
- Add locks
- Add distributed transactions
- Add complexity
All in an attempt to fight something that is not a flaw.
The Missing Mental Model
Message delivery guarantees are not promises. They are trade-offs.
The Three Truths of Distributed Systems
You can choose:
- At-most-once → lose messages
- At-least-once → process duplicates
- Exactly-once → doesn't exist in reality
At-least-once exists because losing data is worse than duplicating it.
Why At-Least-Once Is the Default (On Purpose)
Systems like:
- Kafka
- RabbitMQ
- SQS
- Pulsar
All default to at-least-once.
Why?
Because when failure happens:
- Networks drop
- Consumers crash
- ACKs get lost
- Retries happen
The system cannot know if your code ran.
So it delivers again.
That's safety, not sloppiness.
The Dangerous Lie: "Duplicates Are Bad"
Duplicates feel wrong because:
- They're visible
- They break naïve logic
- They feel inefficient
But lost messages:
- Are invisible
- Corrupt data silently
- Can't be recovered
Teams fear duplicates because they're seen. They should fear loss, because it isn't.
Spring Boot Example: The Naïve Consumer
@KafkaListener(topics = "orders")
public void handle(OrderEvent event) {
    orderService.process(event);
}

Works perfectly.
Until:
- Consumer crashes after processing
- Offset isn't committed
- Message is re-delivered
Now the order processes twice.
This is not Kafka misbehaving.
This is Kafka protecting you.
Why Exactly-Once Is a Myth
Even with:
- Transactions
- 2PC
- Idempotent producers
- EOS configs
Failures still exist:
- JVM crashes
- Network partitions
- Partial commits
"Exactly once" usually means:
"Exactly once under ideal conditions."
Production is not ideal.
Where Teams Go Wrong
They try to:
- Suppress duplicates
- Acknowledge early
- Skip retries
- Add global locks
All of these increase the chance of data loss.
Which is far worse.
The Correct Way to Design for At-Least-Once
1. Make Consumers Idempotent
if (processed(eventId)) return;
process(event);

Idempotency is the real guarantee.
2. Use Natural Idempotency
- Upserts
- State transitions
- Version checks
Let the domain absorb duplicates.
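As one illustration (all names hypothetical), a state-transition rule absorbs duplicates without any separate dedup store: paying an order is only valid from NEW, so a redelivered payment event is a no-op.

```java
// Sketch of domain-level idempotency via a state transition.
// Class and method names are illustrative, not from any specific codebase.
public class Order {
    public enum Status { NEW, PAID, SHIPPED }

    private Status status = Status.NEW;

    // Returns true only on the first successful transition.
    public boolean markPaid() {
        if (status != Status.NEW) {
            return false; // duplicate or out-of-order event absorbed by the domain
        }
        status = Status.PAID;
        return true;
    }

    public Status status() { return status; }
}
```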
3. Commit After Side Effects
Never before.
If you crash, replay is correct.
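The ordering can be modeled in a few lines. This toy sketch (all names illustrative, with the offset commit reduced to an integer) advances the committed offset only after the side effect succeeds, so a crash between the two causes a replay, never a loss:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of "commit after side effects". The committed offset marks the
// next message to read and only advances after the side effect succeeds.
public class CommitAfterSideEffects {
    private int committedOffset = 0;
    private final List<String> sideEffects = new ArrayList<>();

    public void poll(List<String> log, boolean crashBeforeCommit) {
        while (committedOffset < log.size()) {
            sideEffects.add(log.get(committedOffset)); // side effect first
            if (crashBeforeCommit) return;             // simulated crash: offset NOT committed
            committedOffset++;                         // commit only after success
        }
    }

    public List<String> sideEffects() { return sideEffects; }
}
```

A restart after the simulated crash re-reads the uncommitted message: the side effect runs twice, which is exactly the at-least-once duplicate an idempotent consumer absorbs.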
4. Observe Duplicate Rates
Duplicates are signals, not bugs.
High duplicates = instability somewhere.
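One way to surface that signal, sketched with a plain counter to stay self-contained (in a real Spring Boot service you would more likely export a Micrometer meter and alert on spikes); all names here are illustrative:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of treating duplicates as a signal: count them instead of hiding them.
public class DuplicateTracker {
    private final Set<String> seen = ConcurrentHashMap.newKeySet();
    private final AtomicLong duplicates = new AtomicLong();

    // Returns true if this id is a duplicate delivery.
    public boolean record(String eventId) {
        if (seen.add(eventId)) return false;
        duplicates.incrementAndGet(); // export this value as a metric; a spike means instability
        return true;
    }

    public long duplicateCount() { return duplicates.get(); }
}
```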
5. Embrace Replayability
Reprocessing is power.
At-least-once gives you rewind.
The Spring Boot Truth Nobody Teaches Early
Delivery guarantees don't exist to make developers comfortable.
They exist to make systems survivable.
At-least-once isn't imperfect.
It's honest.
Final Takeaway (Highly Shareable)
A system that fears duplicates will lose data. A system that accepts duplicates will survive failure.