Kafka + Delivery Semantics
Exactly-once vs at-least-once delivery in Kafka
At-least-once means messages are not lost but may be processed more than once. Exactly-once means the final processing effect happens once, which usually requires transactions, idempotency, or careful system design.
The Short Answer
At-least-once means the message should not be lost, but it may be processed more than once.
Exactly-once means the final processing effect should happen once, even if retries, crashes, or rebalances happen.
The Real Problem
Imagine a consumer reads a Kafka message, charges a customer, and then crashes before committing the offset.
When the consumer restarts, it resumes from the last committed offset, so Kafka delivers the same message again. That protects you from losing the message, but your application may now repeat the side effect, which here means charging the customer twice.
At-least-once
Good: message is not lost. Risk: duplicate processing.
Exactly-once goal
The goal is not just delivery. The goal is one correct final state.
Why At-least-once Is Common
At-least-once is popular because it is practical and resilient. If something fails, the system can retry.
But retrying means your application must tolerate duplicates. That is why idempotency is so important in distributed systems.
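// Example idea: ignore duplicate event IDs.
// Assumption: processedEventIds is a durable set of already-handled IDs.
if (processedEventIds.contains(event.id())) {
    return; // already handled, so skip the duplicate
}
process(event);
// Record the ID only after processing succeeds. A crash between these two
// steps can still replay the event unless both happen atomically.
processedEventIds.add(event.id());
Where Duplicates Come From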
- Producer sends a record, times out, and retries.
- Consumer processes a record but crashes before committing.
- A rebalance causes another consumer to resume from the last committed offset.
- An external database write succeeds, but Kafka offset commit fails.
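The first case, a producer retry after a timeout, is the one Kafka's idempotent producer (`enable.idempotence=true`) is designed to handle: the broker tracks sequence numbers per producer and discards a retried write it has already appended. The sketch below is a toy in-memory model of that idea, not Kafka's implementation. `ToyBroker`, `append`, and the single shared log are hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of broker-side deduplication; all names here are hypothetical.
class ToyBroker {
    private final List<String> log = new ArrayList<>();
    // Highest sequence number appended so far, per producer ID.
    private final Map<Long, Integer> lastSequence = new HashMap<>();

    /** Appends the record unless this producer already appended this sequence number. */
    boolean append(long producerId, int sequence, String value) {
        int last = lastSequence.getOrDefault(producerId, -1);
        if (sequence <= last) {
            return false; // retry of a record that already made it into the log
        }
        // Simplification: a real broker would also reject gaps in the sequence.
        log.add(value);
        lastSequence.put(producerId, sequence);
        return true;
    }

    /** Returns a copy of the log so callers cannot modify broker state. */
    List<String> log() {
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        ToyBroker broker = new ToyBroker();
        broker.append(7L, 0, "order-created");
        // The acknowledgement is lost, so the producer times out and resends.
        boolean appendedAgain = broker.append(7L, 0, "order-created");
        System.out.println(appendedAgain + " " + broker.log()); // false [order-created]
    }
}
```

This only removes duplicates created by producer retries. The consumer-side cases in the list still need idempotent processing.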
The Critical Interview Insight
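Kafka can help with exactly-once processing when the pipeline stays inside Kafka: consume from one topic, transform, produce to another topic, and commit the consumed offsets in the same transaction as the produced records. Downstream consumers must read with isolation.level=read_committed so that they never see records from aborted transactions.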
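In the Java client, this pattern uses the producer's transactional methods (`initTransactions`, `beginTransaction`, `send`, `sendOffsetsToTransaction`, `commitTransaction`, `abortTransaction`). The sketch below does not call that API. It is a toy in-memory model of the property that matters: the output record and the input offset advance together or not at all. `ToyPipeline` and `processOne` are hypothetical names, and the uppercase transform is just a stand-in for real processing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of an atomic "produce output + commit input offset" step.
// Hypothetical names; this is not the Kafka client API.
class ToyPipeline {
    final List<String> outputTopic = new ArrayList<>(); // committed output records
    int committedOffset = 0;                            // next input offset to read

    /**
     * Processes one input record in a "transaction". If crashBeforeCommit is true,
     * the staged output and offset are discarded, as an aborted transaction would be.
     */
    void processOne(List<String> inputTopic, boolean crashBeforeCommit) {
        int offset = committedOffset;
        // Staged work stays invisible until the commit below.
        String stagedOutput = inputTopic.get(offset).toUpperCase();
        int stagedOffset = offset + 1;
        if (crashBeforeCommit) {
            return; // abort: neither the output nor the offset advances
        }
        // Commit: both changes are applied together.
        outputTopic.add(stagedOutput);
        committedOffset = stagedOffset;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b");
        ToyPipeline pipeline = new ToyPipeline();
        pipeline.processOne(input, true);   // crash: record "a" will be retried
        pipeline.processOne(input, false);  // retry of "a" succeeds
        pipeline.processOne(input, false);  // "b"
        // prints: [A, B] offset=2
        System.out.println(pipeline.outputTopic + " offset=" + pipeline.committedOffset);
    }
}
```

The retry after the simulated crash produces no duplicate because nothing from the failed attempt was ever made visible. That guarantee holds here only because every piece of state lives inside the pipeline itself.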
But if your consumer writes to an external database, sends an email, charges a credit card, or calls another service, Kafka cannot automatically make that external side effect exactly-once.
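For the database case, one common option is to store the consumer's position in the same database transaction as the business write, so the two cannot diverge. On replay, any record at or below the stored offset is skipped. The sketch below models that with an in-memory "database". `ToyAccountStore` and `applyDeposit` are hypothetical names, and `synchronized` stands in for a real database transaction. Emails and card charges cannot be rolled into such a transaction, so they would need a different approach, such as an idempotency key that the provider honors.

```java
import java.util.HashMap;
import java.util.Map;

// Toy "database" that keeps the balance and the last applied offset together.
// Hypothetical names; synchronized stands in for a real database transaction.
class ToyAccountStore {
    private long balance = 0;
    // Last applied offset per partition; absent means nothing applied yet.
    private final Map<Integer, Long> lastAppliedOffset = new HashMap<>();

    /** Applies the deposit once per (partition, offset); returns false for a replay. */
    synchronized boolean applyDeposit(int partition, long offset, long amount) {
        long last = lastAppliedOffset.getOrDefault(partition, -1L);
        if (offset <= last) {
            return false; // already applied before the crash or rebalance
        }
        // Both updates happen inside the same "transaction".
        balance += amount;
        lastAppliedOffset.put(partition, offset);
        return true;
    }

    synchronized long balance() {
        return balance;
    }

    public static void main(String[] args) {
        ToyAccountStore store = new ToyAccountStore();
        store.applyDeposit(0, 41L, 100);
        // The Kafka offset commit fails, so offset 41 is delivered again.
        store.applyDeposit(0, 41L, 100);
        System.out.println(store.balance()); // 100
    }
}
```

The offset comparison relies on records within a partition arriving in offset order, which is how Kafka delivers them to a consumer.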
Mental Model: Offset Commit vs Processing
Commit after processing
Safer against loss, but duplicates can happen if processing succeeds and the commit fails. This is at-least-once behavior.
Commit before processing
Avoids duplicate processing, but risks message loss if the app crashes after the commit and before processing. This is at-most-once behavior.
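The two orderings can be compared with a small simulation. The sketch below involves no Kafka client. It models a single consumer that crashes once, in the gap between its two steps, and then restarts from the committed offset. `CommitOrderDemo`, `run`, and the record names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy simulation of one consumer that crashes once, then restarts.
// Hypothetical names; no Kafka client is involved.
class CommitOrderDemo {
    /**
     * Runs records through a consumer that crashes once, while handling the record
     * at crashAtOffset, in the gap between its two steps (commit and process).
     * Returns the records that were actually processed, in order.
     */
    static List<String> run(List<String> records, boolean commitFirst, int crashAtOffset) {
        List<String> processed = new ArrayList<>();
        int committed = 0;          // next offset to read after a restart
        boolean crashed = false;
        while (committed < records.size()) {
            int offset = committed; // a (re)started consumer resumes here
            boolean crashNow = !crashed && offset == crashAtOffset;
            if (commitFirst) {
                committed = offset + 1;
                if (crashNow) { crashed = true; continue; } // crash before processing
                processed.add(records.get(offset));
            } else {
                processed.add(records.get(offset));
                if (crashNow) { crashed = true; continue; } // crash before committing
                committed = offset + 1;
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("r0", "r1", "r2");
        System.out.println(run(records, false, 1)); // [r0, r1, r1, r2]
        System.out.println(run(records, true, 1));  // [r0, r2]
    }
}
```

Committing after processing handles `r1` twice. Committing before processing never handles `r1` at all. Neither ordering gives exactly-once on its own.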
How to Explain This in an Interview
Common Follow-ups
Is exactly-once always better?
No. It adds complexity and may add overhead. Many systems use at-least-once with idempotent consumers because it is simpler and robust.
Can Kafka prevent all duplicate effects?
Not by itself when external systems are involved. Kafka can coordinate Kafka records and offsets, but your database, payment system, or email provider needs its own idempotency strategy.
What is the safest default answer?
Assume duplicates can happen. Design consumers to be idempotent unless you have a very specific exactly-once transactional design.