Event-Driven Java with Kafka: Spring Kafka, Patterns & Exactly-Once

Synchronous REST calls couple services in time: the caller waits, and if the callee is down the call fails. Event-driven architecture flips that: services publish facts (“order placed”) to a durable log, and other services react on their own schedule. Apache Kafka is the de facto backbone for this in the enterprise, and Spring Kafka is the usual way Spring-based Java teams work with it. This deep dive covers the patterns that matter in production: partitions and consumer groups, delivery semantics, error handling, the outbox pattern, and exactly-once processing.

TL;DR: Model events as immutable facts on partitioned topics; partitions are your unit of parallelism and ordering. Choose delivery semantics deliberately (at-least-once is the pragmatic default with idempotent consumers). Handle poison messages with retry + dead-letter topics. For reliable “update the DB and publish an event,” use the transactional outbox pattern rather than dual writes.

Why event-driven, and what Kafka actually is

Kafka is a distributed, append-only commit log. Producers append records to topics; consumers read them at their own pace; records are retained for a configured time (or compacted) regardless of whether anyone has read them. That durability and replayability is the key difference from a traditional message queue — a new consumer can join later and reprocess history from the beginning.

The architectural payoff is loose coupling: the order service publishes OrderPlaced and doesn’t know or care that inventory, billing, and analytics each consume it. You add a new consumer without touching the producer.

Partitions, keys, and consumer groups

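Each topic is split into partitions, and this single concept drives both ordering and scaling. Records with the same key go to the same partition, and Kafka preserves order within a partition (not across the whole topic). Within a consumer group, each partition is read by at most one consumer, so the partition count caps how many consumers can work in parallel.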
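To make the keying rule concrete, here is a minimal sketch of how a key can be mapped to a partition. It is a simplification: Kafka's default partitioner hashes the serialized key with murmur2, while this sketch uses Arrays.hashCode to stay dependency-free. The names KeyPartitioner and partitionFor are illustrative and not part of any Kafka API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Toy partitioner: the same key always maps to the same partition. */
final class KeyPartitioner {
    private KeyPartitioner() {}

    /**
     * Simplified stand-in for Kafka's default partitioner (which uses murmur2
     * on the serialized key). Arrays.hashCode is an assumption made only to
     * keep this sketch free of dependencies.
     */
    static int partitionFor(String key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return Math.floorMod(hash, numPartitions); // floorMod keeps the result non-negative
    }
}
```

Because the result depends on the partition count, adding partitions later can route an existing key to a different partition. That is worth keeping in mind when consumers rely on per-key ordering.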

Producing and consuming with Spring Kafka

Spring Kafka wraps the native client with a KafkaTemplate for producing and @KafkaListener for consuming. Configuration is mostly YAML.

spring:
  kafka:
    bootstrap-servers: ${KAFKA_BROKERS}
    producer:
      key-serializer: org.apache.kafka.common.serialization.StringSerializer
      value-serializer: org.springframework.kafka.support.serializer.JsonSerializer
      acks: all          # wait for all in-sync replicas — durability
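    consumer:
      group-id: inventory-service
      auto-offset-reset: earliest
      enable-auto-commit: false   # commit offsets after processing, not before
      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
      value-deserializer: org.springframework.kafka.support.serializer.JsonDeserializer
      properties:
        spring.json.trusted.packages: com.example.orders   # assumption: the package that holds OrderPlaced

@Service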
class OrderEventsProducer {
  private final KafkaTemplate<String, OrderPlaced> template;
  OrderEventsProducer(KafkaTemplate<String, OrderPlaced> t) { this.template = t; }

  void publish(OrderPlaced event) {
    // key by orderId so all events for an order keep their order
    template.send("orders.placed", event.orderId(), event);
  }
}

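@Component
class InventoryListener {
  // InventoryService is a hypothetical collaborator that reserves stock
  private final InventoryService inventory;
  InventoryListener(InventoryService inventory) { this.inventory = inventory; }

  @KafkaListener(topics = "orders.placed")
  void on(OrderPlaced event) {
    inventory.reserve(event.sku(), event.qty());   // do the work
  }                                                // offset committed by the container after a successful return
}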

Two settings above are load-bearing. acks: all makes the producer wait until all in-sync replicas have the record, trading a little latency for durability. Turning off auto-commit and letting the container commit the offset only after the listener returns successfully is what gives you at-least-once delivery — if the consumer crashes mid-processing, the record is redelivered rather than silently lost.

Delivery semantics: pick one on purpose

| Semantic | Behavior | Use when |
| --- | --- | --- |
| At-most-once | Commit before processing; a crash loses the record | Lossy telemetry where speed beats completeness |
| At-least-once | Commit after processing; a crash redelivers (possible duplicates) | The pragmatic default; combine with idempotent consumers |
| Exactly-once | Idempotent producer + transactions; no loss, no duplicates within Kafka | Kafka-to-Kafka stream processing where duplicates are unacceptable |

Most services run at-least-once and make the consumer idempotent — processing the same event twice produces the same result. The cheapest way is a dedup table keyed by event ID, or designing the operation to be naturally idempotent (an upsert, a set-to-state rather than an increment).
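The dedup approach can be sketched in a few lines. This is an in-memory illustration only: the set stands in for a dedup table, and the event type OrderPlacedEvent with its eventId field is a hypothetical shape, not something the listener above defines.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical event shape; eventId is assumed to be unique per event. */
record OrderPlacedEvent(String eventId, String sku, int qty) {}

/** In-memory sketch of an idempotent consumer backed by a dedup set. */
final class IdempotentInventoryHandler {
    private final Set<String> processedEventIds = new HashSet<>(); // stands in for a dedup table
    private final Map<String, Integer> reservedBySku = new HashMap<>();

    /** Returns true if the event was applied, false if it was a duplicate. */
    boolean handle(OrderPlacedEvent event) {
        if (!processedEventIds.add(event.eventId())) {
            return false; // already seen: a redelivery becomes a no-op
        }
        reservedBySku.merge(event.sku(), event.qty(), Integer::sum);
        return true;
    }

    int reserved(String sku) {
        return reservedBySku.getOrDefault(sku, 0);
    }
}
```

In a real service the dedup insert and the business update would need to share one database transaction; otherwise a crash between the two reintroduces the duplicate (or the loss) this pattern is meant to prevent.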

Error handling and dead-letter topics

A “poison” message that always fails will, without guards, block its partition forever as the consumer retries it endlessly. Spring Kafka’s DefaultErrorHandler applies a backoff and a retry budget, then routes the record to a dead-letter topic (DLT) so the partition keeps moving and a human (or an automated process) can inspect the failures later.

@Bean
DefaultErrorHandler errorHandler(KafkaTemplate<Object, Object> template) {
  var recoverer = new DeadLetterPublishingRecoverer(template); // -> orders.placed.DLT
  return new DefaultErrorHandler(recoverer, new FixedBackOff(1000L, 3)); // 3 retries
}

Distinguish transient failures (a downstream timeout — worth retrying) from permanent ones (a malformed payload — send straight to the DLT). Retrying a deserialization error three times just wastes time before the inevitable.
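The transient-versus-permanent split can be modeled without any framework. The sketch below is not Spring Kafka's implementation; PermanentFailureException and RetryingProcessor are hypothetical names, and the list stands in for a dead-letter topic. It treats one exception type as non-retryable and gives everything else a fixed retry budget.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical marker for failures that will never succeed on retry. */
class PermanentFailureException extends RuntimeException {
    PermanentFailureException(String message) { super(message); }
}

/** Sketch of "retry transient failures, dead-letter everything else". */
final class RetryingProcessor<T> {
    private final int maxRetries;
    private final long backoffMillis;
    private final List<T> deadLetters = new ArrayList<>(); // stands in for a dead-letter topic

    RetryingProcessor(int maxRetries, long backoffMillis) {
        this.maxRetries = maxRetries;
        this.backoffMillis = backoffMillis;
    }

    /** Runs the handler and returns the number of attempts made for this record. */
    int process(T record, Consumer<T> handler) {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                handler.accept(record);
                return attempts;                   // success
            } catch (PermanentFailureException e) {
                deadLetters.add(record);           // retrying cannot help
                return attempts;
            } catch (RuntimeException e) {
                if (attempts > maxRetries) {       // retry budget exhausted
                    deadLetters.add(record);
                    return attempts;
                }
                sleep(backoffMillis);              // fixed backoff before the next attempt
            }
        }
    }

    List<T> deadLetters() { return List.copyOf(deadLetters); }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();    // preserve the interrupt flag
        }
    }
}
```

With a budget of 3 retries, a transient failure is attempted four times in total before it is dead-lettered, while a permanent failure is dead-lettered after the first attempt. In Spring Kafka, DefaultErrorHandler offers addNotRetryableExceptions for the same kind of classification.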

The dual-write problem and the outbox pattern

A subtle but critical trap: a handler that writes to its database and publishes to Kafka is doing two writes to two systems with no shared transaction. If the DB commit succeeds but the Kafka publish fails (or vice versa), your systems diverge — an order exists with no OrderPlaced event, or an event with no order.

The fix is the transactional outbox: within the same database transaction that saves the order, also insert the event into an outbox table. A separate relay process (or a change-data-capture tool like Debezium) reads the outbox and publishes to Kafka, marking rows sent. Now the business write and the “intent to publish” are atomic, and the relay guarantees the event eventually reaches Kafka at-least-once.

@Transactional
public void placeOrder(Order order) {
  orderRepository.save(order);
  outboxRepository.save(OutboxEvent.from(
      "orders.placed", order.id(), new OrderPlaced(order)));   // same TX
}
// A relay/CDC process publishes unsent outbox rows to Kafka and marks them sent.
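The relay side can be sketched as follows, assuming a simple polling relay rather than CDC. Everything here is illustrative: OutboxRow, OutboxPublisher, and OutboxRelay are hypothetical names, the list stands in for the outbox table, and the publisher interface stands in for a Kafka producer.

```java
import java.util.List;

/** Hypothetical outbox row; a real table would also carry an id and timestamps. */
final class OutboxRow {
    final String topic;
    final String key;
    final String payload;
    boolean sent;

    OutboxRow(String topic, String key, String payload) {
        this.topic = topic;
        this.key = key;
        this.payload = payload;
    }
}

/** Stand-in for a Kafka producer; may throw if the broker is unavailable. */
interface OutboxPublisher {
    void publish(String topic, String key, String payload);
}

/** Polling relay: publishes unsent rows in insertion order, then marks them sent. */
final class OutboxRelay {
    private final List<OutboxRow> outbox;   // stands in for the outbox table
    private final OutboxPublisher publisher;

    OutboxRelay(List<OutboxRow> outbox, OutboxPublisher publisher) {
        this.outbox = outbox;
        this.publisher = publisher;
    }

    /** Returns how many rows were published in this pass. */
    int drain() {
        int published = 0;
        for (OutboxRow row : outbox) {
            if (row.sent) {
                continue;
            }
            publisher.publish(row.topic, row.key, row.payload); // if this throws, the row stays unsent
            row.sent = true;                                    // marked only after a successful publish
            published++;
        }
        return published;
    }
}
```

Publishing before marking is what makes the relay at-least-once: a crash between the two steps republishes the row on the next pass, which is why downstream consumers still need to be idempotent.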

Exactly-once with Kafka transactions

For pure Kafka-to-Kafka flows (consume → transform → produce), Kafka’s transactions plus the idempotent producer give true exactly-once semantics: the produced records and the consumed offsets commit atomically. Spring Kafka enables this with a transactional producer and read_committed isolation on the consumer. Kafka Streams makes it a one-liner (processing.guarantee=exactly_once_v2). Just remember its boundary: EOS covers Kafka, not your external database — for that, you still need idempotency or the outbox.
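The atomicity claim is easier to see in a toy model. The sketch below does not use the Kafka client at all; ToyTransactionalStage is a hypothetical in-memory stand-in that stages output, then either commits the output together with the consumed offset or discards both. In the real client the corresponding steps are the KafkaProducer calls beginTransaction, sendOffsetsToTransaction, and commitTransaction or abortTransaction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

/** Toy model of consume -> transform -> produce with an atomic commit. */
final class ToyTransactionalStage {
    private final List<String> committedOutput = new ArrayList<>(); // what a read_committed reader would see
    private int committedOffset = 0;                                // next input position to read

    /**
     * Processes the record at the committed offset. The transform writes into a
     * staging list; if it throws, the staged output is dropped and the offset
     * does not move, so the record is processed again on the next call.
     */
    boolean processNext(List<String> input, BiConsumer<String, List<String>> transform) {
        if (committedOffset >= input.size()) {
            return false;                          // nothing left to read
        }
        List<String> staged = new ArrayList<>();   // writes inside the open "transaction"
        try {
            transform.accept(input.get(committedOffset), staged);
        } catch (RuntimeException e) {
            return false;                          // abort: no output visible, offset unchanged
        }
        committedOutput.addAll(staged);            // commit: output and offset advance together
        committedOffset++;
        return true;
    }

    List<String> committedOutput() { return List.copyOf(committedOutput); }

    int committedOffset() { return committedOffset; }
}
```

A failure halfway through a record leaves no partial output behind, and the retry does not duplicate anything, which is the property the transactional flow provides inside Kafka. Side effects the transform performs outside this stage, such as a database write, are not rolled back.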

Schema evolution: don’t skip this

Events are a contract between teams, and that contract will change. Use a schema registry (Avro, Protobuf, or JSON Schema) to enforce backward/forward compatibility so a producer adding a field can’t break existing consumers. The discipline: only make compatible changes (add optional fields, never remove or repurpose), and version events when you must break compatibility. Skipping this is how an event-driven platform turns into a coordination nightmare.
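The "add optional fields, never remove" discipline can be expressed as a small check. This is a sketch of the rule as stated above, not the algorithm any schema registry actually runs: each schema is modeled as a map from field name to a required flag, type changes are ignored, and SchemaCompatibility is a hypothetical helper.

```java
import java.util.Map;

/** Sketch of the compatibility rule: no removed fields, no new required fields. */
final class SchemaCompatibility {
    private SchemaCompatibility() {}

    /**
     * @param oldFields field name -> required flag for the current schema
     * @param newFields field name -> required flag for the proposed schema
     * @return true if the change only adds optional fields
     */
    static boolean isCompatibleChange(Map<String, Boolean> oldFields, Map<String, Boolean> newFields) {
        for (String name : oldFields.keySet()) {
            if (!newFields.containsKey(name)) {
                return false; // a field was removed
            }
        }
        for (Map.Entry<String, Boolean> field : newFields.entrySet()) {
            boolean added = !oldFields.containsKey(field.getKey());
            if (added && field.getValue()) {
                return false; // a new field is required, so old events cannot satisfy it
            }
        }
        return true;
    }
}
```

For example, adding an optional giftWrap field to an event with required orderId and sku passes the check, while dropping sku or adding a required field does not.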

Operational notes

Takeaways

Event-driven Java on Kafka buys you loose coupling, replayability, and independent scaling — but the durability guarantees you actually get depend on choices you make: key for ordering, size partitions for parallelism, run at-least-once with idempotent consumers, dead-letter your poison messages, and use the outbox pattern to keep your database and your events consistent. Get those right and Kafka becomes a reliable nervous system for the whole platform rather than a source of mysterious data drift.

Frequently asked questions

Does Kafka guarantee exactly-once delivery?
Kafka supports exactly-once semantics (EOS) within a Kafka-to-Kafka transactional flow using idempotent producers and transactions. End-to-end exactly-once to an external system (like a database) is not automatic — you achieve effective exactly-once with the transactional outbox pattern plus idempotent consumers.

How many partitions should a Kafka topic have?
Partitions are the unit of parallelism: a consumer group can have at most one active consumer per partition. Size for your target throughput and peak consumer count, allow headroom (you can add partitions but not remove them), and avoid going excessively high since each partition adds broker overhead.
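As a rough illustration of that sizing advice, the arithmetic might look like the sketch below. The formula and every input are assumptions for illustration (per-partition throughput in particular has to be measured on your own cluster), and PartitionSizing is a hypothetical helper, not a Kafka tool.

```java
/** Back-of-the-envelope partition sizing; all inputs are assumptions to be measured. */
final class PartitionSizing {
    private PartitionSizing() {}

    /**
     * Takes the larger of "partitions needed for throughput" and "peak consumer
     * count", then adds headroom as a fraction (0.5 means 50% extra).
     */
    static int suggestPartitions(double targetMbPerSec,
                                 double perPartitionMbPerSec,
                                 int peakConsumers,
                                 double headroomFraction) {
        if (targetMbPerSec <= 0 || perPartitionMbPerSec <= 0
                || peakConsumers <= 0 || headroomFraction < 0) {
            throw new IllegalArgumentException("inputs must be positive (headroom may be zero)");
        }
        int forThroughput = (int) Math.ceil(targetMbPerSec / perPartitionMbPerSec);
        int base = Math.max(forThroughput, peakConsumers);
        return (int) Math.ceil(base * (1.0 + headroomFraction));
    }
}
```

With a hypothetical target of 100 MB/s, 10 MB/s per partition, 6 peak consumers, and 50% headroom, the throughput term dominates (10 partitions) and the suggestion is 15.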
