When a single user request fans out across a dozen Java microservices, “check the logs” stops being a strategy. You need to ask, “show me everything that happened for this request, across every service, in order” — and get an answer in seconds. That is what observability delivers, built on three pillars: structured logs, metrics, and distributed traces. This deep dive shows how enterprises wire it up for Spring Boot using the Elastic (ELK) stack and OpenTelemetry.
Each pillar is weak on its own. A metric tells you latency spiked but not why; a trace shows the slow service but not the exception detail; a log has the exception but no context about the broader request. The power comes from correlating them, and the correlation key is the trace ID.
The first step is to stop emitting free-text logs. Machine-parseable JSON, indexed in Elasticsearch, turns logs from a haystack into a queryable database. Use Logback with a JSON encoder, and put contextual fields in the MDC (Mapped Diagnostic Context) so every line carries the request’s identity.
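<!-- logback-spring.xml -->
<configuration>
  <appender name="JSON" class="ch.qos.logback.core.ConsoleAppender">
    <!-- Requires the logstash-logback-encoder dependency -->
    <encoder class="net.logstash.logback.encoder.LogstashEncoder">
      <!-- Only these MDC keys are copied into each JSON event -->
      <includeMdcKeyName>traceId</includeMdcKeyName>
      <includeMdcKeyName>spanId</includeMdcKeyName>
      <!-- Static field added to every event; "orders" is this example's service name -->
      <customFields>{"service":"orders"}</customFields>
    </encoder>
  </appender>
  <root level="INFO">
    <appender-ref ref="JSON" />
  </root>
</configuration>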
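To make the MDC idea concrete, here is a minimal plain-Java sketch of what the MDC and a JSON encoder do together. `RequestContext` and `JsonLineLogger` are hypothetical stand-ins written for illustration, not SLF4J or Logback classes, and the trace ID value is made up. A real service would use `org.slf4j.MDC` and a JSON encoder rather than hand-rolled formatting.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for SLF4J's MDC: a per-thread map of context fields.
final class RequestContext {
    private static final ThreadLocal<Map<String, String>> FIELDS =
            ThreadLocal.withInitial(LinkedHashMap::new);

    static void put(String key, String value) {
        FIELDS.get().put(key, value);
    }

    // Call at the end of a request so pooled threads do not leak old context.
    static void clear() {
        FIELDS.remove();
    }

    static Map<String, String> snapshot() {
        return new LinkedHashMap<>(FIELDS.get());
    }
}

// Hypothetical stand-in for a JSON encoder: merges context fields into each event.
final class JsonLineLogger {
    static String format(String level, String message) {
        Map<String, String> event = new LinkedHashMap<>();
        event.put("level", level);
        event.put("message", message);
        event.putAll(RequestContext.snapshot()); // context rides along on every line

        StringBuilder json = new StringBuilder("{");
        for (Map.Entry<String, String> field : event.entrySet()) {
            if (json.length() > 1) {
                json.append(',');
            }
            json.append('"').append(escape(field.getKey())).append("\":\"")
                .append(escape(field.getValue())).append('"');
        }
        return json.append('}').toString();
    }

    // Escapes only backslashes and double quotes, which is enough for this sketch.
    private static String escape(String raw) {
        return raw.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

Calling `RequestContext.put("traceId", "abc123")` and then `JsonLineLogger.format("INFO", "order placed")` yields `{"level":"INFO","message":"order placed","traceId":"abc123"}`. The calling code never mentions the trace ID when it logs, which is the point of putting request identity in the MDC.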
You rarely set traceId by hand — the tracing library injects it into the MDC for you (next section). The result is log lines like this, where every event for a request shares one traceId:
{"@timestamp":"2026-06-19T14:02:11.503Z","level":"INFO",
 "logger_name":"com.acme.OrderService","message":"order placed",
 "service":"orders","traceId":"7d3a...","spanId":"a91f...","orderId":"O-8842"}
In Spring Boot 2, tracing meant Spring Cloud Sleuth. In Spring Boot 3, Sleuth is replaced by Micrometer Tracing, a vendor-neutral facade that bridges to either OpenTelemetry or Brave and propagates context automatically. OpenTelemetry (OTel) is the CNCF standard for generating and exporting telemetry; pairing the two is the current enterprise default.
<dependency>
  <groupId>io.micrometer</groupId>
  <artifactId>micrometer-tracing-bridge-otel</artifactId>
</dependency>
<dependency>
  <groupId>io.opentelemetry</groupId>
  <artifactId>opentelemetry-exporter-otlp</artifactId>
</dependency>
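With those on the classpath (alongside spring-boot-starter-actuator, which hosts the tracing auto-configuration), Spring Boot instruments incoming HTTP requests, outgoing RestClient/WebClient calls made through the auto-configured builders, and messaging clients that support Micrometer Observation. A trace represents the whole request; each unit of work is a span; and the trace context (the W3C traceparent header) is propagated across service boundaries automatically, so service B’s spans attach to the same trace that started in service A.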
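The traceparent header itself is simple enough to sketch. In version 00 of the W3C Trace Context format it has four dash-separated parts: the version, a 32-hex-character trace ID, a 16-hex-character parent span ID, and a 2-hex-character flags byte whose lowest bit marks the trace as sampled. The `TraceParent` record below is a hypothetical illustration of that format, not the code Micrometer Tracing or OpenTelemetry runs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Parsed form of a W3C traceparent header (version 00 only).
record TraceParent(String traceId, String parentSpanId, boolean sampled) {

    private static final Pattern FORMAT =
            Pattern.compile("00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})");

    // Returns null for headers that are malformed or carry all-zero IDs.
    static TraceParent parse(String header) {
        if (header == null) {
            return null;
        }
        Matcher m = FORMAT.matcher(header.trim());
        if (!m.matches()) {
            return null;
        }
        String traceId = m.group(1);
        String spanId = m.group(2);
        if (traceId.equals("0".repeat(32)) || spanId.equals("0".repeat(16))) {
            return null;
        }
        boolean sampled = (Integer.parseInt(m.group(3), 16) & 0x01) == 1;
        return new TraceParent(traceId, spanId, sampled);
    }

    // Header a service sends downstream: same trace ID, its own span ID as the parent.
    String toHeaderForChild(String ownSpanId) {
        return "00-" + traceId + "-" + ownSpanId + "-" + (sampled ? "01" : "00");
    }
}
```

Service B would parse the incoming header, create a span with a fresh span ID, and call `toHeaderForChild` with that ID when it calls service C. The trace ID never changes along the way, which is why every span and log line for the request can be joined on it.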
management:
  tracing:
    sampling:
      probability: 0.1   # sample 10% in high-traffic prod
  otlp:
    tracing:
      endpoint: http://otel-collector:4318/v1/traces
Sampling is a real decision: tracing 100% of requests is expensive at scale, so teams sample a percentage (or use tail-based sampling in the OTel Collector to keep all error/slow traces while sampling the rest). To enrich a trace with a custom span, the simplest route is the @Observed annotation, which needs Spring AOP and an ObservedAspect bean to take effect:
@Observed(name = "inventory.reserve") // Micrometer Observation -> span + metric
public Reservation reserve(String sku, int qty) { /* ... */ }
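Returning to the sampling trade-off, the tail-based idea (keep every error or slow trace, sample the rest) can be sketched as a small decision function. `CompletedTrace` and `TailSamplingPolicy` are hypothetical names, and the threshold and probability are example values. This is not how the OTel Collector implements tail sampling, only an illustration of the rule it applies once a trace has finished.

```java
import java.time.Duration;

// Hypothetical summary of a finished trace, as a tail sampler would see it.
record CompletedTrace(String traceId, boolean hasError, Duration duration) {}

final class TailSamplingPolicy {
    private final Duration slowThreshold;
    private final double baselineProbability;

    TailSamplingPolicy(Duration slowThreshold, double baselineProbability) {
        this.slowThreshold = slowThreshold;
        this.baselineProbability = baselineProbability;
    }

    boolean keep(CompletedTrace trace) {
        if (trace.hasError()) {
            return true; // always keep failures
        }
        if (trace.duration().compareTo(slowThreshold) >= 0) {
            return true; // always keep slow requests
        }
        // Everything else is kept at the baseline rate.
        return ratioOf(trace.traceId()) < baselineProbability;
    }

    // Maps the last 8 hex characters of the trace ID to [0, 1), so every
    // component that sees the same trace ID reaches the same decision.
    static double ratioOf(String traceId) {
        String tail = traceId.substring(traceId.length() - 8);
        return Long.parseLong(tail, 16) / (double) (1L << 32);
    }
}
```

Deriving the decision from the trace ID rather than from a random number is a deliberate choice in this sketch: the outcome is reproducible, so a trace is never half kept and half dropped.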
ELK is the classic log platform: Elasticsearch stores and indexes, Logstash transforms, and Kibana visualizes. Modern deployments usually ship logs with lightweight Beats (Filebeat) or the OTel Collector rather than running heavyweight Logstash everywhere — collecting the JSON your services already write to stdout.
# filebeat.yml: tail container stdout, send to Elasticsearch
filebeat.inputs:
  - type: container
    paths: ["/var/log/containers/*.log"]
processors:
  # Parse each JSON log line into top-level fields
  - decode_json_fields:
      fields: ["message"]
      target: ""
      overwrite_keys: true
output.elasticsearch:
  hosts: ["https://elasticsearch:9200"]
# Once parsed, traceId/service/level become first-class,
# filterable fields in Kibana.
In Kibana you can now filter to service: "orders" AND level: "ERROR", or paste a traceId and see every log line from every service for that one request, in timestamp order.
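What that trace ID lookup does can be sketched in a few lines of plain Java: filter events from all services down to one trace ID, then sort by timestamp. `LogEvent` and `TraceLogView` are hypothetical names for illustration. In practice Elasticsearch runs the equivalent query over indexed documents rather than an in-memory list.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;

// Hypothetical parsed log line; the fields mirror the JSON keys shown earlier.
record LogEvent(Instant timestamp, String service, String traceId, String message) {}

final class TraceLogView {
    // Returns every event for one request, across services, oldest first.
    static List<LogEvent> forTrace(List<LogEvent> allEvents, String traceId) {
        return allEvents.stream()
                .filter(event -> traceId.equals(event.traceId()))
                .sorted(Comparator.comparing(LogEvent::timestamp))
                .toList();
    }
}
```

Because the filter only needs the `traceId` field, it works the same whether the events came from one service or twelve, as long as each service logs the propagated ID.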
This is where it comes together. You spot a latency spike on a metrics dashboard, open the trace in Jaeger or Grafana Tempo, and see the time was spent in the inventory service’s database span. You copy that trace’s ID, paste it into Kibana, and instantly see the exact SQL warning and stack trace logged during that span. Metric → trace → log, joined by the trace ID, in under a minute. Without correlation, that investigation is hours of guesswork across disconnected systems.
Most enterprises do not self-host this whole stack. The cloud-native equivalents:
| Concern | AWS | Azure |
|---|---|---|
| Logs | CloudWatch Logs, or Amazon OpenSearch Service (managed OpenSearch and OpenSearch Dashboards, the Elasticsearch/Kibana forks) | Azure Monitor Logs / Log Analytics (KQL) |
| Traces | AWS X-Ray | Application Insights (distributed tracing) |
| Metrics | CloudWatch Metrics | Azure Monitor Metrics |
| Ingest | OTel Collector / CloudWatch agent | OTel Collector / App Insights agent |
Because OpenTelemetry is vendor-neutral, the smart move is to instrument with OTel once and point the exporter at whichever backend (self-hosted ELK + Jaeger, AWS X-Ray, Azure Monitor, Datadog) your org runs — switching backends becomes a config change, not a re-instrumentation project.
Observability for Java microservices is a pipeline: structured JSON logs carrying a trace ID, metrics for dashboards and alerts, and distributed traces from Micrometer Tracing + OpenTelemetry — all correlated so you can pivot freely between them. Build it on open standards (OTel) so you stay portable across ELK, Jaeger/Tempo, AWS X-Ray, and Azure Monitor, and you turn “it’s slow somewhere” into a precise, minutes-long diagnosis.
What replaced Spring Cloud Sleuth for tracing?
In Spring Boot 3, Spring Cloud Sleuth was replaced by Micrometer Tracing, which bridges to OpenTelemetry or Brave and exports to backends like Jaeger, Tempo, or Zipkin. It auto-propagates trace and span IDs and integrates them into your logs via MDC.
How do you correlate logs with distributed traces?
Put the trace ID and span ID into every log line (Micrometer Tracing adds them to the MDC automatically), emit logs as JSON, and index them in Elasticsearch. You can then jump from a slow span in Jaeger/Tempo to the exact log lines for that trace ID in Kibana.