
Distributed Tracing

By Atif Alam

Distributed tracing follows a single request as it travels through multiple services, showing where time is spent and which service is the bottleneck. It completes the “three pillars” of observability alongside metrics and logs.

Metrics tell you something is slow. Logs tell you what happened in one service. Traces tell you the full story of a request across every service it touched.

User ──► API Gateway ──► Auth Service ──► User DB
                    └──► Product Service ──► Cache
                                        └──► Product DB

Without tracing, debugging “why is this API call slow?” means correlating timestamps across logs from 5+ services. With tracing, you get one view that shows the exact latency of every hop.

A span represents a single unit of work — an HTTP handler, a database query, a gRPC call. Each span records:

Field              What It Stores
Trace ID           Unique ID shared by all spans in the same request
Span ID            Unique ID for this specific span
Parent Span ID     The span that triggered this one (builds the tree)
Operation name     What the span represents (e.g. GET /api/users)
Start / end time   When the operation started and finished
Attributes         Key-value pairs (http.status_code=200, db.system=postgresql)
Status             OK, Error, or Unset
Events             Timestamped annotations (e.g. "cache miss at 14ms")

A trace is a tree of spans — the root span is the entry point (e.g. the API gateway), and child spans are downstream calls:

Trace ID: abc123
[API Gateway]──────────────────────────────── 250ms
  ├─[Auth Service]──────── 40ms
  │    └─[User DB query]── 15ms
  └─[Product Service]──────────────── 180ms
       ├─[Cache lookup]── 2ms (cache miss)
       └─[Product DB query]────── 160ms ← bottleneck

This immediately shows the Product DB query is the bottleneck.
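To make the span/trace relationship concrete, here is a minimal pure-Python sketch (not a real tracing SDK) that models the trace above as a flat list of spans and finds the bottleneck by looking for the slowest leaf span. The `Span` dataclass, span IDs, and durations are illustrative, taken from the diagram:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Span:
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]  # None for the root span
    name: str
    duration_ms: float

# The trace above, flattened into a list of spans (as a backend stores them)
spans = [
    Span("abc123", "s1", None, "API Gateway", 250),
    Span("abc123", "s2", "s1", "Auth Service", 40),
    Span("abc123", "s3", "s2", "User DB query", 15),
    Span("abc123", "s4", "s1", "Product Service", 180),
    Span("abc123", "s5", "s4", "Cache lookup", 2),
    Span("abc123", "s6", "s4", "Product DB query", 160),
]

def children(parent: Span) -> list[Span]:
    return [s for s in spans if s.parent_span_id == parent.span_id]

def bottleneck() -> Span:
    # a leaf span (no children) with the longest duration
    leaves = [s for s in spans if not children(s)]
    return max(leaves, key=lambda s: s.duration_ms)

print(bottleneck().name)  # Product DB query
```

The parent span ID is what lets a backend rebuild the tree from an unordered pile of spans arriving from different services.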

For tracing to work across services, the trace ID must be passed between them. This is called context propagation.

How it works:

  1. Service A creates a span and generates a trace ID.
  2. Service A adds the trace ID to the outgoing HTTP headers.
  3. Service B reads the trace ID from the incoming headers.
  4. Service B creates a child span using the same trace ID.
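The four steps can be sketched in pure Python, simulating HTTP headers with a dict. The `make_traceparent` helper and the service names are illustrative; a real service would use an OpenTelemetry propagator rather than hand-rolling this:

```python
import secrets

def make_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    # W3C format: version "00", then trace ID, parent span ID, and trace flags
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"

# Steps 1-2: Service A starts a span and injects the trace ID into headers
trace_id = secrets.token_hex(16)  # 32 hex chars
span_a = secrets.token_hex(8)     # 16 hex chars
outgoing_headers = {"traceparent": make_traceparent(trace_id, span_a)}

# Steps 3-4: Service B reads the header and starts a child span
_version, incoming_trace_id, parent_span_id, _flags = (
    outgoing_headers["traceparent"].split("-")
)
child_span = {
    "trace_id": incoming_trace_id,    # same trace ID as Service A
    "span_id": secrets.token_hex(8),  # new span ID for this unit of work
    "parent_span_id": parent_span_id, # links back to Service A's span
}
```

The key invariant: the trace ID never changes as it crosses service boundaries, while each service mints a fresh span ID and records the caller's span ID as its parent.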

Common propagation formats:

Format                         Header                      Used By
W3C Trace Context (standard)   traceparent, tracestate     OpenTelemetry, most modern tools
B3                             X-B3-TraceId, X-B3-SpanId   Zipkin, older Jaeger
Jaeger                         uber-trace-id               Jaeger native

W3C Trace Context is the recommended standard. OpenTelemetry uses it by default.

Example traceparent header:

traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
             │  │                                │                │
             │  │                                │                └─ sampled (01 = yes)
             │  │                                └─ parent span ID
             │  └─ trace ID
             └─ version
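A minimal parser for this layout, using the example header from above (the `parse_traceparent` name is ours; real code would use an OpenTelemetry propagator, which also validates lengths and rejects malformed headers):

```python
def parse_traceparent(header: str) -> dict:
    # split the four dash-separated fields of a W3C traceparent header
    version, trace_id, parent_span_id, flags = header.split("-")
    return {
        "version": version,
        "trace_id": trace_id,                     # 32 hex chars
        "parent_span_id": parent_span_id,         # 16 hex chars
        "sampled": int(flags, 16) & 0x01 == 1,    # bit 0 of trace-flags
    }

ctx = parse_traceparent(
    "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
)
print(ctx["trace_id"])  # 4bf92f3577b34da6a3ce929d0e0e4736
print(ctx["sampled"])   # True
```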

In production, tracing every request generates an enormous volume of span data. Sampling reduces this volume while keeping the useful traces.

Strategy        How It Works                                                 When to Use
Head-based      Decide at the start of the request whether to trace          Simple; low overhead
Tail-based      Collect all spans, then decide after the request completes   Keep errors and slow requests; discard healthy ones
Rate limiting   Trace N requests per second                                  Predictable volume
Probabilistic   Trace X% of requests                                         Simple; statistically representative

Head-based is simpler but might miss interesting requests. Tail-based captures all errors and slow requests but requires a collector to buffer spans before deciding.
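Both decision styles fit in a few lines of pure Python. The head-based sketch below treats the trace ID's low 8 bytes as a uniform random value, similar in spirit to OpenTelemetry's TraceIdRatioBased sampler; the tail-based sketch and its thresholds (any error, or any span over 1s) are illustrative, mirroring the collector config shown below:

```python
def head_sample(trace_id_hex: str, ratio: float) -> bool:
    # Head-based: decide before the request runs, using only the trace ID.
    # Deterministic, so every service makes the same decision for a trace.
    return int(trace_id_hex[-16:], 16) < ratio * 2**64

def tail_keep(spans: list[dict]) -> bool:
    # Tail-based: decide after the request completes, with all spans in hand.
    # Keep traces containing any error or any span slower than 1s.
    return any(
        s["status"] == "ERROR" or s["duration_ms"] > 1000 for s in spans
    )

print(head_sample("f" * 32, 0.5))  # False - high trace IDs fall outside 50%
print(tail_keep([{"status": "OK", "duration_ms": 1200}]))  # True - slow span
```

Deriving the head-based decision from the trace ID (rather than a random roll per service) is what keeps a trace consistently sampled or dropped across every service it touches.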

# OTel Collector tail-based sampling
processors:
  tail_sampling:
    policies:
      - name: errors
        type: status_code
        status_code: {status_codes: [ERROR]}
      - name: slow-requests
        type: latency
        latency: {threshold_ms: 1000}
      - name: sample-rest
        type: probabilistic
        probabilistic: {sampling_percentage: 10}

Jaeger is an open-source, end-to-end distributed tracing system, originally built by Uber.

┌─────────┐     ┌──────────────┐     ┌──────────────┐     ┌────────────┐
│  Apps   │────►│    Jaeger    │────►│    Jaeger    │────►│  Storage   │
│ (OTel)  │     │  Collector   │     │   Ingester   │     │ (ES/       │
└─────────┘     └──────────────┘     │  (optional)  │     │  Cassandra)│
                                     └──────────────┘     └──────┬─────┘
                ┌──────────────┐                                 │
                │  Jaeger UI   │◄────────────────────────────────┘
                │   (query)    │
                └──────────────┘
# all-in-one image includes collector, query, and UI with in-memory storage
# ports: 16686 = Jaeger UI, 4317 = OTLP gRPC, 4318 = OTLP HTTP
docker run -d --name jaeger \
  -p 16686:16686 \
  -p 4317:4317 \
  -p 4318:4318 \
  jaegertracing/all-in-one:latest

Open http://localhost:16686 to access the Jaeger UI.

The Jaeger UI lets you:

  • Search by service — Select a service and time range to see recent traces.
  • Search by trace ID — Paste a trace ID directly (useful when you find it in a log line).
  • Filter by tags — Find traces where http.status_code=500 or error=true.
  • Compare traces — Side-by-side comparison of two traces to spot differences.

Grafana Tempo is a high-scale, cost-effective trace backend that integrates natively with Grafana. Unlike Jaeger, Tempo stores traces in object storage (S3, GCS, Azure Blob) — no Elasticsearch or Cassandra needed.

Feature               Jaeger                     Tempo
Storage               Elasticsearch, Cassandra   Object storage (S3, GCS)
Cost at scale         Higher (indexed storage)   Lower (no indexing, cheap storage)
Search                Full search by tags        Trace ID lookup + TraceQL
Grafana integration   Plugin                     Native
Index                 Full index of tags         Minimal index (by trace ID)

Tempo trades full tag-based search for much lower storage costs. It compensates with:

  • Trace ID lookup — Find a trace if you have its ID (from logs or metrics).
  • TraceQL — A query language for searching traces by structure and attributes.

TraceQL lets you search traces by span attributes, duration, and structure:

# Find traces where an HTTP span returned 500 and took > 1s
{ span.http.status_code = 500 && duration > 1s }
# Find traces that touched the "payments" service
{ resource.service.name = "payments" }
# Find traces where a database query was slow
{ span.db.system = "postgresql" && duration > 500ms }

┌──────────┐     ┌──────────────┐     ┌──────────────┐
│   Apps   │────►│    Tempo     │────►│    Object    │
│  (OTel)  │     │ Distributor  │     │   Storage    │
└──────────┘     └──────────────┘     │   (S3/GCS)   │
                                      └──────┬───────┘
                 ┌──────────────┐            │
                 │    Tempo     │◄───────────┘
                 │   Querier    │
                 └──────┬───────┘
                        ▼
                 ┌──────────────┐
                 │   Grafana    │
                 └──────────────┘

# docker-compose snippet
tempo:
  image: grafana/tempo:latest
  command: ["-config.file=/etc/tempo.yaml"]
  volumes:
    - ./tempo.yaml:/etc/tempo.yaml
  ports:
    - "4317:4317"   # OTLP gRPC
    - "3200:3200"   # Tempo query API

# tempo.yaml
server:
  http_listen_port: 3200

distributor:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: 0.0.0.0:4317

storage:
  trace:
    backend: local        # use s3/gcs in production
    local:
      path: /tmp/tempo/blocks
    wal:
      path: /tmp/tempo/wal

metrics_generator:
  processor:
    service_graphs:
      enabled: true
    span_metrics:
      enabled: true
  storage:
    path: /tmp/tempo/generator/wal
    remote_write:
      - url: http://prometheus:9090/api/v1/write

The metrics_generator in Tempo can automatically create RED metrics (Rate, Errors, Duration) from traces and push them to Prometheus — bridging traces and metrics.
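The computation itself is simple enough to sketch in a few lines of plain Python. The span list, window size, and field names here are illustrative, not Tempo's internal representation:

```python
# Hypothetical finished spans for one service over a 60-second window
spans = [
    {"status": "OK", "duration_ms": 120},
    {"status": "OK", "duration_ms": 80},
    {"status": "ERROR", "duration_ms": 950},
    {"status": "OK", "duration_ms": 60},
]
window_s = 60

# R - Rate: requests per second over the window
rate = len(spans) / window_s

# E - Errors: fraction of requests that failed
error_rate = sum(s["status"] == "ERROR" for s in spans) / len(spans)

# D - Duration: a latency percentile (naive median here)
durations = sorted(s["duration_ms"] for s in spans)
p50 = durations[len(durations) // 2]

print(error_rate)  # 0.25
print(p50)         # 120
```

A metrics generator does this continuously and exports the results as Prometheus series, so you get service-level dashboards without instrumenting metrics separately.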

The real power of distributed tracing comes from correlating all three signals.

Embed the trace ID in your log lines:

import logging

from opentelemetry import trace

logger = logging.getLogger(__name__)

def handle_request():
    span = trace.get_current_span()
    trace_id = format(span.get_span_context().trace_id, "032x")
    logger.info("Processing request", extra={"trace_id": trace_id})

In Grafana, configure a derived field in Loki to make trace IDs clickable — clicking a trace ID in a log line jumps directly to the trace in Tempo.

# Grafana Loki data source config - derived fields
derivedFields:
  - name: TraceID
    matcherRegex: "trace_id=(\\w+)"
    url: "${__value.raw}"     # the captured trace ID becomes the Tempo query
    datasourceUid: tempo
    urlDisplayLabel: "View Trace"

Exemplars attach a trace ID to a specific metric data point. When you see a spike in latency on a Grafana dashboard, click the exemplar dot to jump to the exact trace that caused it.

              ● ← exemplar (trace_id=abc123)
      ┌───────┴─────┐
p99 ──┤             │
      │     ────────┤
p50 ──┤             │
      └─────────────┴──────────── time

In Prometheus, exemplars are stored alongside histogram buckets. Grafana displays them as dots on graphs.
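In the OpenMetrics exposition format, an exemplar is appended to a histogram bucket line after a `#` marker. The metric name, bucket, and trace ID below are illustrative:

```
# a histogram bucket with an attached exemplar: 42 observations fell in the
# le="1.0" bucket; one of them (0.89s) carries trace_id=abc123
http_request_duration_seconds_bucket{le="1.0"} 42 # {trace_id="abc123"} 0.89
```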

Dashboard spike → click exemplar → open trace → see slow span →
click trace_id in logs → see error message → fix bug

This is the core value of distributed tracing in an observable system.

  • Distributed tracing follows requests across services — showing latency, dependencies, and bottlenecks as a span tree.
  • Context propagation (W3C Trace Context headers) carries trace IDs between services.
  • Sampling (head-based or tail-based) controls trace volume in production.
  • Jaeger is a mature, full-featured tracing backend with tag search and a rich UI.
  • Grafana Tempo uses cheap object storage and TraceQL for cost-effective tracing at scale.
  • Correlating traces with logs and metrics (via trace IDs and exemplars) is what makes distributed tracing truly powerful.