Event Hubs
Azure Event Hubs is a fully managed, real-time data streaming platform capable of processing millions of events per second. It’s Azure’s equivalent of Apache Kafka or AWS Kinesis.
Event Hubs vs Other Messaging Services
| Service | Model | Throughput | Retention | Best For |
|---|---|---|---|---|
| Event Hubs | Streaming (partitioned log) | Millions/sec | 1–90 days (or Capture) | Telemetry, logs, analytics |
| Service Bus | Message broker (queues/topics) | Thousands/sec | Up to 14 days | Transactional messaging |
| Event Grid | Event routing (push) | Millions/sec | 24 hr retry | Reacting to events |
| Queue Storage | Simple queue | Moderate | 7 days | Basic task queuing |
Rule of thumb: Use Event Hubs when you need high-throughput, ordered, replayable event streaming. Use Service Bus when you need reliable message processing with features like sessions, dead-letter, and transactions.
Core Concepts
Namespace, Event Hub, Partitions, Consumer Groups
```
Event Hubs Namespace (container, billing unit)
└── Event Hub: "telemetry-events" (like a Kafka topic)
    ├── Partition 0: [e1] [e4] [e7] [e10] ...
    ├── Partition 1: [e2] [e5] [e8] [e11] ...
    ├── Partition 2: [e3] [e6] [e9] [e12] ...
    │
    ├── Consumer Group: "$Default"   (each CG reads all partitions independently)
    └── Consumer Group: "analytics"
```
| Concept | Description |
|---|---|
| Namespace | Container for one or more Event Hubs; defines the billing and throughput tier |
| Event Hub | A named stream (equivalent to a Kafka topic) |
| Partition | Ordered sequence of events; enables parallel reads; partition key determines placement |
| Consumer group | Independent view of the stream; each group tracks its own offset per partition |
| Event | A data record (body + properties + metadata) |
Partitions and Ordering
Events with the same partition key go to the same partition, guaranteeing order for that key:
```
Partition key: "device-123" → always goes to Partition 1
Partition key: "device-456" → always goes to Partition 0
```
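The placement rule can be sketched as a stable hash of the key modulo the partition count. This is illustrative only — Event Hubs uses its own internal hash, so actual partition assignments will differ — but the invariant (same key, same partition) is the same:

```python
import hashlib

def pick_partition(partition_key: str, partition_count: int) -> int:
    """Stable key -> partition mapping (illustrative; not Event Hubs' real hash)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count

# Same key, same partition — every time, on every producer instance
assert pick_partition("device-123", 4) == pick_partition("device-123", 4)
```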
Within Partition 1, events for device-123 are strictly ordered. Events across partitions have no ordering guarantee.
Creating an Event Hub
```bash
# Create a namespace
az eventhubs namespace create \
  --resource-group myapp-rg \
  --name myapp-events \
  --sku Standard \
  --location eastus

# Create an Event Hub with 4 partitions, 7-day retention
az eventhubs eventhub create \
  --resource-group myapp-rg \
  --namespace-name myapp-events \
  --name telemetry \
  --partition-count 4 \
  --message-retention 7

# Create a consumer group
az eventhubs eventhub consumer-group create \
  --resource-group myapp-rg \
  --namespace-name myapp-events \
  --eventhub-name telemetry \
  --name analytics
```
Sending Events (Producers)
Python SDK
```python
from azure.eventhub import EventHubProducerClient, EventData
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()
producer = EventHubProducerClient(
    fully_qualified_namespace="myapp-events.servicebus.windows.net",
    eventhub_name="telemetry",
    credential=credential,
)

# A "with" block closes the client on exit, so do all sends inside one block
with producer:
    # Send a batch of events
    batch = producer.create_batch()
    batch.add(EventData('{"device": "sensor-1", "temp": 22.5}'))
    batch.add(EventData('{"device": "sensor-2", "temp": 23.1}'))
    producer.send_batch(batch)

    # Send with a partition key (order guarantee for this key)
    batch = producer.create_batch(partition_key="device-123")
    batch.add(EventData('{"temp": 22.5, "ts": "2026-02-17T10:00:00Z"}'))
    batch.add(EventData('{"temp": 22.7, "ts": "2026-02-17T10:01:00Z"}'))
    producer.send_batch(batch)
```
Other Producer Options
- Azure CLI — `az eventhubs eventhub send`
- Kafka protocol — Event Hubs supports the Apache Kafka protocol (no code changes for Kafka producers).
- Azure Functions — Event Hubs output binding.
- Application Insights / Azure Monitor — Can export to Event Hubs.
Receiving Events (Consumers)
Python SDK with Checkpointing
```python
# pip install azure-eventhub azure-eventhub-checkpointstoreblob azure-identity
from azure.eventhub import EventHubConsumerClient
from azure.eventhub.extensions.checkpointstoreblob import BlobCheckpointStore
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()

# Checkpoint store: tracks which events each consumer group has processed
checkpoint_store = BlobCheckpointStore(
    blob_account_url="https://myappstorage.blob.core.windows.net",
    container_name="eventhub-checkpoints",
    credential=credential,
)

consumer = EventHubConsumerClient(
    fully_qualified_namespace="myapp-events.servicebus.windows.net",
    eventhub_name="telemetry",
    consumer_group="analytics",
    credential=credential,
    checkpoint_store=checkpoint_store,
)

def on_event(partition_context, event):
    body = event.body_as_str()
    print(f"Partition {partition_context.partition_id}: {body}")
    # Checkpoint: save progress so we don't re-process on restart
    partition_context.update_checkpoint(event)

with consumer:
    consumer.receive(on_event=on_event, starting_position="-1")  # -1 = beginning
```
Checkpointing
The consumer SDK uses Azure Blob Storage to store checkpoints — the last processed offset per partition per consumer group. This enables:
- Resume after restart — Pick up where you left off.
- Load balancing — Multiple consumer instances share partitions automatically.
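Checkpointing after every event means one blob write per event. A common compromise is to checkpoint every N events per partition, accepting that up to N−1 events may be re-processed after a crash. A minimal sketch of that policy (the `CheckpointPolicy` class is hypothetical, not part of the SDK):

```python
class CheckpointPolicy:
    """Checkpoint every N events per partition (sketch).

    Trades a bounded re-processing window (at most N-1 events per partition)
    for far fewer checkpoint-store writes.
    """

    def __init__(self, every: int = 100):
        self.every = every
        self._counts: dict[str, int] = {}  # events seen since last checkpoint

    def should_checkpoint(self, partition_id: str) -> bool:
        n = self._counts.get(partition_id, 0) + 1
        if n >= self.every:
            self._counts[partition_id] = 0
            return True
        self._counts[partition_id] = n
        return False

# Inside on_event, checkpoint only when the policy fires:
#   if policy.should_checkpoint(partition_context.partition_id):
#       partition_context.update_checkpoint(event)
```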
Azure Functions Trigger
```python
import azure.functions as func
import logging

app = func.FunctionApp()

@app.event_hub_message_trigger(
    arg_name="events",
    event_hub_name="telemetry",
    connection="EventHubConnection",
    consumer_group="$Default",
    cardinality="many",
)
def process_telemetry(events: list[func.EventHubEvent]):
    for event in events:
        logging.info(f"Event: {event.get_body().decode()}")
```
Kafka Compatibility
Event Hubs exposes a Kafka-compatible endpoint — existing Kafka producers and consumers work with zero code changes:
```
# Kafka producer config pointing to Event Hubs
bootstrap.servers=myapp-events.servicebus.windows.net:9093
security.protocol=SASL_SSL
sasl.mechanism=OAUTHBEARER
# ... (Azure identity-based auth)
```
Use cases:
- Migrate from self-managed Kafka to a fully managed service.
- Use Kafka ecosystem tools (Kafka Connect, Kafka Streams) with Event Hubs as the backend.
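As a sketch, a kafka-python producer can target the endpoint using Event Hubs' connection-string authentication pattern — SASL PLAIN with the literal username `$ConnectionString` and the namespace connection string as the password. The connection string below is a placeholder:

```python
# kafka-python settings for Event Hubs' Kafka endpoint (sketch).
# With SAS auth the SASL username is literally "$ConnectionString" and the
# password is the namespace connection string (placeholder here).
producer_config = {
    "bootstrap_servers": "myapp-events.servicebus.windows.net:9093",
    "security_protocol": "SASL_SSL",
    "sasl_mechanism": "PLAIN",
    "sasl_plain_username": "$ConnectionString",
    "sasl_plain_password": "<namespace connection string>",
}
# from kafka import KafkaProducer
# producer = KafkaProducer(**producer_config)  # then use like any Kafka producer
```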
Event Hubs Capture
Capture automatically writes events to Azure Blob Storage or Data Lake Storage in Avro format — zero-code archival for batch analytics:
```bash
# Enable Capture: flush every 5 minutes (300 s) or every 300 MB,
# whichever comes first
az eventhubs eventhub update \
  --resource-group myapp-rg \
  --namespace-name myapp-events \
  --name telemetry \
  --enable-capture true \
  --capture-destination blob \
  --storage-account myappstorage \
  --blob-container event-capture \
  --capture-interval 300 \
  --capture-size-limit 314572800
```
Captured data can be queried with Azure Synapse, Databricks, or Data Lake Analytics.
Event Hubs Tiers
| Tier | Throughput Units | Partitions | Retention | Key Features |
|---|---|---|---|---|
| Basic | 1–20 TUs | 32 max | 1 day | Low cost, limited features |
| Standard | 1–40 TUs | 32 max | 1–7 days | Consumer groups, Capture, Kafka |
| Premium | Processing Units (PUs) | 100 max | Up to 90 days | Dedicated resources, VNet, dynamic partitions |
| Dedicated | Capacity Units (CUs) | Unlimited | Up to 90 days | Single-tenant, highest throughput |
Throughput Unit (TU): 1 TU = 1 MB/s ingress + 2 MB/s egress (or 1,000 events/s ingress).
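That rule can be turned into a quick capacity estimate. A sketch (the workload numbers are made up, and real sizing should also leave headroom and respect per-partition limits):

```python
import math

def required_tus(events_per_sec: float, avg_event_kb: float, consumer_groups: int) -> int:
    """Standard-tier TU estimate: 1 MB/s or 1,000 events/s in, 2 MB/s out per TU."""
    ingress_mb = events_per_sec * avg_event_kb / 1024   # MB/s written
    egress_mb = ingress_mb * consumer_groups            # every group reads the full stream
    by_ingress = max(ingress_mb / 1.0, events_per_sec / 1000.0)
    by_egress = egress_mb / 2.0
    return math.ceil(max(by_ingress, by_egress))

# 5,000 events/s of 2 KB telemetry read by 3 consumer groups:
print(required_tus(5000, 2.0, 3))  # 15 — egress-bound, not ingress-bound
```

Note how fan-out dominates: each added consumer group multiplies egress, so read-heavy pipelines usually hit the 2 MB/s egress limit before the ingress one.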
Auto-Inflate (Standard Tier)
Auto-inflate automatically scales TUs up when traffic increases (it does not scale back down automatically):
```bash
az eventhubs namespace update \
  --resource-group myapp-rg \
  --name myapp-events \
  --enable-auto-inflate true \
  --maximum-throughput-units 20
```
Common Patterns
Section titled “Common Patterns”Fan-Out with Consumer Groups
```
telemetry Event Hub
├── Consumer Group: "realtime"  ──► Dashboard service (live metrics)
├── Consumer Group: "analytics" ──► Data pipeline (Synapse, Databricks)
└── Consumer Group: "alerts"    ──► Alerting service (threshold checks)
```
Each consumer group reads the full stream independently — no message loss.
Event Hubs + Stream Analytics
Azure Stream Analytics can process Event Hubs data in real time with SQL-like queries:
```sql
-- Detect high temperature from IoT sensors
SELECT
    DeviceId,
    AVG(Temperature) AS AvgTemp,
    System.Timestamp() AS WindowEnd
FROM telemetry TIMESTAMP BY EventEnqueuedUtcTime
GROUP BY DeviceId, TumblingWindow(minute, 5)
HAVING AVG(Temperature) > 30
```
Event Hubs as Log Pipeline
```
Application ──► Event Hubs ──► Capture ──► Blob Storage (archive)
                   │
                   └──► Azure Function ──► Log Analytics / Elastic
```
Event Hubs vs Kafka vs Kinesis
| Feature | Event Hubs | Apache Kafka | AWS Kinesis |
|---|---|---|---|
| Managed | Fully managed | Self-managed (or Confluent) | Fully managed |
| Protocol | AMQP + Kafka | Kafka | AWS proprietary |
| Partitions | Up to 100 (Premium) | Unlimited | Up to 500 shards |
| Retention | Up to 90 days | Configurable | Up to 365 days |
| Throughput | TU-based (auto-inflate) | Broker-based | Shard-based |
| Capture/archive | Built-in (Avro to Blob) | Kafka Connect | Firehose (separate service) |
| Consumer groups | Up to 20 (Standard) | Unlimited | Shared/Enhanced fan-out |
Key Takeaways
- Event Hubs is for high-throughput event streaming — telemetry, logs, real-time analytics.
- Events are distributed across partitions; use a partition key for ordering guarantees per key.
- Consumer groups allow multiple consumers to independently read the full stream.
- Checkpointing (in Blob Storage) enables resume-after-restart and load balancing.
- Capture archives events to Blob/Data Lake for batch analytics — zero code.
- Event Hubs supports the Kafka protocol — migrate from self-managed Kafka with no code changes.
- Use Event Hubs for streaming; use Service Bus for reliable message processing.