# Kinesis and Streaming
Kinesis is AWS’s platform for real-time data streaming. While SQS is for task queues and decoupling, Kinesis is for high-throughput, ordered, real-time data — log pipelines, clickstreams, IoT telemetry, and event processing.
## Kinesis Services

| Service | What It Does | Comparable To |
|---|---|---|
| Kinesis Data Streams | Ingest and process real-time data with custom consumers | Apache Kafka |
| Kinesis Data Firehose | Deliver streaming data to S3, Redshift, OpenSearch, etc. (zero code) | Kafka Connect, Fluentd |
| Kinesis Data Analytics | Run SQL or Apache Flink on streaming data | Kafka Streams, Apache Flink |
## Kinesis Data Streams

### How It Works
Section titled “How It Works”Producers ──put──► ┌─────────────────────────────┐ ──get──► Consumers(apps, agents, │ Kinesis Data Stream │ (Lambda, apps, SDK, Firehose) │ ┌──────┐ ┌──────┐ ┌──────┐ │ KDA, Firehose) │ │Shard1│ │Shard2│ │Shard3│ │ │ └──────┘ └──────┘ └──────┘ │ └─────────────────────────────┘- Producers send records to the stream. Each record has a partition key and a data blob.
- The partition key determines which shard receives the record (hash-based).
- Consumers read records from shards in order.
- Records are retained for 24 hours (default) up to 365 days.
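Kinesis routes a record by taking the MD5 hash of its partition key and finding the shard whose hash-key range contains the result. Assuming evenly split ranges, the bucketing reduces to the sketch below (illustrative only, not the AWS implementation):

```python
import hashlib

def shard_for_key(partition_key: str, shard_count: int) -> int:
    # MD5 of the partition key is a 128-bit integer; with evenly split
    # hash-key ranges, routing is simple proportional bucketing.
    hash_value = int(hashlib.md5(partition_key.encode()).hexdigest(), 16)
    return min(hash_value * shard_count // 2**128, shard_count - 1)

# Same key, same shard: that is what makes per-key ordering possible.
print(shard_for_key('user_123', 3) == shard_for_key('user_123', 3))  # True
```

This is also why a low-cardinality partition key (e.g. a constant) is a problem: every record hashes to the same shard and the other shards sit idle.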
### Key Concepts

| Concept | What It Is |
|---|---|
| Stream | A collection of shards — the top-level resource |
| Shard | An ordered sequence of records. Each shard provides 1 MB/s write, 2 MB/s read. |
| Record | A data blob (up to 1 MB) + partition key + sequence number |
| Partition key | Determines shard assignment. Same key → same shard → ordered. |
| Sequence number | Assigned by Kinesis — unique, increasing per shard |
| Consumer | An application reading from one or more shards |
### Capacity

| Per Shard | Limit |
|---|---|
| Write | 1,000 records/s or 1 MB/s |
| Read | 5 reads/s, 2 MB/s (shared) or 2 MB/s per consumer (enhanced) |
Need more throughput? Add more shards. A 10-shard stream handles 10,000 records/s writes.
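Those per-shard limits make capacity planning a short calculation. A rough sketch (the function name and inputs are mine, not an AWS API):

```python
import math

def shards_needed(records_per_s: int, avg_record_kb: float,
                  consumers: int = 1, enhanced_fan_out: bool = False) -> int:
    """Estimate shard count from the per-shard limits above."""
    write_mb_s = records_per_s * avg_record_kb / 1024
    # Standard consumers share 2 MB/s per shard; enhanced fan-out gives
    # each consumer its own 2 MB/s, so only one "copy" of the data counts.
    read_mb_s = write_mb_s * (1 if enhanced_fan_out else consumers)
    return max(
        math.ceil(records_per_s / 1000),  # 1,000 records/s in per shard
        math.ceil(write_mb_s / 1),        # 1 MB/s in per shard
        math.ceil(read_mb_s / 2),         # 2 MB/s out per shard
    )

print(shards_needed(10_000, avg_record_kb=1))  # → 10
```

Note that adding standard consumers can force more shards even when the write side has headroom, which is exactly the case enhanced fan-out exists for.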
### Creating a Stream

```bash
# Create a stream with 3 shards
aws kinesis create-stream --stream-name clickstream --shard-count 3

# Or use on-demand mode (auto-scales)
aws kinesis create-stream --stream-name clickstream \
  --stream-mode-details StreamMode=ON_DEMAND
```

### Producing Records
```python
import boto3, json

kinesis = boto3.client('kinesis')

kinesis.put_record(
    StreamName='clickstream',
    Data=json.dumps({
        'user_id': 'user_123',
        'event': 'page_view',
        'url': '/products/widget',
        'timestamp': '2026-02-16T10:30:00Z'
    }),
    PartitionKey='user_123'  # same user → same shard → ordered
)
```
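For higher-throughput producers, batching with PutRecords cuts per-request overhead. A sketch under the same stream name as above (the helper names are mine; note PutRecords is not all-or-nothing, so failed entries must be retried):

```python
import json

def build_batch(events):
    # One entry per event; PutRecords accepts up to 500 entries / 5 MB per call.
    return [{'Data': json.dumps(e), 'PartitionKey': e['user_id']} for e in events]

def send_batch(events, stream_name='clickstream'):
    import boto3  # deferred so build_batch stays testable without AWS
    kinesis = boto3.client('kinesis')
    resp = kinesis.put_records(StreamName=stream_name, Records=build_batch(events))
    # A non-zero FailedRecordCount means some entries were throttled or
    # rejected; retry only the entries that carry an ErrorCode.
    return [e for e, r in zip(events, resp['Records']) if 'ErrorCode' in r]

# send_batch([{'user_id': 'user_123', 'event': 'page_view'}])  # needs AWS credentials
```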
### Consuming Records

With Lambda (simplest):
```bash
aws lambda create-event-source-mapping \
  --function-name process-clickstream \
  --event-source-arn arn:aws:kinesis:us-east-1:123456789012:stream/clickstream \
  --starting-position LATEST \
  --batch-size 100
```

Lambda receives batches of records and processes them:
```python
import base64, json

def handler(event, context):
    for record in event['Records']:
        data = json.loads(base64.b64decode(record['kinesis']['data']))
        # process the clickstream event
        print(f"User {data['user_id']} viewed {data['url']}")
```

With the Kinesis Client Library (KCL):
KCL is a Java library (with wrappers for Python and other languages) that handles shard assignment, checkpointing, and load balancing across multiple consumer instances. Use it for long-running consumer applications.
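For quick inspection or debugging outside Lambda and the KCL, the raw SDK shard-iterator API also works. A minimal sketch (single shard, no checkpointing, so not suitable for production consumers):

```python
import time

def tail_shard(stream_name, shard_id, poll_interval=1.0):
    """Print new records from one shard, forever (debugging sketch)."""
    import boto3  # deferred import; calling this needs AWS credentials
    kinesis = boto3.client('kinesis')
    iterator = kinesis.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType='LATEST',  # start at the tip of the shard
    )['ShardIterator']
    while iterator:
        resp = kinesis.get_records(ShardIterator=iterator, Limit=100)
        for record in resp['Records']:
            print(record['SequenceNumber'], record['Data'])
        iterator = resp['NextShardIterator']
        time.sleep(poll_interval)  # stay under the 5 reads/s per-shard limit

# tail_shard('clickstream', 'shardId-000000000000')
```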
### Enhanced Fan-Out

By default, all consumers share a shard’s 2 MB/s read throughput. Enhanced fan-out gives each consumer a dedicated 2 MB/s pipe via HTTP/2 push:
| | Standard | Enhanced Fan-Out |
|---|---|---|
| Throughput | 2 MB/s shared across consumers | 2 MB/s per consumer |
| Latency | ~200ms | ~70ms |
| Delivery | Pull (consumers poll) | Push (HTTP/2) |
| Cost | Included | Additional per-shard per-consumer fee |
## Kinesis Data Firehose

Firehose is the zero-code option — it delivers streaming data to a destination with optional transformation. No shard management, no consumers to write.
### How It Works

```
Producers ──► Firehose Delivery Stream ──► Destination
                        │
                        ├── Buffer (size or time)
                        ├── Transform (Lambda, optional)
                        └── Convert format (Parquet/ORC, optional)
```

### Supported Destinations
| Destination | Use Case |
|---|---|
| S3 | Data lake, log archive, backup |
| Redshift | Data warehouse (loads via S3) |
| OpenSearch | Log analytics, full-text search |
| Splunk | Security analytics |
| HTTP endpoint | Any custom API |
| Datadog / New Relic | Third-party monitoring |
### Creating a Firehose Delivery Stream

```bash
aws firehose create-delivery-stream \
  --delivery-stream-name logs-to-s3 \
  --s3-destination-configuration '{
    "RoleARN": "arn:aws:iam::123456789012:role/FirehoseS3Role",
    "BucketARN": "arn:aws:s3:::my-log-bucket",
    "Prefix": "logs/year=!{timestamp:yyyy}/month=!{timestamp:MM}/day=!{timestamp:dd}/",
    "BufferingHints": { "SizeInMBs": 128, "IntervalInSeconds": 300 },
    "CompressionFormat": "GZIP"
  }'
```
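Producers write to a delivery stream with a single call; there are no shards or partition keys on the producer side. A sketch (the helper names are mine; newline-delimiting each JSON object is a common convention so Athena can split the records Firehose concatenates into one S3 object):

```python
import json

def encode_record(event: dict) -> dict:
    # Firehose concatenates records inside each S3 object, so a trailing
    # newline keeps the output readable as JSON-lines by Athena or jq.
    return {'Data': (json.dumps(event) + '\n').encode()}

def send_log(event, stream_name='logs-to-s3'):
    import boto3  # deferred so encode_record stays testable without AWS
    firehose = boto3.client('firehose')
    return firehose.put_record(DeliveryStreamName=stream_name,
                               Record=encode_record(event))

# send_log({'level': 'info', 'msg': 'checkout completed'})  # needs AWS credentials
```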
### Firehose Buffering

Firehose buffers data before writing to the destination:
| Setting | Range | Default |
|---|---|---|
| Buffer size | 1–128 MB | 5 MB |
| Buffer interval | 60–900 seconds | 300 seconds (5 min) |
Whichever threshold is hit first triggers a delivery. For near-real-time, set low values (1 MB / 60 seconds).
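The interaction between the two thresholds is easy to reason about: delivery fires at whichever is reached first. A toy calculation (function name is mine):

```python
def seconds_until_delivery(buffer_mb: float, interval_s: float,
                           ingest_mb_s: float) -> float:
    # Delivery triggers on whichever threshold is hit first:
    # the buffer filling up, or the timer expiring.
    return min(buffer_mb / ingest_mb_s, interval_s)

# Defaults (5 MB / 300 s) at a slow 0.01 MB/s trickle: the timer wins.
print(seconds_until_delivery(5, 300, 0.01))  # → 300
# Near-real-time settings (1 MB / 60 s) at 1 MB/s: the buffer wins.
print(seconds_until_delivery(1, 60, 1.0))    # → 1.0
```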
### Data Transformation

Attach a Lambda function to transform records before delivery:
```python
import base64, json
from datetime import datetime

def handler(event, context):
    output = []
    for record in event['records']:
        data = json.loads(base64.b64decode(record['data']))

        # Add a field, filter, or reformat
        data['processed_at'] = datetime.utcnow().isoformat()

        output.append({
            'recordId': record['recordId'],
            'result': 'Ok',
            'data': base64.b64encode(json.dumps(data).encode()).decode()
        })
    return {'records': output}
```
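The same handler shape also filters: returning a result of 'Dropped' removes a record from delivery without counting it as a failure ('ProcessingFailed' is the error status). A sketch, where the 'heartbeat' event type is a made-up example:

```python
import base64, json

def handler(event, context):
    output = []
    for record in event['records']:
        data = json.loads(base64.b64decode(record['data']))
        if data.get('event') == 'heartbeat':
            # 'Dropped' silently excludes the record from delivery
            output.append({'recordId': record['recordId'], 'result': 'Dropped'})
            continue
        output.append({
            'recordId': record['recordId'],
            'result': 'Ok',
            'data': record['data'],  # pass through unchanged
        })
    return {'records': output}
```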
### Format Conversion

Firehose can convert JSON to Apache Parquet or ORC before writing to S3 — significantly smaller files and faster queries in Athena/Redshift:

```
JSON (1 GB) ──► Firehose ──► Parquet (~100 MB) ──► S3 ──► Athena (fast queries)
```

## Kinesis vs SQS
| Feature | Kinesis Data Streams | SQS |
|---|---|---|
| Model | Stream (ordered, replayable) | Queue (one-time processing) |
| Ordering | Per-shard (guaranteed) | Best-effort (standard) or FIFO |
| Retention | 24 hours – 365 days | 1 minute – 14 days (default 4 days) |
| Replay | Yes (re-read from any point) | No (deleted after processing) |
| Multiple consumers | Yes (each reads independently) | No (one consumer per message) |
| Throughput | Very high (add shards) | Virtually unlimited |
| Latency | ~200ms (standard), ~70ms (enhanced) | tens of milliseconds |
| Cost model | Per shard-hour + per GB | Per request |
| Best for | Real-time analytics, log pipelines, event sourcing | Task queues, decoupling, buffering |
## When to Use Which

| Scenario | Choose |
|---|---|
| Process each message once, then delete | SQS |
| Multiple consumers need the same data | Kinesis |
| Need to replay old messages | Kinesis |
| Real-time analytics / dashboards | Kinesis |
| Simple task queue / work distribution | SQS |
| Log aggregation → S3 (no code) | Firehose |
## Common Patterns

### Real-Time Log Pipeline

```
App logs ──► Kinesis Data Stream ──► Lambda (enrich) ──► Firehose ──► S3 (Parquet)
                                                                          │
                                                                    Athena (query)
```

### Clickstream Analytics
```
Web app ──► API Gateway ──► Kinesis ──► Lambda (aggregate) ──► DynamoDB (real-time)
                               └──► Firehose ──► S3 ──► Redshift (batch analytics)
```

### IoT Data Ingestion
```
IoT devices ──► IoT Core ──► Kinesis ──► Lambda (anomaly detection) ──► SNS (alert)
                                └──► Firehose ──► S3 (historical data)
```

## Key Takeaways
- Kinesis Data Streams is for high-throughput, ordered, real-time data. Scale by adding shards. Use Lambda or KCL as consumers.
- Kinesis Data Firehose is the zero-code option — delivers data to S3, Redshift, or OpenSearch with buffering, transformation, and format conversion.
- Use Kinesis when you need ordering, multiple consumers, or replay. Use SQS for simple task queues.
- Partition keys determine shard assignment — use a high-cardinality key (user ID, device ID) for even distribution.
- Firehose + Parquet conversion is the cheapest way to build a queryable data lake on S3.