# Kinesis and Streaming
Kinesis is AWS’s platform for real-time data streaming. While SQS is for task queues and decoupling, Kinesis is for high-throughput, ordered, real-time data — log pipelines, clickstreams, IoT telemetry, and event processing.
## Kinesis Services

| Service | What It Does | Comparable To |
|---|---|---|
| Kinesis Data Streams | Ingest and process real-time data with custom consumers | Apache Kafka |
| Kinesis Data Firehose | Deliver streaming data to S3, Redshift, OpenSearch, etc. (zero code) | Kafka Connect, Fluentd |
| Kinesis Data Analytics | Run SQL or Apache Flink on streaming data | Kafka Streams, Apache Flink |
## Kinesis Data Streams

### How It Works
Section titled “How It Works”Producers ──put──► ┌─────────────────────────────┐ ──get──► Consumers(apps, agents, │ Kinesis Data Stream │ (Lambda, apps, SDK, Firehose) │ ┌──────┐ ┌──────┐ ┌──────┐ │ KDA, Firehose) │ │Shard1│ │Shard2│ │Shard3│ │ │ └──────┘ └──────┘ └──────┘ │ └─────────────────────────────┘- Producers send records to the stream. Each record has a partition key and a data blob.
- The partition key determines which shard receives the record (hash-based).
- Consumers read records from shards in order.
- Records are retained for 24 hours (default) up to 365 days.
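Kinesis routes a record by taking the MD5 hash of its partition key and finding the shard whose hash-key range contains the result. Assuming evenly split ranges, the bucketing reduces to the sketch below (illustrative only, not the AWS implementation):

```python
import hashlib

def shard_for_key(partition_key: str, shard_count: int) -> int:
    # MD5 of the partition key is a 128-bit integer; with evenly split
    # hash-key ranges, routing is simple proportional bucketing.
    hash_value = int(hashlib.md5(partition_key.encode()).hexdigest(), 16)
    return min(hash_value * shard_count // 2**128, shard_count - 1)

# Same key, same shard: that is what makes per-key ordering possible.
print(shard_for_key('user_123', 3) == shard_for_key('user_123', 3))  # True
```

This is also why a low-cardinality partition key (e.g. a constant) is a problem: every record hashes to the same shard and the other shards sit idle.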
### Key Concepts

| Concept | What It Is |
|---|---|
| Stream | A collection of shards — the top-level resource |
| Shard | An ordered sequence of records. Each shard provides 1 MB/s write, 2 MB/s read. |
| Record | A data blob (up to 1 MB) + partition key + sequence number |
| Partition key | Determines shard assignment. Same key → same shard → ordered. |
| Sequence number | Assigned by Kinesis — unique, increasing per shard |
| Consumer | An application reading from one or more shards |
### Capacity

| Per Shard | Limit |
|---|---|
| Write | 1,000 records/s or 1 MB/s |
| Read | 5 reads/s, 2 MB/s (shared) or 2 MB/s per consumer (enhanced) |
Need more throughput? Add more shards. A 10-shard stream handles 10,000 records/s writes.
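Those per-shard limits make capacity planning a short calculation. A rough sketch (the function name and inputs are mine, not an AWS API):

```python
import math

def shards_needed(records_per_s: int, avg_record_kb: float,
                  consumers: int = 1, enhanced_fan_out: bool = False) -> int:
    """Estimate shard count from the per-shard limits above."""
    write_mb_s = records_per_s * avg_record_kb / 1024
    # Standard consumers share 2 MB/s per shard; enhanced fan-out gives
    # each consumer its own 2 MB/s, so only one "copy" of the data counts.
    read_mb_s = write_mb_s * (1 if enhanced_fan_out else consumers)
    return max(
        math.ceil(records_per_s / 1000),  # 1,000 records/s in per shard
        math.ceil(write_mb_s / 1),        # 1 MB/s in per shard
        math.ceil(read_mb_s / 2),         # 2 MB/s out per shard
    )

print(shards_needed(10_000, avg_record_kb=1))  # → 10
```

Note that adding standard consumers can force more shards even when the write side has headroom, which is exactly the case enhanced fan-out exists for.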
### Creating a Stream

```bash
# Create a stream with 3 shards
aws kinesis create-stream --stream-name clickstream --shard-count 3

# Or use on-demand mode (auto-scales)
aws kinesis create-stream --stream-name clickstream \
  --stream-mode-details StreamMode=ON_DEMAND
```

### Producing Records
```python
import boto3, json

kinesis = boto3.client('kinesis')

kinesis.put_record(
    StreamName='clickstream',
    Data=json.dumps({
        'user_id': 'user_123',
        'event': 'page_view',
        'url': '/products/widget',
        'timestamp': '2026-02-16T10:30:00Z'
    }),
    PartitionKey='user_123'  # same user → same shard → ordered
)
```
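For higher-throughput producers, batching with PutRecords cuts per-request overhead. A sketch under the same stream name as above (the helper names are mine; note PutRecords is not all-or-nothing, so failed entries must be retried):

```python
import json

def build_batch(events):
    # One entry per event; PutRecords accepts up to 500 entries / 5 MB per call.
    return [{'Data': json.dumps(e), 'PartitionKey': e['user_id']} for e in events]

def send_batch(events, stream_name='clickstream'):
    import boto3  # deferred so build_batch stays testable without AWS
    kinesis = boto3.client('kinesis')
    resp = kinesis.put_records(StreamName=stream_name, Records=build_batch(events))
    # A non-zero FailedRecordCount means some entries were throttled or
    # rejected; retry only the entries that carry an ErrorCode.
    return [e for e, r in zip(events, resp['Records']) if 'ErrorCode' in r]

# send_batch([{'user_id': 'user_123', 'event': 'page_view'}])  # needs AWS credentials
```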
### Consuming Records

With Lambda (simplest):
```bash
aws lambda create-event-source-mapping \
  --function-name process-clickstream \
  --event-source-arn arn:aws:kinesis:us-east-1:123456789012:stream/clickstream \
  --starting-position LATEST \
  --batch-size 100
```

Lambda receives batches of records and processes them:
```python
import base64, json

def handler(event, context):
    for record in event['Records']:
        data = json.loads(base64.b64decode(record['kinesis']['data']))
        # process the clickstream event
        print(f"User {data['user_id']} viewed {data['url']}")
```

With the Kinesis Client Library (KCL):
KCL is a Java library (with wrappers for Python and other languages) that handles shard assignment, checkpointing, and load balancing across multiple consumer instances. Use it for long-running consumer applications.
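For quick inspection or debugging outside Lambda and the KCL, the raw SDK shard-iterator API also works. A minimal sketch (single shard, no checkpointing, so not suitable for production consumers):

```python
import time

def tail_shard(stream_name, shard_id, poll_interval=1.0):
    """Print new records from one shard, forever (debugging sketch)."""
    import boto3  # deferred import; calling this needs AWS credentials
    kinesis = boto3.client('kinesis')
    iterator = kinesis.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType='LATEST',  # start at the tip of the shard
    )['ShardIterator']
    while iterator:
        resp = kinesis.get_records(ShardIterator=iterator, Limit=100)
        for record in resp['Records']:
            print(record['SequenceNumber'], record['Data'])
        iterator = resp['NextShardIterator']
        time.sleep(poll_interval)  # stay under the 5 reads/s per-shard limit

# tail_shard('clickstream', 'shardId-000000000000')
```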
### Enhanced Fan-Out

By default, all consumers share a shard’s 2 MB/s read throughput. Enhanced fan-out gives each consumer a dedicated 2 MB/s pipe via HTTP/2 push:
| | Standard | Enhanced Fan-Out |
|---|---|---|
| Throughput | 2 MB/s shared across consumers | 2 MB/s per consumer |
| Latency | ~200ms | ~70ms |
| Delivery | Pull (consumers poll) | Push (HTTP/2) |
| Cost | Included | Additional per-shard per-consumer fee |
## Kinesis Data Firehose

Firehose is the zero-code option — it delivers streaming data to a destination with optional transformation. No shard management, no consumers to write.
### How It Works

```
Producers ──► Firehose Delivery Stream ──► Destination
                        │
                        ├── Buffer (size or time)
                        ├── Transform (Lambda, optional)
                        └── Convert format (Parquet/ORC, optional)
```

### Supported Destinations
| Destination | Use Case |
|---|---|
| S3 | Data lake, log archive, backup |
| Redshift | Data warehouse (loads via S3) |
| OpenSearch | Log analytics, full-text search |
| Splunk | Security analytics |
| HTTP endpoint | Any custom API |
| Datadog / New Relic | Third-party monitoring |
### Creating a Firehose Delivery Stream

```bash
aws firehose create-delivery-stream \
  --delivery-stream-name logs-to-s3 \
  --s3-destination-configuration '{
    "RoleARN": "arn:aws:iam::123456789012:role/FirehoseS3Role",
    "BucketARN": "arn:aws:s3:::my-log-bucket",
    "Prefix": "logs/year=!{timestamp:yyyy}/month=!{timestamp:MM}/day=!{timestamp:dd}/",
    "BufferingHints": { "SizeInMBs": 128, "IntervalInSeconds": 300 },
    "CompressionFormat": "GZIP"
  }'
```
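Producers write to a delivery stream with a single call; there are no shards or partition keys on the producer side. A sketch (the helper names are mine; newline-delimiting each JSON object is a common convention so Athena can split the records Firehose concatenates into one S3 object):

```python
import json

def encode_record(event: dict) -> dict:
    # Firehose concatenates records inside each S3 object, so a trailing
    # newline keeps the output readable as JSON-lines by Athena or jq.
    return {'Data': (json.dumps(event) + '\n').encode()}

def send_log(event, stream_name='logs-to-s3'):
    import boto3  # deferred so encode_record stays testable without AWS
    firehose = boto3.client('firehose')
    return firehose.put_record(DeliveryStreamName=stream_name,
                               Record=encode_record(event))

# send_log({'level': 'info', 'msg': 'checkout completed'})  # needs AWS credentials
```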
### Firehose Buffering

Firehose buffers data before writing to the destination:
| Setting | Range | Default |
|---|---|---|
| Buffer size | 1–128 MB | 5 MB |
| Buffer interval | 60–900 seconds | 300 seconds (5 min) |
Whichever threshold is hit first triggers a delivery. For near-real-time, set low values (1 MB / 60 seconds).
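The interaction between the two thresholds is easy to reason about: delivery fires at whichever is reached first. A toy calculation (function name is mine):

```python
def seconds_until_delivery(buffer_mb: float, interval_s: float,
                           ingest_mb_s: float) -> float:
    # Delivery triggers on whichever threshold is hit first:
    # the buffer filling up, or the timer expiring.
    return min(buffer_mb / ingest_mb_s, interval_s)

# Defaults (5 MB / 300 s) at a slow 0.01 MB/s trickle: the timer wins.
print(seconds_until_delivery(5, 300, 0.01))  # → 300
# Near-real-time settings (1 MB / 60 s) at 1 MB/s: the buffer wins.
print(seconds_until_delivery(1, 60, 1.0))    # → 1.0
```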
### Data Transformation

Attach a Lambda function to transform records before delivery:
```python
import base64, json
from datetime import datetime

def handler(event, context):
    output = []
    for record in event['records']:
        data = json.loads(base64.b64decode(record['data']))

        # Add a field, filter, or reformat
        data['processed_at'] = datetime.utcnow().isoformat()

        output.append({
            'recordId': record['recordId'],
            'result': 'Ok',
            'data': base64.b64encode(json.dumps(data).encode()).decode()
        })
    return {'records': output}
```
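The same handler shape also filters: returning a result of 'Dropped' removes a record from delivery without counting it as a failure ('ProcessingFailed' is the error status). A sketch, where the 'heartbeat' event type is a made-up example:

```python
import base64, json

def handler(event, context):
    output = []
    for record in event['records']:
        data = json.loads(base64.b64decode(record['data']))
        if data.get('event') == 'heartbeat':
            # 'Dropped' silently excludes the record from delivery
            output.append({'recordId': record['recordId'], 'result': 'Dropped'})
            continue
        output.append({
            'recordId': record['recordId'],
            'result': 'Ok',
            'data': record['data'],  # pass through unchanged
        })
    return {'records': output}
```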
### Format Conversion

Firehose can convert JSON to Apache Parquet or ORC before writing to S3 — significantly smaller files and faster queries in Athena/Redshift:

```
JSON (1 GB) ──► Firehose ──► Parquet (~100 MB) ──► S3 ──► Athena (fast queries)
```

## Kinesis vs SQS
| Feature | Kinesis Data Streams | SQS |
|---|---|---|
| Model | Stream (ordered, replayable) | Queue (one-time processing) |
| Ordering | Per-shard (guaranteed) | Best-effort (standard) or FIFO |
| Retention | 24 hours – 365 days | 1 minute – 14 days (default 4 days) |
| Replay | Yes (re-read from any point) | No (deleted after processing) |
| Multiple consumers | Yes (each reads independently) | No (one consumer per message) |
| Throughput | Very high (add shards) | Virtually unlimited |
| Latency | ~200ms (standard), ~70ms (enhanced) | tens of milliseconds |
| Cost model | Per shard-hour + per GB | Per request |
| Best for | Real-time analytics, log pipelines, event sourcing | Task queues, decoupling, buffering |
## When to Use Which

| Scenario | Choose |
|---|---|
| Process each message once, then delete | SQS |
| Multiple consumers need the same data | Kinesis |
| Need to replay old messages | Kinesis |
| Real-time analytics / dashboards | Kinesis |
| Simple task queue / work distribution | SQS |
| Log aggregation → S3 (no code) | Firehose |
## Common Patterns

### Real-Time Log Pipeline

```
App logs ──► Kinesis Data Stream ──► Lambda (enrich) ──► Firehose ──► S3 (Parquet)
                                                                          │
                                                                    Athena (query)
```

### Clickstream Analytics
```
Web app ──► API Gateway ──► Kinesis ──► Lambda (aggregate) ──► DynamoDB (real-time)
                               └──► Firehose ──► S3 ──► Redshift (batch analytics)
```

### IoT Data Ingestion
```
IoT devices ──► IoT Core ──► Kinesis ──► Lambda (anomaly detection) ──► SNS (alert)
                                └──► Firehose ──► S3 (historical data)
```

## Key Takeaways
- Kinesis Data Streams is for high-throughput, ordered, real-time data. Scale by adding shards. Use Lambda or KCL as consumers.
- Kinesis Data Firehose is the zero-code option — delivers data to S3, Redshift, or OpenSearch with buffering, transformation, and format conversion.
- Use Kinesis when you need ordering, multiple consumers, or replay. Use SQS for simple task queues.
- Partition keys determine shard assignment — use a high-cardinality key (user ID, device ID) for even distribution.
- Firehose + Parquet conversion is the cheapest way to build a queryable data lake on S3.