VPC Flow Logs and Network RCA

By Atif Alam

VPC Flow Logs capture metadata about IP traffic going to and from network interfaces in your VPC — not full packet payloads.

They are the first place many teams look for network-oriented root cause analysis (RCA): unexpected denies, asymmetric paths, and “was there any traffic at all?” For packet-level detail on a host, use tcpdump / Wireshark after you narrow the window with flow logs.

Basics of VPCs, subnets, and security groups live in Networking. Cross-VPC and hybrid paths are in VPC Connectivity. GuardDuty can consume flow logs for threat detection; see Security Services.

Each record (format varies by version) typically includes:

  • Account, VPC, subnet, interface identifiers
  • Source and destination IP and port
  • Protocol (TCP, UDP, ICMP, …)
  • Packets and bytes
  • Action — AWS documents ACCEPT (permitted) and REJECT (not permitted, e.g. NACL or security group deny)
  • Log status — e.g. OK, NODATA, SKIPDATA

Flow logs do not include:

  • Application payload or HTTP URLs
  • Every possible reason for a drop (you infer from SG/NACL and routing)
  • Guaranteed ordering across all interfaces (use timestamps and correlation)

Custom format and fields are documented in VPC Flow Logs.
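As a rough illustration of the record layout, the sketch below parses one space-delimited line in the default (version 2) field order. It assumes the default format only; a custom format needs its own field list. The function name `parse_flow_record` is hypothetical, and the sample lines are illustrative values in the style of the AWS documentation.

```python
# Field order of the default (version 2) flow log format.
DEFAULT_FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]
INT_FIELDS = {"version", "srcport", "dstport", "protocol",
              "packets", "bytes", "start", "end"}


def parse_flow_record(line):
    """Parse one default-format flow log line into a dict.

    Fields logged as "-" (for example in NODATA records) become None.
    """
    parts = line.split()
    if len(parts) != len(DEFAULT_FIELDS):
        raise ValueError(
            f"expected {len(DEFAULT_FIELDS)} fields, got {len(parts)}"
        )
    record = {}
    for name, raw in zip(DEFAULT_FIELDS, parts):
        if raw == "-":
            record[name] = None
        elif name in INT_FIELDS:
            record[name] = int(raw)
        else:
            record[name] = raw
    return record


sample = ("2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
          "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK")
rec = parse_flow_record(sample)
print(rec["srcaddr"], rec["dstaddr"], rec["dstport"], rec["action"])
```

Converting "-" to None keeps NODATA and SKIPDATA records from being mistaken for real traffic in later aggregation.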

Destination | Good for
CloudWatch Logs | Logs Insights queries, metric filters, subscriptions to Lambda/OpenSearch; better suited to shorter retention because storage costs more than S3.
S3 | Cheap long-term storage, Athena SQL, integration with data lakes; higher latency for ad hoc queries.
Kinesis Data Firehose | Streaming to Splunk, S3, Redshift, etc.

Enable flow logs on a VPC, subnet, or ENI (Elastic Network Interface—the virtual NIC on an instance or other VPC attachment) depending on how narrow you want the scope. VPC-wide is common; ENI-level is for deep dives on one instance.

Example (conceptual — replace IDs):

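# Log group must exist; IAM role must allow logs:CreateLogStream / PutLogEvents.
# CloudWatch Logs delivery requires --deliver-logs-permission-arn; the role ARN below is a placeholder.
aws ec2 create-flow-logs \
  --resource-type VPC \
  --resource-ids vpc-0abc123 \
  --traffic-type ALL \
  --log-destination-type cloud-watch-logs \
  --log-group-name /aws/vpc/flowlogs/prod \
  --deliver-logs-permission-arn arn:aws:iam::123456789012:role/flow-logs-role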

Use ALL, ACCEPT, or REJECT for traffic type. REJECT-only logging reduces volume when you only care about denies.

1. “Connection refused” vs “silent timeout”

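  • REJECT on the relevant TCP port means a security group or NACL dropped the traffic. The client usually sees a silent timeout, because the dropped SYN gets no reply.
  • ACCEPT combined with “connection refused” on the client usually means the SYN reached the host but nothing was listening, so the host answered with a RST. The default format does not record TCP flags, so confirm with a capture on the instance.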
  • No flow log line for expected traffic may mean wrong subnet scope, logging not on that ENI, or traffic never reached the VPC (client-side, internet path, or peer VPC not logged).

Correlate with security groups and NACLs: security groups are stateful; NACLs are stateless and evaluated in rule-number order. A common mistake is allowing the inbound request in the NACL but not the return traffic, which leaves on ephemeral ports and needs its own rule in the opposite direction.
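The stateless behaviour can be sketched with a toy model of NACL evaluation. This is a simplification under stated assumptions: it matches only on protocol and destination port (real rules also match CIDR ranges), and the names `NaclRule` and `evaluate_nacl` are hypothetical, not an AWS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NaclRule:
    number: int      # lower numbers are evaluated first
    protocol: int    # IANA protocol number, or -1 for all protocols
    port_from: int
    port_to: int
    allow: bool


def evaluate_nacl(rules, protocol, dst_port):
    """Return True if the packet is allowed.

    Rules are checked in ascending rule-number order, the first match
    wins, and a packet that matches nothing is denied.
    """
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.protocol in (-1, protocol) and rule.port_from <= dst_port <= rule.port_to:
            return rule.allow
    return False


# Server subnet: inbound HTTPS is allowed.
inbound = [NaclRule(100, 6, 443, 443, True)]
# Mistake: the outbound side mirrors port 443 instead of the ephemeral range.
outbound_broken = [NaclRule(100, 6, 443, 443, True)]
# Fix: allow replies to client ephemeral ports.
outbound_fixed = [NaclRule(100, 6, 1024, 65535, True)]

client_port = 51514  # hypothetical ephemeral port chosen by the client
print(evaluate_nacl(inbound, 6, 443))                  # request reaches the server
print(evaluate_nacl(outbound_broken, 6, client_port))  # reply is dropped
print(evaluate_nacl(outbound_fixed, 6, client_port))   # reply is allowed
```

In flow logs this mistake typically shows up as ACCEPT for the inbound request and REJECT for the reply whose destination port is the client's ephemeral port.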

Asymmetric routing: traffic enters by one path and returns by another, so firewalls or NAT devices may see only half the flow. Symptoms include intermittent failures and partial TCP setup. Flow logs combined with route tables and Transit Gateway / peering diagrams help. Compare source/destination pairs and byte counts in both directions.
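One way to compare both directions is to group records into conversations and flag those with bytes in only one direction. The sketch below assumes records are dicts with default-format field names (`srcaddr`, `dstaddr`, `srcport`, `dstport`, `protocol`, `bytes`) and that the input comes from interfaces expected to see both directions; `find_one_sided_flows` is a hypothetical helper name.

```python
from collections import defaultdict


def find_one_sided_flows(records):
    """Return conversation keys that carried bytes in only one direction."""
    totals = defaultdict(lambda: [0, 0])  # key -> [bytes a->b, bytes b->a]
    for r in records:
        a = (r["srcaddr"], r["srcport"])
        b = (r["dstaddr"], r["dstport"])
        # Order the endpoints so both directions share one key.
        if a <= b:
            key, direction = (a, b, r["protocol"]), 0
        else:
            key, direction = (b, a, r["protocol"]), 1
        totals[key][direction] += r["bytes"]
    return [key for key, (fwd, rev) in totals.items() if fwd == 0 or rev == 0]


records = [
    # Request and reply both logged.
    {"srcaddr": "10.0.1.50", "srcport": 40000, "dstaddr": "10.0.2.100",
     "dstport": 443, "protocol": 6, "bytes": 600},
    {"srcaddr": "10.0.2.100", "srcport": 443, "dstaddr": "10.0.1.50",
     "dstport": 40000, "protocol": 6, "bytes": 5200},
    # Request logged, no reply seen.
    {"srcaddr": "10.0.1.50", "srcport": 40001, "dstaddr": "10.0.3.7",
     "dstport": 5432, "protocol": 6, "bytes": 120},
]
one_sided = find_one_sided_flows(records)
print(one_sided)
```

A one-sided conversation is only a hint: legitimate one-way UDP traffic, or a reply that crosses an interface without logging, produces the same pattern.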

Large UDP payloads or certain TCP paths may black-hole when MTUs mismatch along the path. Flow logs show bytes and packets, but seeing ICMP “fragmentation needed” messages usually requires a host capture; Reachability Analyzer can help rule out a configuration-level block. Flow logs alone may be insufficient.

For peering or Transit Gateway, ensure flow logs exist on both sides of the conversation (or on TGW attachments where supported). See VPC Connectivity for topology.

The CloudWatch Logs Insights queries below assume the logs are in a log group with flow records as JSON or space-delimited text (parse accordingly). Adjust field names to your log format version.

Top rejected destination IPs and ports (set the query time range to the last hour):

fields @timestamp, srcAddr, dstAddr, dstPort, action
| filter action = "REJECT"
| stats count() as rejects by dstAddr, dstPort
| sort rejects desc
| limit 20

Traffic between two endpoints:

fields @timestamp, srcAddr, dstAddr, srcPort, dstPort, protocol, action, bytes
| filter srcAddr = "10.0.1.50" and dstAddr = "10.0.2.100"
| sort @timestamp asc

TCP REJECT on port 443:

fields @timestamp, srcAddr, dstAddr, action
| filter action = "REJECT" and dstPort = 443 and protocol = 6
| sort @timestamp desc
| limit 100

Protocol numbers: 6 = TCP, 17 = UDP, 1 = ICMP (verify against your format).
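For records exported out of CloudWatch, the first query above has a simple offline equivalent. This sketch assumes each record is a dict with default-format field names and an integer `protocol`; `top_rejects` and `PROTOCOL_NAMES` are hypothetical names, and the mapping covers only the three protocols listed above.

```python
from collections import Counter

PROTOCOL_NAMES = {1: "ICMP", 6: "TCP", 17: "UDP"}


def top_rejects(records, limit=20):
    """Count REJECT records per (destination IP, destination port, protocol)."""
    counts = Counter(
        (r["dstaddr"], r["dstport"],
         PROTOCOL_NAMES.get(r["protocol"], str(r["protocol"])))
        for r in records
        if r["action"] == "REJECT"
    )
    return counts.most_common(limit)


records = [
    {"dstaddr": "10.0.2.100", "dstport": 443, "protocol": 6, "action": "REJECT"},
    {"dstaddr": "10.0.2.100", "dstport": 443, "protocol": 6, "action": "REJECT"},
    {"dstaddr": "10.0.2.100", "dstport": 443, "protocol": 6, "action": "REJECT"},
    {"dstaddr": "10.0.2.100", "dstport": 53, "protocol": 17, "action": "REJECT"},
    {"dstaddr": "10.0.2.100", "dstport": 443, "protocol": 6, "action": "ACCEPT"},
]
for (addr, port, proto), count in top_rejects(records):
    print(addr, port, proto, count)
```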

For S3 delivery, create a table over partitioned data (e.g. by account-id, region, date) and run SQL:

-- Illustrative — match your partition columns and serde
SELECT srcaddr, dstaddr, dstport, action, SUM(bytes) AS total_bytes
FROM vpc_flow_logs
WHERE date = '2026-03-30' AND action = 'REJECT'
GROUP BY 1, 2, 3, 4
ORDER BY total_bytes DESC
LIMIT 50;

Use the Athena documentation for the correct DDL and SerDe for your format.

  1. Confirm logging scope — VPC vs subnet vs ENI; is the failing path covered?
  2. Filter REJECT (or ALL) for the time window of the incident.
  3. Identify src/dst IPs and ports — map to services and security groups.
  4. Compare with SG rules — allowed paths, self-referencing SGs, referenced CIDRs.
  5. Check NACLs if SG looks correct — especially return traffic and ephemeral ports.
  6. Check routes — NAT Gateway, IGW, TGW, peering, blackhole routes.
  7. Escalate to packet capture on an instance if you need TCP flags, TLS handshakes, or payload-adjacent evidence — see Packet capture.
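Steps 1 to 7 can be approximated as a first-pass triage over parsed records. This is a rough sketch, not a complete decision procedure: it assumes dict records with default-format field names, looks at a single source, destination, and destination port, and returns hypothetical label strings that point at the next checklist step.

```python
def triage(records, src, dst, dst_port):
    """Suggest the next checklist step for one client-to-server path."""
    forward = [r for r in records
               if r["srcaddr"] == src and r["dstaddr"] == dst
               and r["dstport"] == dst_port]
    reverse = [r for r in records
               if r["srcaddr"] == dst and r["dstaddr"] == src
               and r["srcport"] == dst_port]
    if not forward:
        # Nothing logged: scope may be wrong, or traffic never arrived.
        return "CHECK_LOGGING_SCOPE"
    if any(r["action"] == "REJECT" for r in forward):
        return "CHECK_SG_THEN_NACL"
    if not reverse:
        # Request accepted but no reply logged.
        return "CHECK_ROUTES_AND_LISTENER"
    if any(r["action"] == "REJECT" for r in reverse):
        return "CHECK_NACL_RETURN_TRAFFIC"
    return "ESCALATE_TO_PACKET_CAPTURE"


records = [
    {"srcaddr": "10.0.1.50", "srcport": 40000, "dstaddr": "10.0.2.100",
     "dstport": 443, "action": "ACCEPT"},
    {"srcaddr": "10.0.2.100", "srcport": 443, "dstaddr": "10.0.1.50",
     "dstport": 40000, "action": "REJECT"},
]
print(triage(records, "10.0.1.50", "10.0.2.100", 443))
```

The ordering mirrors the checklist: coverage first, then denies on the request, then the return path, with packet capture as the last resort.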

VPC Reachability Analyzer models whether traffic can flow between a source and destination (ENI, subnet, resource) on a given protocol/port, given current security groups, NACLs, and route tables. It answers “is this path allowed by configuration?”, not “what actually happened on the wire?”

Tool | Best for
Reachability Analyzer | Quick what-if on SG/NACL/route without capturing packets.
Flow logs | Historical ACCEPT/REJECT and volume for real traffic.
Packet capture | TCP/TLS behavior, retransmits, and payload-adjacent debugging on a host.

Use Reachability Analyzer early when flow logs are inconclusive or you suspect a single missing rule.

  • Volume scales with traffic; REJECT-only logging or a narrower scope (subnet or ENI) reduces cost where appropriate.
  • Retention in CloudWatch costs more than cold S3; align with compliance needs.
  • PII — IPs and ports can still be sensitive; restrict log access with IAM and encryption (KMS).

Need | Tool
Who talked to whom, allowed vs denied, bytes | VPC Flow Logs
SQL at scale on historical logs | S3 + Athena
Fast interactive queries | CloudWatch Logs Insights
Threat correlation | GuardDuty + flow logs
TCP handshake, TLS, payload hints | tcpdump / Wireshark on host

Flow logs answer whether traffic was seen and permitted or denied at the VPC boundary; they work best alongside networking knowledge and observability metrics for full RCA.