Networking

First Published: Feb 16, 2026 | Last Updated: May 1, 2026 | By Atif Alam

Every pod in Kubernetes gets its own IP address. Networking ties pods together and exposes them to the outside world.

The Networking Model

Kubernetes has three rules:

Every pod can communicate with every other pod (no NAT).
Nodes can communicate with all pods (and vice versa).
The IP a pod sees for itself is the same IP others see for it.

This flat network means you don’t need to map container ports to host ports or coordinate port numbers across pods. A CNI plugin implements these rules and sets up each pod’s network interface and IP. The sections below compare common CNIs; for AWS VPC CNI specifics (IP exhaustion, ENIs), see the EKS troubleshooting cheat sheet.

CNI plugins compared (Calico, Cilium, Flannel)

CNI	Datapath focus	NetworkPolicy	kube-proxy replacement	Typical fit
Flannel	Simple overlay (often VXLAN)	No built-in enforcement — pair with another policy layer if you need rules	Usually no — kube-proxy still programs Services	Small clusters, labs, “just get pod IP working”
Calico	BGP or overlay options; mature ecosystem	Yes — broad L3/L4 policy; policy-only mode can sit beside cloud CNIs (for example on EKS)	Optional — can offload some Service paths	Teams needing policy without committing to full eBPF datapath
Cilium	eBPF-heavy datapath	Yes — L3/L4 and L7 visibility where enabled	Often yes — can replace kube-proxy for Services	Platforms wanting observability, Hubble, service mesh–like features at CNI layer

Rule of thumb: Flannel is enough when you trust every workload and only need L3 connectivity. Add Calico or Cilium when you need default-deny, egress controls, or eBPF-level introspection. Pick based on ops maturity and vendor support for your distro — not feature charts alone.

Pod-to-pod traffic (cross-node, generic)

Without cloud-specific ENI details, the mental model is:

Pod A (pod netns) -> veth -> node routing
             -> underlay (VPC, DC fabric, or overlay tunnel)
             -> remote node -> veth -> Pod B (pod netns)

Overlays (VXLAN, Geneve) encapsulate pod IPs inside node-to-node packets; underlays (for example cloud VPC routing to pod CIDRs) may expose pod IPs directly on the network. Either way, Services still provide stable virtual IPs: kube-proxy or a CNI datapath programs how a ClusterIP maps to backends. For kube-proxy modes and EndpointSlices, see Services and endpoints, which links back here for the generic path above.

Services

Pods are ephemeral — they come and go. A Service provides a stable address that routes traffic to a set of pods matched by a label selector.

ClusterIP (Default)

Internal-only. Reachable from inside the cluster.

apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app          # pods with this label become backends
  ports:
    - port: 80           # port clients connect to on the Service
      targetPort: 8080   # port the container listens on

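Pods in the same namespace reach this Service at my-app:80. From any namespace, use the fully qualified name my-app.default.svc.cluster.local:80 (assuming the Service lives in the default namespace).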
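For the selector above to find anything, some pods must carry the app: my-app label and listen on port 8080. A minimal sketch of such a workload follows; the image name and replica count are placeholders, not part of the original example.

```yaml
# Hypothetical workload backing the my-app Service.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2                      # assumption: any count works
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app                # must match the Service's spec.selector
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:1.0   # placeholder image
          ports:
            - containerPort: 8080  # the Service's targetPort
```

If the pod template label and the Service selector drift apart, the Service still exists but has no backends, which shows up as empty EndpointSlices.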

EndpointSlices

EndpointSlices are the modern backing store for Service endpoints, replacing the single large Endpoints object that scaled poorly. kube-proxy (or the CNI datapath) watches slices and reprograms its rules whenever pods become Ready or Unready or restart. That churn is a common cause of brief 503 windows during rollouts, especially when clients hold connections to pods that have already gone away.

kubectl get endpointslices -n default -l kubernetes.io/service-name=my-app
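The command above returns objects shaped roughly like the sketch below. The slice name suffix, pod name, node name, and IP are made up for illustration; the field layout follows the discovery.k8s.io/v1 API.

```yaml
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: my-app-abc12                       # illustrative generated name
  namespace: default
  labels:
    kubernetes.io/service-name: my-app     # ties the slice to its Service
addressType: IPv4
ports:
  - name: ""
    protocol: TCP
    port: 8080                             # the Service's targetPort
endpoints:
  - addresses:
      - "10.244.1.17"                      # illustrative pod IP
    conditions:
      ready: true                          # flips to false when readiness fails
      serving: true
      terminating: false
    nodeName: node-1                       # illustrative
    targetRef:
      kind: Pod
      name: my-app-5d9c7b6f4-x2k8p         # illustrative
      namespace: default
```

When debugging a rollout, the conditions block is usually the first place to look: an endpoint with ready: false stays listed but should stop receiving new connections.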

See Services and endpoints for kube-proxy modes and debugging.

kube-proxy modes (datapath)

Mode	Notes
iptables	Common default; rule count grows with services and backends
ipvs	Hash-based dispatch; often better at very large service counts
eBPF / no kube-proxy	Cilium and similar program the datapath directly — rich observability

# ConfigMap name varies: kube-proxy-config on EKS, kube-proxy on kubeadm clusters
kubectl get cm kube-proxy-config -n kube-system -o yaml | grep mode
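The mode value that the grep above looks for lives in a KubeProxyConfiguration document embedded in that ConfigMap. A trimmed sketch, assuming the cluster delivers kube-proxy settings this way; all other fields are omitted and the scheduler choice is only an example.

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"        # "" falls back to the platform default, typically iptables on Linux
ipvs:
  scheduler: "rr"   # round robin; only read when mode is ipvs
```

Editing the ConfigMap does not change running kube-proxy pods by itself; they generally need a restart to pick up a new mode.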

NodePort

Exposes the Service on a static port on every node’s IP, by default from the 30000-32767 range. Useful for development or when you don’t have a cloud load balancer.

spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080    # accessible at <NodeIP>:30080

LoadBalancer

Provisions an external load balancer (on cloud providers). Traffic from the internet reaches the LB, which forwards to the service.

spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8080

Which service type should I choose?

Use service types by exposure level and protocol needs:

Need	Recommended choice	Why
Internal service-to-service traffic	ClusterIP	Stable DNS inside cluster, no external exposure
Public HTTP/HTTPS app	Ingress/Gateway + ClusterIP backend	Centralized TLS/routing, fewer public entrypoints
Public non-HTTP (TCP/UDP) service	LoadBalancer	Direct external L4 access from cloud LB
Quick local/dev access	NodePort	Simple direct testing via `<NodeIP>:nodePort`

For most new production setups:

Create app services as ClusterIP.
Expose external web traffic through Ingress (or Gateway API).
Use LoadBalancer at the edge (often for the ingress controller), not per service unless required.
Keep NodePort mainly for development, labs, or specific infrastructure constraints.

Do I need both LoadBalancer and Ingress?

Often yes, for HTTP/HTTPS workloads.

Ingress defines Layer 7 routing rules (host/path/TLS behavior).
A LoadBalancer Service usually exposes the Ingress controller externally.

Typical flow:

Internet
  -> Cloud LoadBalancer (Service type: LoadBalancer for ingress controller)
  -> Ingress Controller
  -> ClusterIP Services
  -> Pods
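The second hop in that flow is an ordinary Service in front of the controller pods. A sketch, assuming a controller labeled app: ingress-controller in a namespace called ingress-system; both names are hypothetical, and real controllers ship their own manifests or Helm charts.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ingress-controller        # hypothetical name
  namespace: ingress-system       # hypothetical namespace
spec:
  type: LoadBalancer              # the cloud provider provisions the external LB
  selector:
    app: ingress-controller       # hypothetical controller pod label
  ports:
    - name: http
      port: 80
      targetPort: 80
    - name: https
      port: 443
      targetPort: 443
```

Application Services behind the controller can then stay ClusterIP, so the cluster has one public entry point rather than one load balancer per app.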

Use a Service of type: LoadBalancer on its own (without Ingress) when you need simple direct exposure, especially for a single service or for non-HTTP TCP/UDP traffic.

Ingress

An Ingress manages external HTTP/HTTPS access to services. It provides:

Host-based routing — api.example.com goes to one service, app.example.com to another.
Path-based routing — /api goes to the backend, / to the frontend.
TLS termination — HTTPS at the edge.

For HTTP semantics (status codes, headers, timeouts at the edge) and TLS termination patterns, see HTTP for Operators. For certificate lifecycle on AWS (ACM) and pointers to cert-manager, see TLS and Certificates and Operators (cert-manager).

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend
                port:
                  number: 80
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: backend
                port:
                  number: 80

An Ingress needs an Ingress Controller (e.g. NGINX Ingress Controller, Traefik) actually running in the cluster to work. For controller selection, TLS/mTLS patterns, and cert-manager wiring, see Ingress Controllers.
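The example above routes plain HTTP. TLS termination is added with a tls block that points at a Secret of type kubernetes.io/tls. A sketch, assuming an IngressClass named nginx and a Secret called app-example-com-tls already exist; both names are placeholders.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress-tls
spec:
  ingressClassName: nginx                # assumption: matches an installed IngressClass
  tls:
    - hosts:
        - app.example.com
      secretName: app-example-com-tls    # hypothetical Secret holding tls.crt and tls.key
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend
                port:
                  number: 80
```

The Secret must live in the same namespace as the Ingress. How the certificate gets there (manual upload, cert-manager, or a cloud integration) depends on the controller and is outside this sketch.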

DNS

Kubernetes runs an internal DNS service (CoreDNS). Every Service gets a DNS name:

<service-name> — within the same namespace.
<service-name>.<namespace>.svc.cluster.local — fully qualified.

Pods can resolve service names automatically; no hardcoded IPs needed.

CoreDNS scaling and failure modes

Under load, DNS becomes a shared dependency for the whole cluster. What tends to break first:

Too few CoreDNS replicas for cluster QPS — elevated latency on every dependency that resolves names.
Upstream forwarder issues (corporate resolver, VPC DNS) — timeouts bubble up as app errors.
NodeLocal DNSCache misconfiguration — faster local cache when correct; black holes when not.
Large ndots / search list — amplifies query volume from musl/glibc resolvers.

Mitigations: right-size CoreDNS replicas or its autoscaler, validate the forward plugin in the CoreDNS ConfigMap, consider NodeLocal DNSCache for high-QPS clusters, and watch CoreDNS memory as caches grow.
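On the ndots point: with the default ClusterFirst DNS policy, pods typically get ndots:5, so any name with fewer than five dots is tried against each search domain before it is queried as-is. One per-workload mitigation is to lower ndots through dnsConfig. A sketch with a placeholder pod name and image; the value 2 is an example, not a recommendation for every workload.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dns-tuned                              # hypothetical
spec:
  containers:
    - name: app
      image: registry.example.com/my-app:1.0   # placeholder image
  dnsConfig:
    options:
      - name: ndots
        value: "2"                             # names with 2+ dots are queried as absolute first
```

Short Service names such as my-app still resolve through the search list, while external names such as api.example.com skip the search-domain attempts.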

Network Policies

By default, all pods can talk to all other pods. NetworkPolicies restrict traffic (like a firewall). They require a CNI plugin that enforces them (e.g. Calico, Cilium); on a CNI without support, the API accepts the objects but nothing is enforced.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-only
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - port: 8080

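This policy selects pods labeled app: backend and allows ingress to them only from pods labeled app: frontend in the same namespace, on TCP port 8080. All other ingress to the backend pods is dropped; their egress is unaffected.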

Default deny, then allow (pattern)

For least privilege, start from deny all ingress for a namespace (or tenant), then add explicit NetworkPolicy resources for each allowed flow. Example deny-all ingress:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: team-a
spec:
  podSelector: {}
  policyTypes: [Ingress]

Add additional policies per app to permit required sources and ports. See Network policies for metadata egress blocks and CNI notes.
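The same pattern applies to egress, with one catch: a blanket egress deny also blocks DNS lookups, so an allowance for the cluster DNS pods usually ships with it. A sketch for the team-a namespace, assuming CoreDNS runs in kube-system with the conventional k8s-app: kube-dns label. Verify both in your cluster; NodeLocal DNSCache changes the destination.

```yaml
# Deny all egress for every pod in team-a.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
  namespace: team-a
spec:
  podSelector: {}
  policyTypes: [Egress]
---
# Re-allow DNS so name resolution keeps working.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: team-a
spec:
  podSelector: {}
  policyTypes: [Egress]
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system   # label set automatically on namespaces
          podSelector:
            matchLabels:
              k8s-app: kube-dns                          # assumption: conventional CoreDNS label
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Policies are additive, so the allow rule widens what the deny-all policy leaves open; there is no ordering or priority between them.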

AWS, DNS, Ingress, and Service Mesh

On AWS, a Service of type: LoadBalancer and an Ingress (with the AWS Load Balancer Controller) create NLB and ALB resources respectively. See Elastic Load Balancing and Route 53 for public DNS in front of those endpoints.
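As a rough illustration of the Service side, the sketch below asks the AWS Load Balancer Controller for an internet-facing NLB with IP targets. It assumes the controller is installed and that these annotations match its version; the Service name, selector, and ports are placeholders, so check the controller's documentation before relying on any of it.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-tcp-app                   # hypothetical
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "external"            # hand off to the AWS Load Balancer Controller
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"       # register pod IPs as targets
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"   # public rather than internal
spec:
  type: LoadBalancer
  selector:
    app: my-tcp-app                  # hypothetical
  ports:
    - port: 443
      targetPort: 8443
      protocol: TCP
```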

For L7 routing, retries, and mTLS inside the cluster, Istio (and other meshes) sit alongside Ingress. See Istio.

Gateway API vs Ingress

Both solve north–south HTTP(S) routing into the cluster, but they differ in API shape and ownership — not in “magic performance.”

Topic	Ingress (`Ingress`, `IngressClass`)	Gateway API (`Gateway`, `GatewayClass`, `HTTPRoute`, …)
Maturity	Ubiquitous; familiar to most platform teams	Newer; requires a controller that implements Gateway API (often the same vendor as your Ingress controller)
Roles	Annotations-heavy; host/path rules on one object	Route objects attach to a Gateway — clearer split between platform (Gateway, TLS, shared listener) and app teams (HTTPRoute)
Extensibility	Vendor-specific annotations for advanced behavior	Typed route resources and policy attachments (implementation-dependent)
Meshes	Classic Ingress Gateway pattern	Meshes (for example Istio) increasingly support Gateway API as a front door alongside or instead of `Ingress` — see Istio

When to stay on Ingress: existing controllers, charts, and team muscle already work; you only need host/path TLS termination.

When to adopt Gateway API: multi-team clusters where delegation (who owns listeners vs routes), TLS policy, and portable route objects reduce annotation sprawl — after your chosen controller is GA for the features you need.
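The ownership split can be sketched with two objects: a Gateway owned by the platform team and an HTTPRoute owned by an app team. The namespaces, GatewayClass name, and Secret name below are hypothetical, and the sketch assumes a controller that implements gateway.networking.k8s.io/v1.

```yaml
# Platform team: shared listener and TLS.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: shared-gateway
  namespace: infra                        # hypothetical platform namespace
spec:
  gatewayClassName: example-class         # hypothetical; provided by your controller
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      hostname: app.example.com
      tls:
        mode: Terminate
        certificateRefs:
          - name: app-example-com-tls     # hypothetical TLS Secret in the infra namespace
      allowedRoutes:
        namespaces:
          from: All                       # let routes from other namespaces attach
---
# App team: routing rules only.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: app-routes
  namespace: team-a
spec:
  parentRefs:
    - name: shared-gateway
      namespace: infra
  hostnames:
    - app.example.com
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api
      backendRefs:
        - name: backend
          port: 80
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: frontend
          port: 80
```

This mirrors the host and path rules from the Ingress example, but TLS and the listener sit on an object the app team does not need write access to.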

Controller selection, TLS/mTLS, and AWS Load Balancer Controller patterns are covered in Ingress Controllers.

Network troubleshooting flow — Order of operations from symptom to mesh.
Scheduling and placement — When Pending is a placement or capacity problem, not pure L3/L4.
EKS — Control plane and worker patterns on AWS.