
Multi-Cluster Management

By Atif Alam

Multi-cluster is a deliberate choice about blast radius, tenancy, regionality, and upgrade independence. It is not “we failed to standardize one cluster” — regulated teams, edge footprints, and large enterprises often require multiple API servers.

This page names the main SME-level tools and patterns. For tenant isolation inside one cluster, see Multi-tenancy and policy. For delivery mechanics (repos, sync, secrets), see GitOps — especially ApplicationSet-style patterns that render the same chart into many clusters.

Common reasons to run more than one cluster:

  • Failure isolation — a bad CRD/webhook upgrade or etcd incident in cluster A does not take cluster B offline.
  • Version skew experiments — canary a minor Kubernetes upgrade on a small cluster before the fleet.
  • Data residency — separate clusters per jurisdiction with no shared control plane.
  • Hard tenancy — some business units need cluster-admin-like freedom; separate clusters avoid impossible policy on a shared plane.

Costs include duplicated add-ons (CNI, DNS, ingress), identity integration per cluster, and observability correlation across boundaries.

Cluster API (CAPI) is a Kubernetes-style API for creating, upgrading, and deleting clusters themselves — usually by driving cloud provider machine APIs from a management cluster.

Common framing: “We treat clusters like cattle: Terraform or CAPI provisions the clusters, then we bootstrap workloads with GitOps.” Pair CAPI with your image / SBOM pipeline and the discipline described in Cluster upgrades.
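
To make "a Kubernetes-style API for clusters" concrete, here is a minimal sketch that builds a CAPI `Cluster` object as a plain Python dict and prints it as JSON, which `kubectl apply -f` accepts. The cluster name, the pod CIDR, the choice of `KubeadmControlPlane` and `AWSCluster`, and the exact API versions are assumptions for illustration; they vary by CAPI release and infrastructure provider, so check them against the versions you run.

```python
import json


def capi_cluster(name, namespace="default", pod_cidr="192.168.0.0/16"):
    """Build a Cluster API `Cluster` object as a plain dict.

    The Cluster object itself is small: it points at a control-plane
    object and an infrastructure object that the providers reconcile.
    """
    return {
        # Assumed API version; depends on the installed CAPI release.
        "apiVersion": "cluster.x-k8s.io/v1beta1",
        "kind": "Cluster",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "clusterNetwork": {"pods": {"cidrBlocks": [pod_cidr]}},
            # Assumed control-plane provider (kubeadm-based).
            "controlPlaneRef": {
                "apiVersion": "controlplane.cluster.x-k8s.io/v1beta1",
                "kind": "KubeadmControlPlane",
                "name": f"{name}-control-plane",
            },
            # Assumed infrastructure provider (AWS); swap for your own.
            "infrastructureRef": {
                "apiVersion": "infrastructure.cluster.x-k8s.io/v1beta2",
                "kind": "AWSCluster",
                "name": name,
            },
        },
    }


# "edge-eu-1" is a hypothetical cluster name.
manifest = capi_cluster("edge-eu-1")
print(json.dumps(manifest, indent=2))
```

A real cluster also needs the referenced control-plane, infrastructure, and machine objects; this sketch only shows the top-level object that ties them together.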

Karmada (placement across existing clusters)


Karmada (Kubernetes Armada) focuses on propagating workloads and policies to member clusters — scheduling and override semantics across a fleet from a host control plane.

Contrast with CAPI: CAPI stands up clusters; Karmada distributes work onto clusters that already exist. Teams sometimes combine both: CAPI for birth, Karmada (or another layer) for ongoing placement.
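
As a rough illustration of placement onto existing clusters, the sketch below builds a Karmada `PropagationPolicy` as a Python dict: it selects one Deployment and names the member clusters that should receive it. The policy name, Deployment name, and member cluster names are hypothetical, and the API version is an assumption to verify against your Karmada release.

```python
import json


def propagation_policy(name, deployment, clusters, namespace="default"):
    """Build a Karmada PropagationPolicy that sends one Deployment
    to an explicit list of member clusters."""
    return {
        # Assumed API version; check your Karmada release.
        "apiVersion": "policy.karmada.io/v1alpha1",
        "kind": "PropagationPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Which resource in the host control plane to propagate.
            "resourceSelectors": [
                {"apiVersion": "apps/v1", "kind": "Deployment", "name": deployment}
            ],
            # Where to place it: member clusters selected by name.
            "placement": {"clusterAffinity": {"clusterNames": list(clusters)}},
        },
    }


# Hypothetical names: a "web" Deployment placed on two member clusters.
policy = propagation_policy("web-placement", "web", ["member-eu", "member-us"])
print(json.dumps(policy, indent=2))
```

The Deployment itself is applied once to the host control plane; the policy only decides which member clusters get a copy.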

The common production pattern is one Git repo (or monorepo path) per concern and a GitOps controller per cluster (Argo CD, Flux) — or one Argo CD controlling many registered clusters — plus ApplicationSet generators to fan out Applications from cluster metadata, environments, or folder layout.

This is still multi-cluster even without Karmada: Git is the source of truth; each cluster reconciles its slice. See GitOps for the core workflow; treat secrets, drift, and sync waves as fleet-wide risks.
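
The fan-out idea can be sketched in plain Python. This is an imitation of what a cluster-based generator does conceptually, not Argo CD's implementation: given registered clusters with labels, it renders one Application-shaped dict per cluster that matches a label selector. The `Cluster` type, the `fan_out` function, the repo URL, the chart path, and the cluster names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """A registered cluster: name, API server URL, and metadata labels."""
    name: str
    server: str
    labels: dict = field(default_factory=dict)


def fan_out(clusters, selector, repo_url, chart_path):
    """Render one Application-like dict per cluster whose labels
    contain every key/value pair in `selector`."""
    apps = []
    for c in clusters:
        if all(c.labels.get(k) == v for k, v in selector.items()):
            apps.append({
                "apiVersion": "argoproj.io/v1alpha1",
                "kind": "Application",
                "metadata": {"name": f"{c.name}-ingress"},
                "spec": {
                    "project": "default",
                    # Same chart for every cluster; Git stays the source of truth.
                    "source": {
                        "repoURL": repo_url,
                        "path": chart_path,
                        "targetRevision": "HEAD",
                    },
                    # Only the destination differs per cluster.
                    "destination": {"server": c.server, "namespace": "ingress"},
                },
            })
    return apps


# Hypothetical fleet: two production clusters and one dev cluster.
fleet = [
    Cluster("prod-eu", "https://prod-eu.example.com", {"env": "prod"}),
    Cluster("prod-us", "https://prod-us.example.com", {"env": "prod"}),
    Cluster("dev", "https://dev.example.com", {"env": "dev"}),
]
apps = fan_out(fleet, {"env": "prod"},
               "https://example.com/platform/fleet.git", "charts/ingress")
print([a["metadata"]["name"] for a in apps])
```

Adding a cluster with matching labels to the fleet yields a new Application without touching the template, which is the property that makes this pattern scale.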

Vendors expose hosted fleet UIs/APIs (attach clusters, run policy packs, aggregate metrics). Conceptually they sit above or beside GitOps. They are useful for inventory and policy baselines, but your desired state should still live in versioned manifests unless you accept the drift that UI-only changes introduce.