Architecture

By Atif Alam

A Kubernetes cluster has two layers: the control plane (the brain) and worker nodes (where your containers actually run).

The control plane makes global decisions (scheduling, detecting failures, responding to events). It usually runs on dedicated nodes.

  • API Server (kube-apiserver) — The front door. Every kubectl command, every internal component, and every external integration talks to the cluster through the API server (REST over HTTPS).
  • etcd — A distributed key-value store that holds all cluster state (what pods exist, what nodes are available, config). The single source of truth.
  • Scheduler (kube-scheduler) — Watches for newly created pods with no assigned node and picks the best node based on resource requirements, affinity rules, and constraints. For predicate vs priority mechanics and debugging, see Scheduling and placement.
  • Controller Manager (kube-controller-manager) — Runs a set of controllers (loops) that watch cluster state and make changes to move toward the desired state. Examples: ReplicaSet controller (ensures the right number of pods), Node controller (detects when a node goes down).
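The scheduler's decision can be pictured as a filter step (which nodes can hold the pod at all) followed by a score step (which feasible node is best). The sketch below is a toy model under stated assumptions, not kube-scheduler's actual code: Node, Pod, fits, score, and pick_node are hypothetical names, and the scoring rule (prefer the node with the most CPU left over) is chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    free_cpu_millis: int  # unreserved CPU, in millicores
    free_mem_mib: int     # unreserved memory, in MiB


@dataclass
class Pod:
    name: str
    cpu_millis: int       # CPU request
    mem_mib: int          # memory request


def fits(pod: Pod, node: Node) -> bool:
    """Filter step: the node must have room for the pod's requests."""
    return (node.free_cpu_millis >= pod.cpu_millis
            and node.free_mem_mib >= pod.mem_mib)


def score(pod: Pod, node: Node) -> int:
    """Score step (toy rule, an assumption): more CPU left over is better."""
    return node.free_cpu_millis - pod.cpu_millis


def pick_node(pod: Pod, nodes: list[Node]) -> Optional[str]:
    feasible = [n for n in nodes if fits(pod, n)]
    if not feasible:
        return None  # no node fits, so the pod would stay Pending
    return max(feasible, key=lambda n: score(pod, n)).name


nodes = [
    Node("node-a", free_cpu_millis=500, free_mem_mib=1024),   # too little CPU
    Node("node-b", free_cpu_millis=2000, free_mem_mib=4096),  # fits
    Node("node-c", free_cpu_millis=4000, free_mem_mib=256),   # too little memory
]
chosen = pick_node(Pod("web", cpu_millis=1000, mem_mib=512), nodes)
print(chosen)
```

Here only node-b passes the filter, so it is chosen regardless of score. The real scheduler also weighs affinity rules, taints, and other constraints that this sketch leaves out.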

Each worker node runs your application containers and reports back to the control plane.

  • kubelet — An agent on every node. It receives pod specs from the API server and ensures the described containers are running and healthy.
  • kube-proxy — Manages network rules on each node so that traffic to a Service reaches the right pods (via iptables or IPVS).
  • Container Runtime — The software that actually runs containers (e.g. containerd, CRI-O). Kubernetes talks to it through the Container Runtime Interface (CRI).
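To make the kube-proxy bullet concrete, the following sketch models the mapping it maintains: a Service's virtual IP and port on one side, the ready pod endpoints on the other. It is an illustration only. Real kube-proxy programs kernel rules (iptables or IPVS) instead of handling traffic in Python, and the addresses, the service_backends table, and the route function are all hypothetical.

```python
import random

# Hypothetical snapshot of what kube-proxy learns from the API server:
# (Service virtual IP, port) -> list of ready (pod IP, port) endpoints.
service_backends = {
    ("10.96.0.10", 80): [("10.244.1.5", 8080), ("10.244.2.7", 8080)],
}


def route(dest_ip, dest_port, rng=random):
    """Pick a backend pod for traffic addressed to a Service.

    Returns None when the destination is not a known Service address
    or the Service has no ready endpoints.
    """
    backends = service_backends.get((dest_ip, dest_port))
    if not backends:
        return None
    return rng.choice(backends)  # random selection among the endpoints


backend = route("10.96.0.10", 80)
print(backend)
```

The point of the sketch is the indirection: clients address the stable Service IP, and the per-node rules decide which pod actually receives the connection.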
                  +---------------------------+
                  |       Control Plane       |
                  |                           |
kubectl --------> |        API Server         |
                  |             |             |
                  |  Scheduler    Controller  |
                  |      |         Manager    |
                  |  etcd (cluster state)     |
                  +---------------------------+
                                |
               +----------------+----------------+
               |                                 |
      +--------v--------+               +--------v--------+
      |  Worker Node 1  |               |  Worker Node 2  |
      |                 |               |                 |
      |  kubelet        |               |  kubelet        |
      |  kube-proxy     |               |  kube-proxy     |
      |  container      |               |  container      |
      |  runtime        |               |  runtime        |
      |                 |               |                 |
      |  [Pod] [Pod]    |               |  [Pod] [Pod]    |
      +-----------------+               +-----------------+
  1. You run kubectl apply -f deployment.yaml.
  2. The API server validates the request and stores the desired state in etcd.
  3. The Deployment and ReplicaSet controllers create the corresponding Pod objects.
  4. The scheduler notices the unscheduled pods and assigns them to nodes.
  5. The kubelet on each chosen node pulls the container image and starts the pod.
  6. Controllers continuously reconcile: if a pod is lost (for example, it is deleted or its node fails), the ReplicaSet controller creates a replacement.
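The reconciliation in the last step above can be sketched as a loop that compares desired state with observed state and corrects the difference. This is a minimal simulation, not the ReplicaSet controller's real implementation; the reconcile function, the "web-N" pod names, and the in-memory list standing in for cluster state are all assumptions made for illustration.

```python
import itertools

_ids = itertools.count(1)  # source of unique suffixes for new pod names


def reconcile(desired_replicas: int, running: list[str]) -> list[str]:
    """One pass of a ReplicaSet-style loop: return the pod list after correction."""
    pods = list(running)
    while len(pods) < desired_replicas:  # too few: create replacements
        pods.append(f"web-{next(_ids)}")
    while len(pods) > desired_replicas:  # too many: remove the surplus
        pods.pop()
    return pods


pods = reconcile(3, [])    # initial rollout creates three pods
pods.remove(pods[0])       # simulate one pod disappearing
pods = reconcile(3, pods)  # the next pass restores the desired count
print(pods)
```

The loop never records what went wrong. It only looks at the gap between desired and actual counts, which is why the same logic handles first rollout, lost pods, and scale-down.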

API request lifecycle (kubectl apply → running Pod)

A simplified path through the API server (details vary by verb and resource):

  1. Authentication — client certificate, bearer token, or exec plugin (OIDC, cloud IAM) proves identity.
  2. Authorization — RBAC (and optional authorization webhooks) allow or deny the verb on the resource.
  3. Mutating admission — built-in defaults and mutating webhooks may patch the object (for example inject sidecars).
  4. Validating admission — OpenAPI validation and validating webhooks; failures reject the request.
  5. Persist to etcd — desired state is stored; watches notify controllers.
  6. Controllers reconcile — for example Deployment → ReplicaSet → Pod objects.
  7. Scheduler — assigns nodeName when a Pod is pending scheduling.
  8. kubelet + CRI — pulls images, starts containers, reports Ready and probe status upstream.
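Stages 1 through 5 above behave like a pipeline in which any stage can reject the request before it is stored. The sketch below models that ordering only. It is not how kube-apiserver is implemented, and every name in it (TOKENS, ALLOWED, store, handle, the injected label) is a hypothetical stand-in.

```python
class Rejected(Exception):
    """Raised when a stage refuses the request."""


TOKENS = {"token-abc": "alice"}          # hypothetical credential store
ALLOWED = {("alice", "create", "pods")}  # hypothetical RBAC-style grants
store = {}                               # stands in for etcd


def handle(token, verb, resource, obj):
    # 1. Authentication: who is calling?
    user = TOKENS.get(token)
    if user is None:
        raise Rejected("unauthenticated")
    # 2. Authorization: may this user perform this verb on this resource?
    if (user, verb, resource) not in ALLOWED:
        raise Rejected("forbidden")
    # 3. Mutating admission: patch the object (here, add a label).
    labels = {**obj.get("labels", {}), "injected": "true"}
    obj = {**obj, "labels": labels}
    # 4. Validating admission: reject objects that break the rules.
    if not obj.get("name"):
        raise Rejected("invalid object")
    # 5. Persist: only now is the desired state stored.
    store[(resource, obj["name"])] = obj
    return obj


created = handle("token-abc", "create", "pods", {"name": "web"})
print(created)
```

Mutation runs before validation so that validators see the final object, including anything a mutating step injected.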

See Admission controllers for webhook failurePolicy, timeouts, and debugging.

| Check | Command or signal |
| --- | --- |
| API readiness | kubectl get --raw '/readyz?verbose' |
| End-to-end auth | kubectl auth whoami and a trivial CRUD (ConfigMap) in a non-prod namespace |
| etcd | Managed: provider health dashboards; self-managed: member list, etcd_server_has_leader, disk sync latency |
| Scheduler | Pending test pod receives a Scheduled event; scheduler logs without leader errors |
| Controller manager | Deployment ReplicaSet ownership and steady Available replicas |
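The verbose readiness endpoint returns one line per check, so its output is easy to scan in a script. The helper below is a small sketch that assumes you have already captured the output of kubectl get --raw '/readyz?verbose' as text. The failed_checks function is hypothetical, and the sample text is illustrative rather than taken from a real cluster.

```python
def failed_checks(readyz_output: str) -> list[str]:
    """Return the names of checks marked as failing ("[-]") in verbose output."""
    failed = []
    for line in readyz_output.splitlines():
        line = line.strip()
        if line.startswith("[-]"):
            parts = line[3:].split()
            if parts:
                # "[-]etcd failed: reason withheld" -> "etcd"
                failed.append(parts[0])
    return failed


# Illustrative sample only; real output lists many more checks.
sample = """\
[+]ping ok
[+]log ok
[-]etcd failed: reason withheld
[+]informer-sync ok
readyz check failed
"""
print(failed_checks(sample))
```

An empty list means no check was marked as failing in the captured text. It does not prove the API server was reachable, so pair it with the exit status of the kubectl command.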

For etcd symptoms, snapshots, and control-plane certificate rotation, see etcd and control plane health.

  • The control plane is the decision-maker; worker nodes do the actual work.
  • All state lives in etcd; losing etcd means losing cluster state (back it up).
  • The API server is the single point of communication — everything goes through it.
  • Kubernetes is declarative: you describe what you want, and the system converges toward that state.