Etcd and Control Plane Health

By Atif Alam

etcd is the Kubernetes control plane’s source of truth for API objects. The API server is the only component that talks to etcd directly for normal reads/writes; controllers and the scheduler watch the API.

When etcd is slow or unhealthy, the rest of the control plane shows it:

  • API latency spikes and timeouts on writes or on large list/watch operations.
  • Stale watches: controllers stop reconciling promptly, and kubectl may show surprising delays.
  • Leader election flaps for controller-manager and scheduler if the API path to etcd is unstable.
  • In the worst case, loss of quorum: the API becomes read-only or unavailable, depending on the failure mode.

If you operate etcd yourself:

  • Take consistent snapshots on a schedule supported by your distro (vendor docs often cover the snapshot API and a defrag policy).
  • Treat quorum loss as an emergency: restore from backup only with a tested runbook, and never “guess” at etcd data files.

Raft, Consensus, and What Split-Brain Means

Kubernetes stores API objects in etcd, which implements Raft consensus across (typically) three or five members:

  • Leader election: one member accepts writes; followers replicate the log. If the leader fails, followers elect a new leader after a timeout.
  • Quorum: a majority of members must agree for a write to commit. A three-member cluster tolerates one failure and a five-member cluster tolerates two; a third failure in a five-member cluster loses quorum. Odd counts are preferred because they avoid ties over “who is the majority?”, and an even count raises the majority size without tolerating any more failures.
  • Split-brain (informal): in generic distributed systems, people mean two partitions that each think they are authoritative. etcd avoids two live writers for the same cluster: without a majority, the minority partition becomes read-only or unavailable for writes rather than accepting divergent state.
  • The real-world danger is not “magic split-brain” but operational mistakes: restoring an old backup into a live cluster, forking membership, or losing quorum during unplanned network partitions. Follow vendor restore procedures exactly.
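
The majority rule above comes down to a little integer arithmetic. The following is a minimal sketch, not etcd code; the function names are made up for illustration.

```python
def quorum_size(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - quorum_size(members)


def partition_can_write(reachable: int, members: int) -> bool:
    """A partition can commit writes only if it holds a majority."""
    return reachable >= quorum_size(members)


if __name__ == "__main__":
    # Compare odd and even cluster sizes side by side.
    for n in (3, 4, 5):
        print(f"members={n} quorum={quorum_size(n)} "
              f"tolerates={failures_tolerated(n)}")
```

Four members tolerate the same single failure as three, which is why odd sizes are the usual choice. In a five-member cluster split 3/2, only the three-member side satisfies `partition_can_write`.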

On managed clusters (for example EKS), you do not operate etcd members, but API latency, watch storms, and large objects still stress the same control-plane path. For in-cluster API metrics and watch churn, see EKS troubleshooting cheat sheet — Symptom 8: API Server / Control Plane Slow.

Proving control-plane components are healthy

Component | Practical signal
API server | kubectl get --raw /readyz?verbose and /livez; successful CRUD on a dummy ConfigMap
etcd | Managed: cloud health; self-managed: member list, metrics (etcd_server_has_leader, disk fsync latency)
Scheduler | Pending pods get Scheduled events; scheduler logs without leader errors
Controller manager | Replica counts match Deployments; Node lifecycle works; logs show stable leadership
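
Two of the signals above are plain text and easy to check in a script. The sketch below assumes the verbose readyz output uses `[+]name ok` and `[-]name failed: ...` lines, and that the etcd metrics endpoint exposes `etcd_server_has_leader` as an unlabelled gauge in Prometheus text format. The helper names are hypothetical, and fetching the text (for example with kubectl or curl) is left out.

```python
from typing import Dict, Optional


def parse_readyz(text: str) -> Dict[str, bool]:
    """Map each check name in verbose readyz output to True (ok) or False."""
    checks: Dict[str, bool] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("[+]") or line.startswith("[-]"):
            parts = line[3:].split()
            if parts:  # skip a bare marker with no check name
                checks[parts[0]] = line.startswith("[+]")
    return checks


def etcd_has_leader(metrics_text: str) -> Optional[bool]:
    """Return the leader status from Prometheus-format text, or None if absent."""
    for line in metrics_text.splitlines():
        if line.startswith("#"):  # HELP and TYPE comment lines
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "etcd_server_has_leader":
            return float(parts[1]) == 1.0
    return None


if __name__ == "__main__":
    # Hand-written sample input, not captured from a real cluster.
    sample = "[+]ping ok\n[-]etcd failed: reason withheld\nreadyz check failed\n"
    failing = [name for name, ok in parse_readyz(sample).items() if not ok]
    print("failing checks:", failing)
    print("has leader:", etcd_has_leader("etcd_server_has_leader 1\n"))
```

Returning None when the metric is missing keeps "no data" distinct from "no leader", which matters on managed clusters where etcd metrics may not be exposed at all.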

For control-plane certificate rotation, always follow your vendor runbook (kubeadm, OpenShift, EKS control plane, etc.). The general pattern:

  1. Back up etcd (if you own it) and export a known-good kubeconfig.
  2. Rotate apiserver → kubelet and kubelet client certs in the documented order for your stack.
  3. Restart control plane static pods or systemd units as required; watch node connectivity.
  4. Re-distribute kubeconfigs to admins if client CA or front-proxy certs changed.
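
Before and after a rotation it can help to confirm how long each certificate has left. The sketch below assumes you already have the `notAfter=...` line that `openssl x509 -enddate -noout -in <cert>` prints. The function names and the 30-day threshold are illustrative assumptions, not part of any vendor runbook.

```python
from datetime import datetime, timezone


def parse_not_after(line: str) -> datetime:
    """Parse a line like 'notAfter=Jun  1 12:00:00 2030 GMT' into a UTC datetime."""
    value = line.strip().split("=", 1)[1]
    if value.endswith(" GMT"):
        value = value[: -len(" GMT")]
    # openssl pads single-digit days with an extra space; collapse it.
    value = " ".join(value.split())
    parsed = datetime.strptime(value, "%b %d %H:%M:%S %Y")
    return parsed.replace(tzinfo=timezone.utc)


def days_until_expiry(not_after: datetime, now: datetime) -> int:
    """Whole days between `now` and the certificate's expiry."""
    return (not_after - now).days


def needs_rotation(not_after: datetime, now: datetime, threshold_days: int = 30) -> bool:
    """True when fewer than `threshold_days` days remain (threshold is an assumption)."""
    return days_until_expiry(not_after, now) < threshold_days


if __name__ == "__main__":
    expiry = parse_not_after("notAfter=Jun  1 12:00:00 2030 GMT")
    now = datetime.now(timezone.utc)
    print("days left:", days_until_expiry(expiry, now))
    print("rotate soon:", needs_rotation(expiry, now))
```

Passing `now` explicitly keeps the functions deterministic, so the same check can run against every certificate in a rotation and be compared before and after the restart step.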

For kubeconfig TLS troubleshooting on the client side, see Kubeconfig and authentication.