From Dockerfile to Production on Kubernetes
This page is a sequential checklist for platform, SRE, and application teams taking a containerized service from a first Dockerfile to a production-ready deployment on Kubernetes. Each phase states the outcome you want and links to deeper library guides for manifests, Helm, GitOps, probes, and observability.
Hands-on YAML and runbooks live in linked pages and sibling examples under Kubernetes Examples — for example Deploy httpbin on k3s, Base workloads with Istio and Argo CD, and Installing Prometheus and Grafana on k3s.
Prerequisites
- A Kubernetes cluster you can reach with `kubectl` (k3s, EKS, or similar).
- A container runtime and a registry you can push to.
- Optional: Helm and an in-cluster GitOps controller (Argo CD or Flux) for later phases.
1. Harden the Dockerfile
Outcome: A small, reproducible image that runs safely and shuts down cleanly.
- Use a multi-stage build so compilers and build tools stay out of the runtime image.
- Run as a non-root user; prefer a pinned base tag (`slim` or distroless where it fits your stack).
- Keep layers minimal; use a `.dockerignore` so the build context stays small.
- Use exec form for `CMD`/`ENTRYPOINT` (or `tini` as PID 1) so SIGTERM reaches your process for graceful shutdown.
Example files
A minimal repo layout:

```text
.
├── Dockerfile
├── .dockerignore
└── app/            # source or build inputs for your stack
```

`Dockerfile`: multi-stage, pinned runtime base, non-root user, `tini` plus an exec-form start command (swap the build stage for your language or compiler image):
```dockerfile
# syntax=docker/dockerfile:1

# ---- build: compilers and package managers stay here ----
FROM debian:bookworm-slim AS build
WORKDIR /src
RUN apt-get update && apt-get install -y --no-install-recommends \
      ca-certificates build-essential \
    && rm -rf /var/lib/apt/lists/*
COPY . .
# Example: produce a single static binary at /out/server (adjust for your stack)
RUN make build OUT=/out/server

# ---- runtime: small image, no build tools ----
FROM debian:bookworm-slim
# In production, pin by digest, for example:
# FROM debian:bookworm-slim@sha256:<digest-from-docker-buildx-imagetools-inspect>

RUN apt-get update && apt-get install -y --no-install-recommends \
      ca-certificates tini \
    && rm -rf /var/lib/apt/lists/* \
    && groupadd --gid 1000 app \
    && useradd --uid 1000 --gid app --shell /usr/sbin/nologin --create-home app

WORKDIR /app
COPY --from=build --chown=app:app /out/server ./server

# Numeric UID:GID of the app user created above, so Kubernetes
# runAsNonRoot can verify the image does not run as root
USER 1000:1000
EXPOSE 8080

# exec form: tini forwards SIGTERM to the app process
ENTRYPOINT ["/usr/bin/tini", "--"]
CMD ["./server"]
```

Replace the build stage with your real compile or install steps (Go, Node, Python `pip install`, Rust, and so on). The runtime stage should only copy artifacts and run as non-root.
`.dockerignore`: keep the build context small and keep local files out of the image:

```text
.git
.github
.env
.env.*
*.md
Dockerfile*
docker-compose*.yml
**/__pycache__
**/.pytest_cache
**/node_modules
**/dist
**/target
**/.venv
*.log
```

Deeper reading: Docker best practices, Images and registries.
2. Make the app cloud-native
Outcome: The process behaves well on a scheduler: configurable, observable at boot, and safe under rollouts.
- Read configuration from environment variables (12-factor); avoid baking environment-specific settings into the image.
- Emit structured JSON logs to stdout / stderr for the platform to collect.
- On SIGTERM, drain in-flight work and exit within the pod termination grace period.
- Expose `/healthz` (liveness) and `/readyz` (readiness); readiness should fail when the app cannot safely take traffic.
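As a rough sketch of how the configuration and shutdown points look from the platform side, the manifests below inject settings as environment variables and set the termination grace period. Everything named here is an assumption for illustration: the `server-config` ConfigMap, the variable names (`LOG_FORMAT`, `LOG_LEVEL`, `SHUTDOWN_TIMEOUT_SECONDS`), and the image reference are hypothetical, and your app has to read them itself. A bare Pod is used only for brevity; the same fields belong in a Deployment's pod template.

```yaml
# Hypothetical names throughout; adjust to what your app actually reads.
apiVersion: v1
kind: ConfigMap
metadata:
  name: server-config
data:
  LOG_FORMAT: json      # assumed app setting: structured logs to stdout
  LOG_LEVEL: info
---
apiVersion: v1
kind: Pod
metadata:
  name: server
spec:
  # Time between SIGTERM and SIGKILL; the app must finish draining within it.
  terminationGracePeriodSeconds: 30
  containers:
    - name: server
      image: registry.example.com/team/server:3f2a9c1   # placeholder image
      envFrom:
        - configMapRef:
            name: server-config
      env:
        # Assumed app-level drain timeout, kept below the grace period
        # so the process exits on its own before it is killed.
        - name: SHUTDOWN_TIMEOUT_SECONDS
          value: "25"
```

Keeping the app's own drain timeout a few seconds shorter than `terminationGracePeriodSeconds` leaves headroom for the process to exit cleanly instead of being killed mid-request.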
Deeper reading: Production patterns — health checks, REST HTTP Fundamentals, Observability.
3. Observability hooks
Outcome: Operators can correlate requests, scrape metrics, and trace across services.
- Expose a Prometheus `/metrics` endpoint (or a dedicated metrics port) with stable metric names.
- Propagate trace context (OpenTelemetry) on inbound and outbound calls.
- Include a correlation or trace ID in structured logs and echo it on responses where appropriate.
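One way to get the metrics endpoint scraped, sketched under the assumption that the Prometheus Operator (for example via kube-prometheus-stack) is installed and provides the `ServiceMonitor` CRD. The Service name, the `app: server` label, port 9090, and the `release: prometheus` label are all placeholders; the last one has to match whatever selector your Prometheus instance uses to discover ServiceMonitors.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: server
  labels:
    app: server
spec:
  selector:
    app: server
  ports:
    - name: http
      port: 80
      targetPort: 8080
    - name: metrics        # dedicated metrics port (assumed to be 9090 in the app)
      port: 9090
      targetPort: 9090
---
# Requires the Prometheus Operator CRDs; not part of core Kubernetes.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: server
  labels:
    release: prometheus    # assumption: matches your Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: server
  endpoints:
    - port: metrics        # refers to the named Service port above
      path: /metrics
      interval: 30s
```

Without the operator, the equivalent is a scrape job in the Prometheus configuration that targets the same port and path.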
Deeper reading: Prometheus, OpenTelemetry, Observability for systems.
4. Image registry and tagging
Outcome: Every deploy references an immutable, scannable image in a registry you control.
- Push to a private registry (ECR, Google Artifact Registry, GHCR, or equivalent).
- Tag images with the git SHA (and optionally semver); avoid relying on `latest` alone in production.
- Run Trivy or Grype in your Makefile or CI and fail builds on policy violations you care about.
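The points above could be wired together roughly as follows. This is a sketch of a GitHub Actions workflow that builds, tags with the commit SHA, scans with Trivy, and pushes only if the scan passes. It assumes GHCR as the registry, a lowercase `owner/repo` name (GHCR rejects uppercase image names), and a severity policy of fixable CRITICAL/HIGH findings; the workflow and job names are arbitrary.

```yaml
# .github/workflows/image.yml (hypothetical file name)
name: build-scan-push
on:
  push:
    branches: [main]

permissions:
  contents: read
  packages: write          # needed to push to GHCR with GITHUB_TOKEN

env:
  IMAGE: ghcr.io/${{ github.repository }}   # assumes a lowercase owner/repo

jobs:
  image:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build image tagged with the commit SHA
        run: docker build -t "$IMAGE:$GITHUB_SHA" .

      - name: Scan before pushing
        # Pin this to a release tag or commit SHA in real pipelines.
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ env.IMAGE }}:${{ github.sha }}
          severity: CRITICAL,HIGH
          ignore-unfixed: true
          exit-code: "1"     # non-zero exit fails the job on findings

      - name: Push only if the scan passed
        run: docker push "$IMAGE:$GITHUB_SHA"
```

Scanning the locally built image before the push step means a failing scan never publishes the tag.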
Deeper reading: Images and registries, ECR, Supply chain security, Security scanning.
5. Kubernetes manifests
Outcome: A declarative baseline workload with guardrails for scheduling, health, and disruption.
- Define Deployment, Service, ConfigMap, Secret, and ServiceAccount as needed.
- Set requests and limits for CPU and memory; wire liveness, readiness, and startup probes to your health endpoints.
- Harden the pod: `securityContext.runAsNonRoot`, `readOnlyRootFilesystem` where possible, drop unnecessary capabilities.
- Add a PodDisruptionBudget so voluntary disruptions (node drains, cluster upgrades) respect minimum availability.
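A minimal sketch of a baseline that combines these guardrails. The names, image reference, resource numbers, and probe timings are placeholders to tune for your service; the probe paths assume the `/healthz` and `/readyz` endpoints from phase 2, and UID 1000 assumes the image's non-root user.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: server
  labels:
    app: server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: server
  template:
    metadata:
      labels:
        app: server
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000              # assumed UID of the image's app user
        runAsGroup: 1000
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: server
          image: registry.example.com/team/server:3f2a9c1   # placeholder
          ports:
            - name: http
              containerPort: 8080
          resources:                 # example sizes only
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
          startupProbe:              # gives slow starts up to 60s before liveness applies
            httpGet:
              path: /healthz
              port: http
            periodSeconds: 2
            failureThreshold: 30
          livenessProbe:
            httpGet:
              path: /healthz
              port: http
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /readyz
              port: http
            periodSeconds: 5
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop: ["ALL"]
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: server
spec:
  minAvailable: 2                    # with 3 replicas, drains evict one pod at a time
  selector:
    matchLabels:
      app: server
```

If the app writes temporary files, `readOnlyRootFilesystem: true` usually needs an `emptyDir` volume mounted at the writable path.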
Deeper reading: Manifests, Core objects, Production patterns, Pod security standards.
6. Package with Helm or Kustomize
Outcome: One chart or overlay set parameterized for dev, stage, and prod instead of copied YAML trees.
- Use Helm values (or Kustomize overlays) for replica counts, image tags, resource sizes, and feature flags.
- Prefer Helm when you version and release the application as a unit; use Kustomize when you mainly patch shared bases per environment.
- Keep secrets out of values files committed to Git — reference secret names or use external secret tooling (phase 8).
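To illustrate the overlay approach, here is a sketch of a production Kustomize overlay that changes only what differs from a shared base. The directory layout (`base/`, `overlays/prod/`), the Deployment name `server`, the image name, and the sizes are assumptions; a Helm setup would carry the same settings in a per-environment values file instead.

```yaml
# overlays/prod/kustomization.yaml (hypothetical path)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: prod
resources:
  - ../../base                 # shared Deployment, Service, and so on
images:
  - name: registry.example.com/team/server
    newTag: "3f2a9c1"          # quoted so an all-digit SHA is not parsed as a number
replicas:
  - name: server
    count: 3
patches:
  - target:
      kind: Deployment
      name: server
    patch: |-
      - op: replace
        path: /spec/template/spec/containers/0/resources/limits/memory
        value: 512Mi
```

Rendering with `kubectl kustomize overlays/prod` before committing shows exactly what the overlay changes relative to the base.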
Deeper reading: Helm, Helm templating, Helm, operators, and GitOps.
7. Ingress, TLS, and network policy
Outcome: North-south traffic is encrypted and named; east-west traffic follows least privilege.
- Expose the Service via Ingress or Gateway API with a clear host and path rules.
- Automate TLS with cert-manager (or your cloud load balancer’s managed certificates).
- Restrict pod-to-pod traffic with NetworkPolicy (default deny + explicit allow lists for ingress/egress).
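A sketch of the three pieces together, under several assumptions: an NGINX ingress controller running in the `ingress-nginx` namespace, cert-manager installed with a ClusterIssuer named `letsencrypt-prod`, a CNI that enforces NetworkPolicy, and the placeholder host `api.example.com`. Egress is limited to cluster DNS here; a real service would add rules for its own dependencies.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: server
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod   # assumed issuer name
spec:
  ingressClassName: nginx
  tls:
    - hosts: [api.example.com]
      secretName: server-tls       # cert-manager creates and renews this Secret
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: server
                port:
                  number: 80
---
# Default deny: selects every pod in the namespace, allows nothing.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]
---
# Explicit allow list for the server pods.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: server-allow
spec:
  podSelector:
    matchLabels:
      app: server
  policyTypes: [Ingress, Egress]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8080
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

NetworkPolicies are additive, so the allow policy widens access only for the pods it selects while the default-deny policy still covers everything else in the namespace.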
Deeper reading: Ingress controllers, Network policies, TLS and certificates.
8. Secrets management
Outcome: Credentials never live in images or plain-text Git; rotation has a defined path.
- Do not bake secrets into the image or commit them to Git as plain Secret manifests.
- Use External Secrets Operator, Sealed Secrets, or your cloud secret manager (SSM, Secrets Manager, Key Vault) to materialize Kubernetes Secrets at runtime.
- Separate deploy-time config (Helm values, ConfigMaps) from runtime credentials.
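As an illustration of the External Secrets Operator route, the manifest below asks the operator to create a Kubernetes Secret from an entry in an external store. It assumes the operator is installed and that a ClusterSecretStore named `aws-secrets-manager` already exists; that name, the remote key `prod/server/db`, and the property `password` are hypothetical. The `apiVersion` depends on the operator release: `v1beta1` is shown, and newer releases serve `v1`.

```yaml
apiVersion: external-secrets.io/v1beta1   # check which version your operator serves
kind: ExternalSecret
metadata:
  name: server-db
spec:
  refreshInterval: 1h            # re-sync interval; rotated values are picked up on refresh
  secretStoreRef:
    kind: ClusterSecretStore
    name: aws-secrets-manager    # assumed, pre-existing store
  target:
    name: server-db              # Kubernetes Secret the operator creates
  data:
    - secretKey: DATABASE_PASSWORD
      remoteRef:
        key: prod/server/db      # hypothetical entry in the external store
        property: password
```

Only this reference lives in Git; the credential itself stays in the external store, and the workload consumes the generated `server-db` Secret like any other.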
Deeper reading: Helm templating — External Secrets Operator, AWS secrets, Operators.
9. CI/CD and GitOps
Outcome: Every production change is built, tested, scanned, published, and reconciled from a known source.
- Pipeline stages: build → test → scan image → push → deploy (or hand off to GitOps).
- Wire GitHub Actions (or your CI) to produce immutable image tags and update deployment manifests or Helm values.
- Prefer GitOps (Argo CD or Flux) so the cluster pulls desired state from Git, which gives you an audit trail and rollback via `git revert`.
- Contrast a manual, push-style `kubectl apply` (see Deploy httpbin on k3s) with GitOps, where a controller continuously reconciles the cluster against Git.
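A sketch of the GitOps hand-off using Argo CD, assuming it runs in the `argocd` namespace and that deployment config lives in a separate repository. The repository URL, the path `overlays/prod`, and the names are placeholders.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: server-prod
  namespace: argocd              # assumes Argo CD is installed here
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/deploy-config.git   # placeholder repo
    targetRevision: main
    path: overlays/prod          # hypothetical path to manifests or an overlay
  destination:
    server: https://kubernetes.default.svc   # the cluster Argo CD runs in
    namespace: prod
  syncPolicy:
    automated:
      prune: true                # delete resources that were removed from Git
      selfHeal: true             # revert manual drift back to the Git state
    syncOptions:
      - CreateNamespace=true
```

With automated sync, CI only commits a new image tag to the config repository; Argo CD applies it, and reverting that commit rolls the change back.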
Deeper reading: GitHub Actions, GitOps, Pipeline fundamentals.
10. Autoscaling and reliability
Outcome: Load and maintenance events scale capacity safely; reliability targets exist before go-live.
- Add HPA on CPU, memory, or custom metrics (request rate, queue depth) — see Autoscaling on EKS and lab HPA on k3s.
- Plan node capacity (cluster autoscaler, Karpenter, or managed node groups) so pending pods can schedule.
- Define SLOs and error budgets before launch; align dashboards and alerts with the service readiness checklist.
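For the HPA point, a minimal CPU-based sketch. It assumes metrics-server (or another resource metrics provider) is running, that the target Deployment is named `server`, and that its containers set CPU requests, since utilization is computed as a percentage of the request. The replica bounds and the 70% target are example values.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: server
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: server                 # assumed Deployment name
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the pods' CPU requests
```

Once an HPA manages a Deployment, leave `replicas` out of the manifests you apply (or let the GitOps tool ignore that field) so reconciliation does not fight the autoscaler.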
Deeper reading: SLOs, SLIs, and error budgets, Production platform checklist, Production scenarios.
Further reading
| Topic | Library page |
|---|---|
| Platform layers and ownership | Production platform checklist |
| Manifest structure | Manifests |
| Probes, limits, PDB, rollouts | Production patterns |
| Autoscaling (HPA, nodes, metrics) | Autoscaling on EKS |
| GitOps and repo layout | GitOps |
| Example deploys | Kubernetes Examples |