Practices Overview

First Published By Atif Alam

This section covers practices: the human, process, and tooling patterns that surround the technical work covered elsewhere in the library. The pages here are written for senior engineers on infrastructure, platform, and SRE teams; titles and ladder levels vary by organization, but the practices generalize.

The library does not replace your organization’s training, ladder rubrics, or vendor-specific playbooks. It connects practice patterns to the rest of this site’s content (CI/CD, observability, Kubernetes, QA, AIOps).

  • Leadership and Mentoring — Mentoring structures, coaching debugging methodology, feedback patterns, calibrating technical judgment, roadmap influence, and resolving cross-team prioritization conflicts.
  • Agile for SRE and Platform Work — Scrum and Kanban applied to interrupt-driven platform work, sprint commitments alongside on-call, toil budgets, ceremonies that help vs ceremony theater, and Definition of Done for infrastructure changes.
  • Incident Tooling and Customer Communications — On-call schedules, escalation policies, status pages (internal vs external), severity-driven customer comms templates, and stakeholder updates during long incidents. Pattern-first; vendor-second.
Topic | Where to Go
Reliability and quality | QA, QA reliability guide
Incident command and postmortems | Incident response and on-call
Pipeline guardrails for platform teams | CI/CD best practices
Service readiness before production | Service readiness checklist