This is the public home of the GovTech GTO Operations Engineering Practice. Here we share how we keep government services reliable and scalable — operational lessons and open resources for teams across the industry.
Reliability & observability
Lessons on SLOs, monitoring, and incident response from running production services at scale.
Infrastructure & automation
Infrastructure-as-code, cloud operations, and the automation that keeps environments consistent and repeatable.