Applying SRE principles to CI/CD
Automating builds & tests with CI/CD lets us ship code frequently, with a high level of confidence that bugs won’t reach end users. Why, then, are our CI/CD systems still so often painfully slow & unreliable, and our ability to deliver so frequently blocked?
Site Reliability Engineering (SRE) aims to reduce the pain caused by unhealthy platforms & processes that affect the reliability & stability of production systems. Join Buildkite’s Mel Kaulfuss as she looks at CI/CD through the SRE lens.
Learn how to bring SRE principles and practices to CI/CD, including:
- Defining meaningful SLOs (service-level objectives) & SLIs (service-level indicators)
- Observing system performance & metrics
- Using error budgets to tune your test suites & pipelines
By managing your CI/CD infrastructure & processes as you would your production systems, with an SRE mindset, you’ll be able to respond quickly when things go wrong & reclaim control over large, slow and unreliable build and deploy processes.
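As a rough illustration of how an SLI, an SLO and an error budget might fit together for a pipeline, here is a minimal Python sketch. It is not taken from the talk: the `Build` record, the choice of SLI ("builds that pass within a time limit"), and the example thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Build:
    """Hypothetical record of one CI build (not a real Buildkite type)."""
    passed: bool
    duration_s: float


def good_build_fraction(builds: Sequence[Build], max_duration_s: float) -> float:
    """SLI: fraction of builds that passed AND finished within max_duration_s."""
    if not builds:
        raise ValueError("need at least one build to compute an SLI")
    good = sum(1 for b in builds if b.passed and b.duration_s <= max_duration_s)
    return good / len(builds)


def error_budget_remaining(sli: float, slo: float) -> float:
    """Fraction of the error budget still unspent.

    The budget is the share of 'bad' builds the SLO tolerates (1 - slo).
    1.0 means nothing spent, 0.0 means fully spent, negative means the
    SLO has been breached.
    """
    allowed_bad = 1.0 - slo
    if allowed_bad <= 0.0:
        raise ValueError("an SLO of 100% leaves no error budget")
    actual_bad = 1.0 - sli
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    # Assumed example: 20 builds, one of which failed; 10-minute time limit.
    builds = [Build(passed=True, duration_s=300.0) for _ in range(19)]
    builds.append(Build(passed=False, duration_s=300.0))
    sli = good_build_fraction(builds, max_duration_s=600.0)
    print(f"SLI: {sli:.2%}")
    print(f"Budget remaining at a 90% SLO: {error_budget_remaining(sli, 0.90):.0%}")
```

With a budget figure like this in hand, a team could decide, for example, to pause feature work on the pipeline and fix flaky or slow tests once the remaining budget runs low.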