Site Reliability Engineering: Measuring and Managing Reliability

This course teaches the theory of Service Level Objectives (SLOs), a principled way of describing and measuring the desired reliability of a service. Upon completion, learners should be able to apply these principles to develop the first SLOs for services they are familiar with in their own organizations. Learners will also see how to use Service Level Indicators (SLIs) to quantify reliability and error budgets to drive business decisions about engineering for greater reliability. They will come to understand the components of a meaningful SLI and walk through the process of developing SLIs and SLOs for an example service.
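
As a rough illustration of how an SLI, an SLO, and an error budget relate, the following minimal sketch may help. It is not taken from the course: the function names, the request counts, and the 99.9% target are all hypothetical, and it assumes a simple availability SLI defined as good events divided by total events.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """Return the SLI as the fraction of events that were good."""
    if total_events <= 0:
        # Assumption for this sketch: a window with no traffic counts as fully reliable.
        return 1.0
    return good_events / total_events


def error_budget_remaining(sli: float, slo_target: float) -> float:
    """Return the unspent fraction of the error budget (negative if overspent)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    allowed_failure = 1.0 - slo_target  # the error budget implied by the SLO
    observed_failure = 1.0 - sli        # the share of events that were bad
    return 1.0 - observed_failure / allowed_failure


# Hypothetical window: 1,000,000 requests, 500 of them failed, SLO target 99.9%.
sli = availability_sli(good_events=999_500, total_events=1_000_000)
remaining = error_budget_remaining(sli, slo_target=0.999)
print(f"SLI = {sli:.4%}, error budget remaining = {remaining:.1%}")
```

In this hypothetical window the service spends about half of its budget, since 500 failures occurred where roughly 1,000 were allowed. A team might treat a shrinking or negative remainder as a signal to favor reliability work over new releases.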
Course info
Rating: (36)
Level: Advanced
Updated: Jan 23, 2020
Duration: 2h 39m
Table of contents
Introduction to SRE
Targeting Reliability
Operating for Reliability
Choosing a Good SLI
Developing SLOs and SLIs
Quantifying Risks to SLOs
Consequences of SLO Misses

About the author

Build, innovate, and scale with Google Cloud Platform.
