Site Reliability Engineering (SRE): The Big Picture

by Elton Stoneman

Site Reliability Engineering (SRE) is how Google runs production systems, promoting high availability with high velocity and removing operational toil. It achieves the same goals as DevOps without the culture shift.

What you'll learn

Site Reliability Engineering (SRE) is a set of principles and practices that supports software delivery: keeping production systems stable while still delivering new features at speed. In this course, Site Reliability Engineering (SRE): The Big Picture, you'll get a thorough overview of how SRE works and why it's a good choice for many organisations. First, you'll learn the differences between SRE, DevOps, and traditional operations. Next, you'll discover how engineering practices help to reduce toil and provide more time to focus on high-value tasks. Finally, you'll learn how SRE approaches monitoring, alerting, and incident management. When you're finished with this course, you'll be able to evaluate SRE and see if it's a good fit for your organisation.

About the author

Elton is a 10-time Microsoft MVP, author, trainer and speaker. He spent most of his career as a consultant working in Microsoft technologies, architecting and delivering complex solutions for industry leaders. He has delivered APIs on Azure serving millions of clients daily, Big Data solutions processing billions of events weekly, and cutting-edge solutions powered by containers. Elton's experience with .NET goes from .NET 1.0 running on Windows Server, right up to .NET Core running on Linux.
