  • Course

Culturing Resiliency with Data: A Taxonomy of Outages

This talk provides an overview of how outages at Uber over the past few years have been categorized by root cause type.

Intermediate
29m
(6)

Created by Gremlin

Last Updated Jan 25, 2024

This course is included in the libraries shown below:

  • Core Tech

What you'll learn

This talk provides an overview of how outages at Uber over the past few years have been categorized by root cause type. We'll start with some background information, including definitions, the incident management framework, and existing preventive techniques (also known as best practices). Next, we'll cover the details and rationale behind the individual categories and sub-categories, along with their relative distribution. Then we'll take a deep dive into two of the biggest categories, deployment and capacity, with a focus on time-series-based data mining techniques that assist in detecting and simulating some of the common root causes. Finally, we'll discuss how the lessons learned are propagated as policy and process changes based on these insights.

About the author
Gremlin
32 courses, 3.7 author rating, 18 ratings

Gremlin's enterprise Chaos Engineering platform makes it easy to build more reliable applications in order to prevent outages, innovate faster, and earn customer trust.
