Reliable Cloud Infrastructure - Design and Process
Through a combination of presentations, challenges, and hands-on labs, participants learn to design GCP deployments that are highly reliable and secure, and to operate them in a highly available and cost-effective manner.
What you'll learn
This online course equips students to build highly reliable and efficient solutions on Google Cloud Platform (GCP), using proven design patterns and principles derived from Google Site Reliability Engineering (SRE). It is a continuation of the Architecting with Google Cloud Platform courses and assumes hands-on experience with the technologies covered in those courses.
This course teaches participants the following skills:
- Design for high availability, scalability, and maintainability
- Assess tradeoffs and make sound choices among Google Cloud Platform products
- Integrate on-premises and cloud resources
- Identify ways to optimize resources and minimize cost
- Implement processes that minimize downtime, such as monitoring and alarming, unit and integration testing, production resilience testing, and incident post-mortem analysis
- Implement policies that minimize security risks, such as auditing, separation of duties, and least privilege
- Implement technologies and processes that assure business continuity in the event of a disaster
Table of contents
- Defining the Service: Course Overview 2m
- Defining the Service: Overview 9m
- Defining the Service: State and Solution 6m
- Defining the Service: Measurement 12m
- Defining the Service: Gathering Requirements 5m
- Introducing an example Photo Application service 4m
- Lab Intro: Deployment Manager 1m
- Getting started with GCP and Qwiklabs 4m
- Deployment Manager: Beginning appserver 0m
- Design for Resiliency, Scalability, and Disaster Recovery: Overview 2m
- Design for Resiliency, Scalability, and Disaster Recovery: Failure Due to Loss 5m
- Design for Resiliency, Scalability, and Disaster Recovery: Failure Due to Overload 6m
- Design for Resiliency, Scalability, and Disaster Recovery: Coping with Failure 5m
- Design for Resiliency, Scalability, and Disaster Recovery: Business Continuity and Disaster Recovery 7m
- Design for Resiliency, Scalability, and Disaster Recovery: Scalable and Resilient Design 9m
- Out of Service! 9m
- Design Challenge 4: Redesign for Time 4m
- Design for Security: Overview 2m
- Design for Security: Cloud Security 2m
- Design for Security: Network Access Control and Firewalls 4m
- Design for Security: Protections Against Denial of Service 3m
- Design for Security: Resource Sharing and Isolation 6m
- Design for Security: Data Encryption and Key Management 3m
- Design for Security: Identity Access and Auditing 4m
- Photo service: Intentional Attack 6m
- Design Challenge 5: Defense In Depth 2m
- Deployment, Monitoring and Alerting, and Incident Response: Overview 2m
- Deployment, Monitoring and Alerting, and Incident Response: Deployment 2m
- Deployment, Monitoring and Alerting, and Incident Response: Monitoring and Alerting 12m
- Deployment, Monitoring and Alerting, and Incident Response: Incident Response 10m
- Stabilization and operation 1m
- Design Challenge 7: Monitoring and Alerting 3m
- Deployment Manager - Full Production 2m
- Deployment Manager: Full Production + (Stackdriver) 0m