Managing Cloud Resources Using Google Stackdriver

This course covers all important aspects of Stackdriver Monitoring, which works across all Google Cloud Platform resources and makes it convenient to set up uptime checks, profiling, and integration with other cloud platforms and monitoring tools.
Course info
Level
Beginner
Updated
Jan 16, 2019
Duration
1h 35m
Description

Stackdriver Monitoring is a powerful and versatile cloud monitoring tool that is tightly integrated with virtually every service on the Google Cloud Platform. You can significantly improve the performance and design of your architecture and simplify troubleshooting if you master the nuances of Stackdriver Monitoring. In this course, Managing Cloud Resources Using Google Stackdriver, you will gain the ability to monitor your cloud resources, track both system and user-defined metrics, and respond to alerts using Stackdriver Monitoring. First, you will learn Stackdriver concepts such as metrics, monitored resources, workspaces, and alerting policies. Along the way, you will learn how to install the Stackdriver Monitoring agent, and also when that agent is and is not required. Next, you will discover how to monitor third-party applications and work with custom metrics. You will create resources to monitor as well as metrics associated with those resources, then use the Metrics Explorer to create dashboards to keep track of those metrics. You will also configure uptime checks and alerts to notify you when resource health is not satisfactory. Stackdriver supports uptime checks over HTTP, HTTPS, and TCP. The probes sent by these checks are governed by VPC firewall rules, so those must be set up correctly as well. Finally, you will explore how to create checks for the absence of metrics, set variables in alerts, explore incidents and events, and integrate with third-party tools. Specifically, you will integrate Stackdriver Monitoring with Opsgenie, which is an alerting and incident management platform. You will round out the course by programmatically working with the Stackdriver Monitoring API from within Datalab Python notebooks. When you’re finished with this course, you will have the skills and knowledge of Stackdriver Monitoring needed to monitor, troubleshoot, and analyze the usage of your cloud resources.

About the author

A problem solver at heart, Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework.

More from the author
Analyzing Data with Qlik Sense
Intermediate
2h 11m
Jun 17, 2019
Using PyTorch in the Cloud: PyTorch Playbook
Intermediate
2h 21m
Apr 25, 2019
More courses by Janani Ravi
Section Introduction Transcripts

Course Overview
Hi, my name is Janani Ravi, and welcome to this course on Managing Cloud Resources Using Google Stackdriver. A little about myself: I have a master's in electrical engineering from Stanford and have worked at companies such as Microsoft, Google, and Flipkart. At Google I was one of the first engineers working on real-time collaborative editing in Google Docs, and I hold four patents for its underlying technologies. I currently work on my own startup, Loonycorn, a studio for high-quality video content. In this course, you will gain the ability to monitor your cloud resources, track both system and user-defined metrics, and respond to alerts using Stackdriver Monitoring. First, you will learn Stackdriver concepts such as metrics, monitored resources, workspaces, and alerting policies. In this process, we will learn how to install the Stackdriver Monitoring agent and the metrics you can track with this agent. Next, you'll discover how to monitor third-party applications and work with custom metrics. We will create resources to monitor, as well as metrics associated with these resources, and then use the Metrics Explorer to create dashboards to keep track of these metrics. You will also configure uptime checks and alerts to notify you when resource health is not satisfactory. Finally, you will explore how to create checks for the absence of metrics, how to set variables in alerts, how to explore incidents and events, and how to integrate with third-party tools. Specifically, you will integrate Stackdriver Monitoring with Opsgenie, which is an alerting and incident management platform. You'll round out the course by programmatically working with Stackdriver Monitoring APIs from within Datalab Python notebooks. When you're finished with this course, you'll have the skills and knowledge of Stackdriver Monitoring needed to monitor, troubleshoot, and analyze the usage of your cloud resources.

Introducing Stackdriver Monitoring
Hi, and welcome to this course on Managing Cloud Resources Using Google Stackdriver. Stackdriver encompasses an entire suite of tools for monitoring, logging, error reporting, tracing, and even debugging. However, in this course we'll be focusing on the set of tools that you'll probably be using most often if you're working on GCP, and that is Stackdriver Monitoring. We'll understand the various use cases of monitoring, and we'll see how Stackdriver Monitoring provides visibility into the performance, uptime, and overall health of cloud-powered applications. We'll see a good sample of metrics that are automatically tracked by Stackdriver. We'll see how we can use Stackdriver to write our own custom metrics, which can then be visualized. We'll talk about the monitoring agent that you need to install on VMs, which gives you additional insights into the applications that you're running on those VMs. Stackdriver allows you to configure uptime checks and alerts, so that you're informed if any of your VMs or applications go down. We'll also see how you can set up dashboards to monitor important metrics and share these dashboards with your team.
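To make the idea of an alert concrete, here is a minimal sketch of how a threshold condition with a duration can be evaluated: the condition counts as violated only when the metric stays above the threshold continuously for the whole duration. This is an illustration in plain Python, not Stackdriver's implementation; `ThresholdCondition` and `is_violated` are hypothetical names, and the sample values are made up.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ThresholdCondition:
    """Hypothetical stand-in for an alerting condition (not a Stackdriver type)."""
    threshold: float  # violated while the value is strictly above this
    duration_s: int   # the breach must last at least this many seconds


def is_violated(points: List[Tuple[int, float]], cond: ThresholdCondition) -> bool:
    """Return True if the series breaches the threshold for the full duration.

    points: (unix_seconds, value) pairs sorted oldest first.
    """
    breach_start: Optional[int] = None
    for ts, value in points:
        if value > cond.threshold:
            if breach_start is None:
                breach_start = ts  # first point of a new breach
            if ts - breach_start >= cond.duration_s:
                return True
        else:
            breach_start = None  # a healthy point resets the clock
    return False


# Made-up CPU utilization samples, one per minute.
cpu = [(0, 0.5), (60, 0.9), (120, 0.95), (180, 0.92), (240, 0.4)]
print(is_violated(cpu, ThresholdCondition(threshold=0.8, duration_s=120)))
```

Requiring a duration, rather than firing on a single high sample, is a common way to avoid alerting on short spikes.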

Working with Advanced Monitoring Features
Welcome to this module where we'll see how we can work with Advanced Monitoring Features offered by Stackdriver. Stackdriver Monitoring allows you to monitor a whole host of third-party applications: Elasticsearch, Redis, RabbitMQ, Apache Web Server, and the NGINX server all have plugins that you can configure to pipe metrics to Stackdriver. If, like the Spikey Sales engineers, you're moving from an on-premises data center to the cloud, it's possible that you're already tracking custom metrics in your on-premises DC. You can feed these custom metrics to Stackdriver using its APIs and track all of your metrics in the same place. If you have multiple resources that need to be monitored as a single entity, such as your Hadoop cluster or a group of VMs that serve your website traffic, you can do so using resource groups. In this module, we'll also dive deeper into configuring and managing different alerting policies. We'll see how you can explore and work with incidents and events that occur with the resources that you're tracking using Stackdriver. We'll also see how Stackdriver integrates with other monitoring platforms.
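As a rough illustration of what feeding a custom metric through the API involves, the sketch below builds a request body as a plain dictionary, modeled on the JSON shape that the Monitoring API v3 `timeSeries.create` method accepts. It makes no network calls. The helper name `build_time_series_body`, the metric name, and the project, instance, and zone values are all made up for the example, so check the field names against the API reference before relying on them.

```python
import json
from datetime import datetime, timezone


def build_time_series_body(metric_name, value, project_id, instance_id, zone, end_time):
    """Build a dict shaped like a timeSeries.create request body (hypothetical helper).

    end_time must be a timezone-aware datetime.
    """
    end_utc = end_time.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "timeSeries": [
            {
                # Custom metric types live under the custom.googleapis.com/ prefix.
                "metric": {"type": f"custom.googleapis.com/{metric_name}"},
                # The monitored resource the data point is attributed to.
                "resource": {
                    "type": "gce_instance",
                    "labels": {
                        "project_id": project_id,
                        "instance_id": instance_id,
                        "zone": zone,
                    },
                },
                # A single gauge-style point with only an end time.
                "points": [
                    {
                        "interval": {"endTime": end_utc},
                        "value": {"doubleValue": float(value)},
                    }
                ],
            }
        ]
    }


# All identifiers below are placeholders, not real resources.
body = build_time_series_body(
    metric_name="spikey/orders_per_minute",
    value=42,
    project_id="my-project",
    instance_id="1234567890",
    zone="us-central1-a",
    end_time=datetime(2019, 1, 16, 12, 0, 0, tzinfo=timezone.utc),
)
print(json.dumps(body, indent=2))
```

In practice a client library would normally build and send this request for you; the dictionary form is used here only to show which pieces of information a custom data point carries.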

Monitoring Resources Using Cloud Datalab
Hi, and welcome to this module where we'll see how we can monitor our resources programmatically using Cloud Datalab. We've used the Google Cloud Monitoring client library in Python earlier, but that was to create a custom metric. In this module, we'll see how we can use these client libraries to programmatically access metrics and visualize them. We'll use Datalab to write our monitoring code. Datalab is a hosted Jupyter Notebook available on a GCP VM, which comes preinstalled and preconfigured with all of the tools that you'll need to explore and visualize your data. We'll access the Stackdriver API using a Python client to retrieve monitoring information, and we'll visualize our metrics using Matplotlib.
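Before plotting retrieved metrics, it often helps to align raw points into fixed time windows. The sketch below averages points per window in plain Python, similar in spirit to requesting a mean aligner with an alignment period from the API. It does not call the Stackdriver API; `align_mean` is a hypothetical helper and the sample points are made up.

```python
from collections import defaultdict
from statistics import mean


def align_mean(points, period_s):
    """Average (unix_seconds, value) points within fixed windows of period_s seconds.

    Returns (window_start, mean_value) pairs sorted by window start.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        window_start = ts - ts % period_s  # floor the timestamp to its window
        buckets[window_start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Made-up samples at irregular intervals.
raw = [(0, 1.0), (30, 3.0), (60, 5.0), (90, 7.0), (150, 2.0)]
aligned = align_mean(raw, period_s=60)
print(aligned)
```

The resulting pairs can be unpacked into x and y lists and handed to a plotting library such as Matplotlib.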