- Lab
- A Cloud Guru
Troubleshooting GKE Deployments
You have taken over as your company's top GKE deployment wizard, but your predecessor has left you only very brief notes on how to deploy any of the company's required applications and workloads. Through this lab, you will run some basic deployments on GKE and then troubleshoot the inevitable errors that occur, including `ErrImagePull` and `CrashLoopBackOff`.
-
Challenge
Create and Connect to a GKE Cluster
- From the GCP menu, select Kubernetes Engine.
- Wait for the API to be enabled. Then click Create cluster.
- Under Node Pools on the left, click default-pool.
- Under Size, change the number of nodes to "1".
- In Node Pools > default-pool, click Nodes and change the machine type to e2-small.
- Click Create to create the cluster. After a few minutes you will see a green tick that shows that your cluster is ready.
- Click Connect next to your cluster. Then under Command-line access, click Run in Cloud Shell.
- When the Cloud Shell terminal has spawned, hit Return to run the command and click Authorize when prompted. The rest of this lab's objectives will be completed in the Cloud Shell terminal using `kubectl`.
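For readers who prefer the command line, the console steps above can be approximated with `gcloud`. This is only a sketch: the cluster name `lab-cluster` and the zone `us-central1-a` are hypothetical placeholders that do not come from the lab, and the commands are wrapped in functions so nothing runs until you call them.

```shell
# Hypothetical defaults: override CLUSTER_NAME and ZONE to match your project.
CLUSTER_NAME="${CLUSTER_NAME:-lab-cluster}"
ZONE="${ZONE:-us-central1-a}"

# Create a one-node cluster that uses the e2-small machine type.
create_cluster() {
  gcloud container clusters create "$CLUSTER_NAME" \
    --zone "$ZONE" \
    --num-nodes 1 \
    --machine-type e2-small
}

# Fetch credentials so that kubectl talks to the new cluster.
connect_cluster() {
  gcloud container clusters get-credentials "$CLUSTER_NAME" --zone "$ZONE"
}

# Usage: create_cluster && connect_cluster && kubectl get nodes
```

The Connect button in the console generates a `get-credentials` command of the same shape, so either route should leave `kubectl` pointed at the cluster.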
-
Challenge
Solve a CrashLoopBackOff Problem
Your predecessor left only these instructions to run a MySQL Pod using `kubectl`:
kubectl run mysql --image=mysql
After a minute or so, if you check the Pod status, it will show an unhealthy Pod.
What could be the cause of the problem?
If you check the Pod logs, you will see that the `mysql` container requires at least one environment variable in order to start up successfully.
Delete the Pod and re-create it with a password for the MySQL server.
When you check the Pod status, it should now show the Pod in the `Running` state.
Delete this Pod before continuing to the next objective.
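One way to work through this objective is sketched below. It assumes the official `mysql` image, which accepts a root password through the `MYSQL_ROOT_PASSWORD` environment variable; the password itself is a placeholder you supply. The function names are illustrative and not part of the lab.

```shell
# Show the Pod status, then the container logs that explain the failure.
check_mysql() {
  kubectl get pod mysql
  kubectl logs mysql
}

# Delete the broken Pod and re-create it with a root password.
# $1 is the password to set; choose your own value.
recreate_mysql() {
  kubectl delete pod mysql
  kubectl run mysql --image=mysql --env="MYSQL_ROOT_PASSWORD=$1"
}

# Usage: check_mysql; recreate_mysql 'choose-a-password'; kubectl get pod mysql
# Clean up before the next objective: kubectl delete pod mysql
```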
-
Challenge
Solve an ErrImagePull Problem
Your predecessor left you a note saying that you should only use the latest cutting-edge version 3.0 of the NGINX web server. You suspect he is playing a joke on you. Nevertheless, you should attempt to create the Pod.
Quite quickly, if you check the Pod status, it will show that this Pod can't run due to an `ErrImagePull` error. There is no version 3.0 of NGINX, so there is no `nginx:3.0` container image to pull.
You can fix this by editing the Pod and correcting the image to `nginx:latest` (at `spec: containers: - image:`).
To leave the editor, hit Esc to exit edit mode. Then save and quit the editor.
Your NGINX Pod should shortly be up and running.
Delete this Pod before continuing to the next objective.
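The steps for this objective might look like the sketch below. The Pod name `nginx` is an assumption, since the lab does not name the Pod, and `kubectl set image` is shown as a non-interactive alternative to editing the Pod by hand with `kubectl edit pod nginx`.

```shell
# Create the Pod with the non-existent tag from the predecessor's note.
create_bad_nginx() {
  kubectl run nginx --image=nginx:3.0
}

# Inspect the status and events; the Events section of describe reports the failed pull.
inspect_nginx() {
  kubectl get pod nginx
  kubectl describe pod nginx
}

# Correct the image in place. kubectl run names the container after the Pod,
# so the container to update is also called "nginx".
fix_nginx_image() {
  kubectl set image pod/nginx nginx=nginx:latest
}

# Usage: create_bad_nginx; inspect_nginx; fix_nginx_image; kubectl get pod nginx
# Clean up before the next objective: kubectl delete pod nginx
```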
-
Challenge
Experience a Pending Pod Problem
For this step, we need to download the YAML file for our deployment. Run the following command to download the file:
curl -O https://raw.githubusercontent.com/ACloudGuru-Resources/Google-Cloud-Professional-Cloud-Developer/main/labs/troubleshooting-gke/nginx-deployment.yml
Using the downloaded deployment file, you first need to create an NGINX deployment and then scale up the deployment to 5 replicas.
After a minute or so, check on the replicas. You'll see that one or two of the Pods are stuck in a `Pending` state.
First, check to see if the deployment has run successfully. If there are no errors in the deployment event log, go back to the Pod list and pick a Pod in a `Pending` state to describe.
In the event log, you'll see why this Pod cannot be scheduled. There simply isn't enough available CPU in your cluster to schedule the extra Pods. Looks like your predecessor was also joking about the recommended cluster sizing!
Increasing the node size or adding more nodes to the cluster will fix this problem.
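A possible command sequence for this objective is sketched below. It scales by manifest file so that the deployment name, which is defined inside the downloaded file and not stated in this lab, does not have to be guessed. The function names, and the cluster name, zone, and node count passed to `add_nodes`, are illustrative assumptions.

```shell
MANIFEST="nginx-deployment.yml"

# Create the deployment from the downloaded file, then scale it to 5 replicas.
deploy_and_scale() {
  kubectl apply -f "$MANIFEST" && kubectl scale --replicas=5 -f "$MANIFEST"
}

# Show the deployment's details and event log.
check_deployment() {
  kubectl describe -f "$MANIFEST"
}

# List only the Pods that the scheduler has not been able to place.
list_pending() {
  kubectl get pods --field-selector=status.phase=Pending
}

# Describe one Pending Pod ($1 is its name); the Events section gives the reason.
why_pending() {
  kubectl describe pod "$1"
}

# One possible fix: resize the cluster. $1 = cluster name, $2 = zone, $3 = node count.
add_nodes() {
  gcloud container clusters resize "$1" --zone "$2" --num-nodes "$3"
}

# Usage: deploy_and_scale; list_pending; why_pending <pod-name>
```

After a resize, re-running `list_pending` should eventually return no Pods once the scheduler has placed the remaining replicas.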