A Guide on Troubleshooting Kubernetes Deployments

Troubleshooting in Kubernetes can be a daunting task if you don't know where to start. In this Troubleshooting Kubernetes tutorial you will learn how to diagnose problems in Pods, Services and Ingress. Why is the Pod pending? And why is it Running but can't receive any traffic?

TL;DR: here's a diagram to help you debug your deployments in Kubernetes (and you can download it in the PDF version here).

When you wish to deploy an application in Kubernetes, you usually define three components:

  • a Deployment — which is a recipe for creating copies of your application called Pods
  • a Service — an internal load balancer that routes the traffic to Pods
  • an Ingress — a description of how the traffic should flow from outside the cluster to your Service.

Here's a quick visual recap.

In Kubernetes your applications are exposed through two layers of load balancers: internal and external.


The internal load balancer is called Service, whereas the external one is called Ingress.


Pods are not deployed directly. Instead, the Deployment creates the Pods and watches over them.

Assuming you wish to deploy a simple Hello World application, the YAML for such an application should look similar to this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
  labels:
    track: canary
spec:
  selector:
    matchLabels:
      any-name: my-app
  template:
    metadata:
      labels:
        any-name: my-app
    spec:
      containers:
      - name: cont1
        image: learnk8s/app:1.0.0
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    any-name: my-app
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-ingress
spec:
  rules:
  - http:
      paths:
      - backend:
          serviceName: my-service
          servicePort: 80
        path: /

The definition is quite long, and it's easy to overlook how the components relate to each other.

For example:

  • When should you use port 80 and when port 8080?
  • Should you create a new port for every Service so that they don't clash?
  • Do label names matter? Should they be the same everywhere?

Before focusing on the debugging, let's recap how the three components link to each other.

Let's start with Deployment and Service.

Connecting Deployment and Service

The surprising news is that Service and Deployment aren't connected at all.

Instead, the Service points to the Pods directly and skips the Deployment altogether.

So what you should pay attention to is how the Pods and the Service are related to each other.

You should remember three things:

  1. The Service selector should match the labels of the Pod: every key/value pair in the selector must be present on the Pod, which may also carry extra labels
  2. The Service targetPort should match the containerPort of the container inside the Pod
  3. The Service port can be any number. Multiple Services can use the same port because they have different IP addresses assigned.
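If you'd like to check the second rule from the command line, the comparison can be sketched as a small shell function. This is only an illustration: target_port_matches is a hypothetical helper name, and the kubectl lines in the comment assume the Service name and label used in this guide, a Service with a single port, and a numeric (not named) targetPort.

```shell
# Succeeds when the Service targetPort appears among the Pod's containerPorts.
# $1: targetPort of the Service; remaining arguments: containerPorts of the Pod.
target_port_matches() {
  target="$1"
  shift
  for port in "$@"; do
    if [ "$port" = "$target" ]; then
      return 0
    fi
  done
  return 1
}

# Hypothetical usage against a live cluster:
#   target=$(kubectl get service my-service \
#     -o jsonpath='{.spec.ports[0].targetPort}')
#   ports=$(kubectl get pods -l any-name=my-app \
#     -o jsonpath='{.items[0].spec.containers[*].ports[*].containerPort}')
#   target_port_matches "$target" $ports && echo "ports match" || echo "ports differ"
```

The function only compares values, so you can also call it with numbers read straight from your YAML.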

The following diagram summarises how to connect the ports:

Consider the following Pod exposed by a Service.


When you create a Pod, you should define the port containerPort for each container in your Pods.


When you create a Service, you can define a port and a targetPort. But which one should you connect to the container?


targetPort and containerPort should always match.


If your container exposes port 3000, then the targetPort should match that number.

If you look at the YAML, the labels and ports/targetPort should match:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
  labels:
    track: canary
spec:
  selector:
    matchLabels:
      any-name: my-app
  template:
    metadata:
      labels:
        any-name: my-app
    spec:
      containers:
      - name: cont1
        image: learnk8s/app:1.0.0
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    any-name: my-app

What about the track: canary label at the top of the Deployment?

Should that match too?

That label belongs to the Deployment, and it's not used by the Service's selector to route traffic.

In other words, you can safely remove it or assign it a different value.

And what about the matchLabels selector?

It always has to match the Pod labels and it's used by the Deployment to track the Pods.

Assuming that you made the correct change, how do you test it?

You can check if the Pods have the right label with the following command:

kubectl get pods --show-labels

Or if you have Pods belonging to several applications:

kubectl get pods --selector any-name=my-app --show-labels

Where any-name=my-app is the label any-name: my-app.
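When you want to see the matching Pod names only, the label column can be filtered with awk. The following is a rough sketch: pods_with_label is a hypothetical helper name, and it assumes the labels are the last, comma-separated column, as printed by --show-labels.

```shell
# Prints the names of the Pods that carry an exact key=value label.
# Reads the output of `kubectl get pods --show-labels --no-headers` on stdin.
pods_with_label() {
  awk -v wanted="$1" '{
    # The last column holds the labels, separated by commas.
    n = split($NF, labels, ",")
    for (i = 1; i <= n; i++) {
      if (labels[i] == wanted) { print $1; break }
    }
  }'
}

# Hypothetical usage:
#   kubectl get pods --show-labels --no-headers | pods_with_label any-name=my-app
```

Unlike a plain grep, the comparison is exact, so any-name=my-app does not match a Pod labelled any-name=my-app-v2.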

Still having issues?

You can also test the connection directly!

You can use the port-forward command in kubectl to connect to the Service and test the connection.

kubectl port-forward service/<service name> 3000:80

Where:

  • service/<service name> is the name of the service, which in the current YAML is "my-service"
  • 3000 is the port that you wish to open on your computer
  • 80 is the port exposed by the Service in the port field

If you can connect, the setup is correct.

If you can't, you most likely misplaced a label or the port doesn't match.
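The whole check (open the tunnel, send a request, close the tunnel) can be wrapped in one function. Treat this as a sketch under a few assumptions: probe_service is a hypothetical helper name, curl is installed on your computer, the app answers plain HTTP on /, and two seconds is long enough for the tunnel to come up.

```shell
# Forwards a local port to a Service, sends one HTTP request, then closes the tunnel.
# $1: Service name, $2: local port (default 3000), $3: Service port (default 80).
probe_service() {
  service="$1"
  local_port="${2:-3000}"
  service_port="${3:-80}"

  # Start the tunnel in the background and remember its process ID.
  kubectl port-forward "service/$service" "$local_port:$service_port" >/dev/null 2>&1 &
  forward_pid=$!

  # Assumption: two seconds is enough for the tunnel to come up.
  sleep 2

  # Print only the HTTP status code; curl prints 000 when it cannot connect.
  result=0
  curl --silent --output /dev/null --write-out '%{http_code}\n' \
    "http://localhost:$local_port/" || result=$?

  # Close the tunnel and pass on curl's exit status.
  kill "$forward_pid" 2>/dev/null || true
  wait "$forward_pid" 2>/dev/null || true
  return "$result"
}

# Hypothetical usage:
#   probe_service my-service 3000 80
```

A status code such as 200 means the label and port wiring is fine. 000 points back to the selector or the targetPort.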

Connecting Service and Ingress

The next step in exposing your app is to configure the Ingress.

The Ingress has to know how to retrieve the Service to then retrieve the Pods and route traffic to them.

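The Ingress retrieves the right Service by name and port exposed. The examples here use the networking.k8s.io/v1beta1 API, which Kubernetes stopped serving in version 1.22; in networking.k8s.io/v1 the same two values are written as service.name and service.port.number under backend, and each path also needs a pathType.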

Two things should match in the Ingress and Service:

  1. The servicePort of the Ingress should match the port of the Service
  2. The serviceName of the Ingress should match the name of the Service

The following diagram summarises how to connect the ports:

You already know that the Service exposes a port.


The Ingress has a field called servicePort.


The Service port and the Ingress servicePort should always match.


If you decide to assign port 80 to the Service, you should change servicePort to 80 too.

In practice, you should look at these lines:

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  ports:
  - port: 80
    targetPort: 8080
  selector:
    any-name: my-app
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-ingress
spec:
  rules:
  - http:
      paths:
      - backend:
          serviceName: my-service
          servicePort: 80
        path: /

How do you test that the Ingress works?

You can use the same strategy as before with kubectl port-forward, but instead of connecting to a service, you should connect to the Ingress controller.

First, retrieve the Pod name for the Ingress controller with:

kubectl get pods --all-namespaces
NAMESPACE   NAME                              READY STATUS
kube-system coredns-5644d7b6d9-jn7cq          1/1   Running
kube-system etcd-minikube                     1/1   Running
kube-system kube-apiserver-minikube           1/1   Running
kube-system kube-controller-manager-minikube  1/1   Running
kube-system kube-proxy-zvf2h                  1/1   Running
kube-system kube-scheduler-minikube           1/1   Running
kube-system nginx-ingress-controller-6fc5bcc  1/1   Running

Identify the Ingress Pod (which might be in a different Namespace) and describe it to retrieve the port:

kubectl describe pod nginx-ingress-controller-6fc5bcc \
 --namespace kube-system \
 | grep Ports
Ports:         80/TCP, 443/TCP, 18080/TCP

Finally, connect to the Pod:

kubectl port-forward nginx-ingress-controller-6fc5bcc 3000:80 --namespace kube-system

At this point, every time you visit port 3000 on your computer, the request is forwarded to port 80 on the Ingress controller Pod.

If you visit http://localhost:3000, you should find the app serving a web page.

Recap on ports

Here's a quick recap on what ports and labels should match:

  1. The Service selector should match the label of the Pod
  2. The Service targetPort should match the containerPort of the container inside the Pod
  3. The Service port can be any number. Multiple Services can use the same port because they have different IP addresses assigned.
  4. The servicePort of the Ingress should match the port in the Service
  5. The name of the Service should match the field serviceName in the Ingress

Knowing how to structure your YAML definition is only part of the story.

What happens when something goes wrong?

Perhaps the Pod doesn't start, or it's crashing.

3 steps to troubleshoot Kubernetes deployments

It's essential to have a well defined mental model of how Kubernetes works before diving into debugging a broken deployment.

Since there are three components in every deployment, you should debug all of them in order, starting from the bottom.

  1. You should make sure that your Pods are running, then
  2. Focus on getting the Service to route traffic to the Pods and then
  3. Check that the Ingress is correctly configured

You should start troubleshooting your deployments from the bottom. First, check that the Pod is Ready and Running.


If the Pods are Ready, you should investigate if the Service can distribute traffic to the Pods.


Finally, you should examine the connection between the Service and the Ingress.

1. Troubleshooting Pods

Most of the time, the issue is in the Pod itself.

You should make sure that your Pods are Running and Ready.

How do you check that?

kubectl get pods
NAME                    READY STATUS            RESTARTS  AGE
app1                    0/1   ImagePullBackOff  0         47h
app2                    0/1   Error             0         47h
app3-76f9fcd46b-xbv4k   1/1   Running           1         47h

In the above session, the last Pod is Running and Ready — however, the first two Pods are neither Running nor Ready.
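On a busy cluster that list can get long. As a rough sketch, you could filter it with awk so that only the problematic Pods remain. The helper name not_ready_pods is hypothetical, and the column order (NAME, READY, STATUS) is assumed to match the output above.

```shell
# Prints the names of the Pods that are not Running or not fully Ready.
# Reads the output of `kubectl get pods --no-headers` on stdin.
not_ready_pods() {
  awk '{
    # READY looks like "1/1": ready containers over total containers.
    split($2, ready, "/")
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Hypothetical usage:
#   kubectl get pods --no-headers | not_ready_pods
```

Fed with the three Pods above, it prints app1 and app2. Note that Pods that finished successfully, such as those created by Jobs, would be listed as well because their status is Completed.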

How do you investigate what went wrong?

There are four useful commands to troubleshoot Pods:

  1. kubectl logs <pod name> is helpful to retrieve the logs of the containers of the Pod
  2. kubectl describe pod <pod name> is useful to retrieve a list of events associated with the Pod
  3. kubectl get pod <pod name> -o yaml is useful to extract the YAML definition of the Pod as stored in Kubernetes
  4. kubectl exec -ti <pod name> -- bash is useful to run an interactive command within one of the containers of the Pod

Which one should you use?

There isn't a one-size-fits-all.

Instead, you should use a combination of them.
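One way to combine them is to collect everything in a single pass and read it top to bottom. The function below is a minimal sketch: pod_diagnostics is a hypothetical name, and it assumes a Pod in the current namespace. It leaves out kubectl exec because that command is interactive.

```shell
# Prints events, logs and the stored definition of one Pod.
# $1: name of the Pod in the current namespace.
pod_diagnostics() {
  pod="$1"

  echo "### events and status"
  kubectl describe pod "$pod"

  echo "### logs of the current container"
  kubectl logs "$pod" || true

  # Fails harmlessly when the container never restarted.
  echo "### logs of the previous container"
  kubectl logs "$pod" --previous || true

  echo "### definition as stored in Kubernetes"
  kubectl get pod "$pod" -o yaml
}

# Hypothetical usage:
#   pod_diagnostics app1 > app1-diagnostics.txt
```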

Common Pod errors

Pods can have startup and runtime errors.

Startup errors include:

  • ImagePullBackOff
  • ImageInspectError
  • ErrImagePull
  • ErrImageNeverPull
  • RegistryUnavailable
  • InvalidImageName

Runtime errors include:

  • CrashLoopBackOff
  • RunContainerError
  • KillContainerError
  • VerifyNonRootError
  • RunInitContainerError
  • CreatePodSandboxError
  • ConfigPodSandboxError
  • KillPodSandboxError
  • SetupNetworkError
  • TeardownNetworkError

Some errors are more common than others.

The following is a list of the most common errors and how you can fix them.

ImagePullBackOff

This error appears when Kubernetes isn't able to retrieve the image for one of the containers of the Pod.

There are three common culprits:

  1. The image name is invalid — as an example, you misspelt the name, or the image does not exist
  2. You specified a non-existing tag for the image
  3. The image that you're trying to retrieve belongs to a private registry, and Kubernetes doesn't have credentials to access it

The first two cases can be solved by correcting the image name and tag.

For the last, you should add the credentials to your private registry in a Secret and reference it in your Pods.

The official documentation has an example of how you could do that.
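As a rough idea of what the first half looks like, the Secret can be created with kubectl create secret docker-registry. The wrapper below is a sketch: create_registry_secret and the values in the usage comment are hypothetical placeholders, and passing the password as an argument leaves it in your shell history.

```shell
# Creates a registry credential Secret in the current namespace.
# $1: Secret name, $2: registry server, $3: username, $4: password.
create_registry_secret() {
  kubectl create secret docker-registry "$1" \
    --docker-server="$2" \
    --docker-username="$3" \
    --docker-password="$4"
}

# Hypothetical usage (all values are placeholders):
#   create_registry_secret regcred registry.example.com alice 's3cret'
```

The Pod then references the Secret by name in the imagePullSecrets field of its spec.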

CrashLoopBackOff

If the container can't start, then Kubernetes shows the CrashLoopBackOff message as a status.

Usually, a container can't start when:

  1. There's an error in the application that prevents it from starting
  2. You misconfigured the container
  3. The Liveness probe failed too many times

You should try and retrieve the logs from that container to investigate why it failed.

If you can't see the logs because your container is restarting too quickly, you can use the following command:

kubectl logs <pod-name> --previous

Which prints the error messages from the previous container.
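The exit code of the previous container is another useful hint. By convention, an exit code above 128 means the process was killed by a signal (the code minus 128). For example, 137 is signal 9 (SIGKILL), which is what you see when a container is killed for exceeding its memory limit. The helper below is a small sketch of that arithmetic. The name explain_exit_code is hypothetical, and the kubectl line in the comment assumes a Pod with a single container.

```shell
# Translates a container exit code into a short explanation.
# $1: numeric exit code of the terminated container.
explain_exit_code() {
  code="$1"
  if [ "$code" -eq 0 ]; then
    echo "exited without an error"
  elif [ "$code" -gt 128 ]; then
    # Codes above 128 encode the signal number as code - 128.
    echo "killed by signal $((code - 128))"
  else
    echo "exited with error code $code"
  fi
}

# Hypothetical usage:
#   code=$(kubectl get pod <pod-name> \
#     -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}')
#   explain_exit_code "$code"
```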

RunContainerError

The error appears when the container is unable to start.

That's even before the application inside the container starts.

The issue is usually due to misconfiguration such as:

  • mounting a non-existent volume such as a ConfigMap or a Secret
  • mounting a read-only volume as read-write

You should use kubectl describe pod <pod-name> to collect and analyse the error.

Pods in a Pending state

When you create a Pod, the Pod stays in the Pending state.

Why?

Assuming that your scheduler component is running fine, here are the causes:

  1. The cluster doesn't have enough resources such as CPU and memory to run the Pod
  2. The current Namespace has a ResourceQuota object and creating the Pod will make the Namespace go over the quota
  3. The Pod is bound to a Pending PersistentVolumeClaim

Your best option is to inspect the Events section in the kubectl describe command:

kubectl describe pod <pod name>

For errors that are created as a result of ResourceQuotas, you can inspect the events of the cluster with:

kubectl get events --sort-by=.metadata.creationTimestamp
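If several Pods are stuck, you can narrow the events down to the Pending ones. This is a sketch under stated assumptions: pending_pod_events is a hypothetical helper name, and it only looks at the current namespace.

```shell
# Prints the events of every Pod that is currently Pending.
pending_pod_events() {
  kubectl get pods --field-selector=status.phase=Pending -o name |
  while read -r pod; do
    # "-o name" prints "pod/<name>"; strip the prefix.
    name="${pod#pod/}"
    echo "== $name"
    kubectl get events \
      --field-selector "involvedObject.kind=Pod,involvedObject.name=$name" \
      --sort-by=.metadata.creationTimestamp
  done
}

# Hypothetical usage:
#   pending_pod_events
```

The messages in those events usually say which of the causes above applies.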

Pods in a not Ready state

If a Pod is Running but not Ready it means that the Readiness probe is failing.

When the Readiness probe is failing, the Pod isn't attached to the Service, and no traffic is forwarded to that instance.

A failing Readiness probe is an application-specific error, so you should inspect the Events section in kubectl describe to identify the error.
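To confirm that readiness is the problem before reading the events, you can query the Ready condition of the Pod. The two functions below are a sketch, and their names, pod_ready_status and explain_readiness, are hypothetical.

```shell
# Prints "True" or "False" depending on the Ready condition of the Pod.
# $1: name of the Pod in the current namespace.
pod_ready_status() {
  kubectl get pod "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
}

# Turns the condition into a one-line explanation.
explain_readiness() {
  status=$(pod_ready_status "$1")
  if [ "$status" = "True" ]; then
    echo "$1 is Ready and can receive traffic from the Service"
  else
    echo "$1 is not Ready; inspect the probe events with: kubectl describe pod $1"
  fi
}

# Hypothetical usage:
#   explain_readiness app3-76f9fcd46b-xbv4k
```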

2. Troubleshooting Services

If your Pods are Running and Ready, but you're still unable to receive a response from your app, you should check if the Service is configured correctly.

Services are designed to route the traffic to Pods based on their labels.

So the first thing that you should check is how many Pods are targeted by the Service.

You can do so by checking the Endpoints in the Service:

kubectl describe service <service-name> | grep Endpoints

An endpoint is an <ip address:port> pair, and there should be at least one whenever the Service targets at least one Pod.

If the "Endpoints" section is empty, there are two explanations:

  1. You don't have any Pod running with the correct label (hint: you should check if you are in the right namespace)
  2. You have a typo in the selector labels of the Service

If you see a list of endpoints, but still can't access your application, then the targetPort in your Service is the likely culprit.
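If you prefer a number over a list, you can count the addresses in the Endpoints object, which has the same name as the Service. The helper below is a minimal sketch, and service_endpoint_count is a hypothetical name.

```shell
# Prints how many ready Pod IPs back the given Service.
# $1: name of the Service in the current namespace.
service_endpoint_count() {
  kubectl get endpoints "$1" \
    -o jsonpath='{.subsets[*].addresses[*].ip}' | wc -w | tr -d ' '
}

# Hypothetical usage:
#   service_endpoint_count my-service
```

A result of 0 sends you back to the labels and the namespace. Anything higher suggests that the selector is fine and the port is the next thing to check.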

How do you test the Service?

Regardless of the type of Service, you can use kubectl port-forward to connect to it:

kubectl port-forward service/<service-name> 3000:80

Where:

  • <service-name> is the name of the Service
  • 3000 is the port that you wish to open on your computer
  • 80 is the port exposed by the Service

3. Troubleshooting Ingress

If you've reached this section, then:

  • the Pods are Running and Ready
  • the Service distributes the traffic to the Pod

But you still can't see a response from your app.

It means that most likely, the Ingress is misconfigured.

Since the Ingress controller being used is a third-party component in the cluster, there are different debugging techniques depending on the type of Ingress controller.

But before diving into Ingress specific tools, there's something straightforward that you could check.

The Ingress uses the serviceName and servicePort to connect to the Service.

You should check that those are correctly configured.

You can inspect that the Ingress is correctly configured with:

kubectl describe ingress <ingress-name>

If the Backend column is empty, then there must be an error in the configuration.

If you can see the endpoints in the Backend column, but still can't access the application, the issue is likely to be:

  • how you exposed your Ingress to the public internet
  • how you exposed your cluster to the public internet

You can isolate infrastructure issues from Ingress by connecting to the Ingress Pod directly.

First, retrieve the Pod for your Ingress controller (which could be located in a different namespace):

kubectl get pods --all-namespaces
NAMESPACE   NAME                              READY STATUS
kube-system coredns-5644d7b6d9-jn7cq          1/1   Running
kube-system etcd-minikube                     1/1   Running
kube-system kube-apiserver-minikube           1/1   Running
kube-system kube-controller-manager-minikube  1/1   Running
kube-system kube-proxy-zvf2h                  1/1   Running
kube-system kube-scheduler-minikube           1/1   Running
kube-system nginx-ingress-controller-6fc5bcc  1/1   Running

Describe it to retrieve the port:

kubectl describe pod nginx-ingress-controller-6fc5bcc \
 --namespace kube-system \
 | grep Ports

Finally, connect to the Pod:

kubectl port-forward nginx-ingress-controller-6fc5bcc 3000:80 --namespace kube-system

At this point, every time you visit port 3000 on your computer, the request is forwarded to port 80 on the Pod.

Does it work now?

  • If it works, the issue is in the infrastructure. You should investigate how the traffic is routed to your cluster.
  • If it doesn't work, the problem is in the Ingress controller. You should debug the Ingress.

If you still can't get the Ingress controller to work, you should start debugging it.

There are many different versions of Ingress controllers.

Popular options include Nginx, HAProxy, Traefik, etc.

You should consult the documentation of your Ingress controller to find a troubleshooting guide.

Since Ingress Nginx is the most popular Ingress controller, we included a few tips for it in the next section.

Debugging Ingress Nginx

The Ingress-nginx project has an official plugin for kubectl.

You can use kubectl ingress-nginx to:

  • inspect logs, backends, certs, etc.
  • connect to the Ingress
  • examine the current configuration

The three commands that you should try are:

  • kubectl ingress-nginx lint, which inspects your Kubernetes resources for possible issues
  • kubectl ingress-nginx backends, to inspect the backends (similar to kubectl describe ingress <ingress-name>)
  • kubectl ingress-nginx logs, to check the logs

Please note that you might need to specify the correct namespace for your Ingress controller with --namespace <name>.

Summary

Troubleshooting in Kubernetes can be a daunting task if you don't know where to start.

You should always remember to approach the problem bottom-up: start with the Pods and move up the stack with Service and Ingress.

The same debugging techniques that you learnt in this article can be applied to other objects such as:

  • failing Jobs and CronJobs
  • StatefulSets and DaemonSets

How Kubernetes Helps to Enable DevOps

The automation and infrastructural capabilities of Kubernetes make it an ideal technological partner for DevOps. In this article, you'll see 10 ways Kubernetes enables DevOps.

DevOps was an idea before its time.

It actually took a while for technology to catch up and fully implement the principles and vision of DevOps, but that is how innovation works.

A newer tool in our toolboxes, and one that is taking up more and more mindshare, is end-to-end automation, and Kubernetes helps you manage that. Kubernetes is an open source framework for "automating deployment, scaling, and management of containerized applications." It was originally introduced by Google, and Red Hat and others have pushed it forward over the last 3-4 years.

At the 2017 All Day DevOps conference, Siamak Sadeghianfar laid out 10 ways Kubernetes enables DevOps. We thought it was worth revisiting what he had to say as Kubernetes only becomes more prominent.

So, how exactly does Kubernetes enable DevOps? Let's dive into what Siamak had to say:

1. Deployment automation.

You should automate every step of your delivery pipeline. Kubernetes automates the deployment of containers. Each component becomes a container image.

2. Infrastructure as code.

With Kubernetes your entire infrastructure is code. This means any part of your application (databases, ports, access controls, etc.) can be described in a way Kubernetes can use. For example, you store your infrastructure code in a version-control repository. Kubernetes takes the code and, based on its instructions, deploys and maintains your infrastructure. This happens automatically and consistently.

3. Configuration as code.

Kubernetes allows you to "configure as code." Traditionally, an admin had to run configuration scripts manually to make sure they get the right one. Kubernetes keeps the file in the source repo. This allows you to describe where the file needs to go in the container and how the application consumes it. Additionally, it can be version controlled.

4. Immutable infrastructure.

Since the inception of servers, there's been one problem: every time you put out a fire you change the state of the virtual machine. Eventually, you don't know exactly what the server looks like and you can't recreate it, even though you know it works. (These virtual machines are referred to as snowflakes.)

In Kubernetes, containers are immutable: when there is a problem, a new container is created from the original state instead of the running one being patched. So, you can be confident that you know exactly how the new environment is configured.

5. On-Demand Infrastructure.

Developers can create hybrid services and infrastructure on-demand from the self-service catalog. This gives control to developers to get the resources they need, yet allows operations to control the configuration of the services. Kubernetes follows open service and API standards so you can expose cloud services.

6. Environment consistency.

Build once, and deploy into production-like environments everywhere. Kubernetes allows you to build a golden image and use the exact same image for every single place you deploy your container. This gives you a consistent, production-like environment so it is the exact same whether you are on your local Windows or Mac development machine, a test server, etc. This helps you "shift left" so that you can see issues when you are in development instead of in production.

7. Continuous Delivery pipeline.

This is a series of automated steps to test code before it goes in production. Jez Humble says the role of Continuous Delivery is, "an automated process to prove to you a change is a bad change and it should not go into production." Continuous Integration (CI) means every change is tested. Continuous Delivery (CD) automates all the way to production.

How can you tell if you have fully implemented Continuous Delivery? Siamak says that the test is if someone walks in and says, "Can you go to production right now?" can you do it without breaking a sweat?

8. Zero downtime deployments.

When you have several deployments a day, you can't pull down production to deploy. You must have safe, rolling updates without disrupting the production traffic. Kubernetes helps with blue/green deployments so that you can set up a new environment and switch to the new one without downtime.

9. A/B Testing.

How can you evaluate proposed changes, such as changing copy or the color of buttons? A/B testing delivers different versions to different customers so you can get real-world test results. Kubernetes manages routing traffic to different versions. Variants of the same version with slight modifications are also possible.

10. Cross-functional collaboration.

This is shared access to environments with granular control. Kubernetes goes beyond the silos so everyone has the same set of environments, but you can grant access to different roles and allow different roles to do different things. For instance, developers may be able to push to production, while Infosec may have view access, and QA may have access to live containers, but not building or deploying. The entire team can access production, but operations is the only one who can make containers.

The growing popularity of Kubernetes is undeniable. Luckily, with these 10 capabilities, Siamak has laid out a great path for us all to follow as they continue to not only make their way into the DevOps pipeline, but in many ways enable it.

What is Kubernetes | Kubernetes Tutorial For Beginners

This video on "What is Kubernetes | Kubernetes Tutorial For Beginners" will give you an introduction to one of the most popular DevOps tools in the market - Kubernetes, and its importance in today's IT processes. This tutorial is ideal for beginners who want to get started with Kubernetes & DevOps.

The following topics are covered in this training session:

  1. Need for Kubernetes
  2. What is Kubernetes and What it's not
  3. How does Kubernetes work?
  4. Use-Case: Kubernetes @ Pokemon Go
  5. Hands-on: Deployment with Kubernetes

Building and Managing Kubernetes with Kubernetes

Kubernetes as a declarative and portable system can be used to do many things in different ways.

At eBay we built a fleet management system based on k8s. Everything (server, subnet, OS, package and state) is declarative and can be modeled as CRDs in k8s, or referred to as a commit id in git from the objects. By running various controllers on top of these CRD objects, we use k8s to manage k8s, and the entire eBay data center.

  • Our system provisions hosts the same way k8s creates and manages pods.
  • We build k8s clusters with Salt. Each host has a set of states defined in its salt CRD object. Controllers pull states from git based on commit ids to apply.
  • We build both schedulers and deployment transactions to manage the k8s clusters for both config deployments and upgrades.

This declarative, highly scalable, auto healing, and cloud native system is what we think can unify eBay’s fleet.
