Debugging Kubernetes Pods: Deep Dive

In this article, I will talk about debugging and troubleshooting Kubernetes pods using ephemeral containers.

Amr Farid
Better Programming

The simplest way to debug a pod is to exec into it and troubleshoot from the inside. This approach is simple, but it has several drawbacks:

  • The running application pods may not have all the required tools to troubleshoot an existing issue.
  • If an action requires additional permissions, you need to restart the application’s pods to grant them.
  • Adding debugging tools to the main Docker image, or elevating container permissions, introduces security risks.

So, let’s explore another way to debug pods.

Debugging with an ephemeral debug container

Ephemeral containers are useful for interactive troubleshooting when kubectl exec is insufficient: a container has crashed, a container image doesn’t include debugging utilities (as with distroless images), or the running pods don’t have the required privileges for debugging.

The main idea behind ephemeral containers is that k8s adds a new container, with a custom image of your choice, to an existing pod without restarting that pod. The new container can share several resources with the target container:

  • Linux network namespace
  • Linux process namespace
  • Access to shared volumes
  • Access to k8s node

I will give an example for each of these use cases.

Before starting the demo, you need a k8s cluster running version 1.23 or later. I recommend using kind, but you can use any other provisioner.

So let’s start by creating a cluster for our demo.

Creating kind cluster

Creating a new kind cluster is as simple as running the command kind create cluster.

Example:

Once the cluster is created, verify that it is up and accessible.

Example:
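A minimal sketch of these two steps, assuming kind and kubectl are installed and the cluster keeps kind’s default name (kind), which makes the kubeconfig context kind-kind. The function name create_demo_cluster is made up for illustration.

```shell
# Hypothetical helper: create the demo cluster and confirm it is reachable.
# Assumes kind's default cluster name "kind", so the context is "kind-kind".
create_demo_cluster() {
  kind create cluster || return 1
  # Prints the control-plane endpoint; fails if the API server is unreachable.
  kubectl cluster-info --context kind-kind || return 1
  # Lists the nodes; each one should eventually report STATUS "Ready".
  kubectl get nodes --context kind-kind
}
```

Calling create_demo_cluster runs the steps in order and stops at the first one that fails.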

All of our operations will be executed from the kind master node, so we need to access it with docker exec -it <kind-container-id> bash.

Example:

Creating simple workload

We will assume that we have an Nginx deployment that we want to debug, so let’s create an Nginx deployment with one replica by running this command:

kubectl create deployment nginx --image=nginx
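Before attaching a debugger, it can help to wait until the pod is actually running. A minimal sketch, assuming the deployment was created with kubectl create deployment (which labels its pods app=<name>); the helper name and the 120s timeout are arbitrary choices for illustration.

```shell
# Hypothetical helper: wait for a deployment to finish rolling out,
# then list its pods together with their IPs and nodes.
wait_for_deployment() {
  # Blocks until all replicas are available or the timeout expires.
  kubectl rollout status "deployment/$1" --timeout=120s || return 1
  # "kubectl create deployment <name>" labels its pods app=<name>.
  kubectl get pods -l "app=$1" -o wide
}
```

For the deployment above, wait_for_deployment nginx prints the pod name and pod IP that the later steps need.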

Troubleshooting network activity

Troubleshooting network activity requires sharing the network namespace. An ephemeral container attached to a running pod shares the pod’s network namespace by default.

Let’s create our first ephemeral container. I will use nicolaka/netshoot as the image for the new ephemeral container. This image contains many troubleshooting tools, such as tcpdump and strace.

kubectl debug -it <pod-name> --image=<ephemeral-container-image> -- <command>

Example:
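A minimal sketch of a concrete invocation, assuming the pod can be found by a label selector such as app=nginx and that the first match is the pod we want. The helper name debug_first_pod is hypothetical.

```shell
# Hypothetical helper: attach a netshoot ephemeral container to the first
# pod matching a label selector and open an interactive bash shell in it.
debug_first_pod() {
  # $1: label selector, for example app=nginx
  pod=$(kubectl get pods -l "$1" -o jsonpath='{.items[0].metadata.name}') || return 1
  kubectl debug -it "$pod" --image=nicolaka/netshoot -- bash
}
```

With the Nginx deployment from the previous section, this would be called as debug_first_pod app=nginx.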

So let’s confirm that both containers share the same Linux network namespace. Open a new shell on the master node and run this command:

systemd-cgls -u kubelet-kubepods-besteffort.slice

Example:

From the above example, we can get the main process IDs for both containers

  • 2612 -> main process ID for the ephemeral container
  • 2259 -> main process ID for the Nginx container

Now, let’s check all Linux namespaces for each of these processes
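One way to do this check is to read the links under /proc/<pid>/ns, where each link target identifies a namespace. This is a minimal sketch with hypothetical helper names; it assumes a Linux host and enough privileges (typically root on the node) to read another process’s namespace links.

```shell
# Print the namespace identifiers of a process, one per line,
# in the form "<type> -> <type>:[<inode>]" (Linux only; reads /proc).
show_namespaces() {
  for ns in /proc/"$1"/ns/*; do
    printf '%s -> %s\n' "${ns##*/}" "$(readlink "$ns")"
  done
}

# Succeed only if both processes are in the same network namespace.
same_netns() {
  a=$(readlink "/proc/$1/ns/net") &&
    b=$(readlink "/proc/$2/ns/net") &&
    [ "$a" = "$b" ]
}
```

With the process IDs listed above, same_netns 2612 2259 && echo shared prints shared when the two containers share a network namespace.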

From the previous screenshot, we found that both processes have the same Linux network namespace id.

Now let’s dump network packets for the Nginx container from the ephemeral container.

From the ephemeral container shell, run this command:

tcpdump -n port 80

Example output:

Now, try to send some requests to this pod from the k8s master node:

curl http://<pod-ip-address>
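If you don’t have the pod IP at hand, it can be read from the pod’s status. A minimal sketch, assuming kubectl and curl are both available where you run it; the helper name curl_pod is hypothetical.

```shell
# Hypothetical helper: look up a pod's IP and send one HTTP request to it.
curl_pod() {
  # $1: pod name
  ip=$(kubectl get pod "$1" -o jsonpath='{.status.podIP}') || return 1
  # -s hides the progress meter, -o /dev/null drops the response body,
  # and -w prints only the HTTP status code.
  curl -s -o /dev/null -w '%{http_code}\n' "http://$ip"
}
```

Each call to curl_pod <pod-name> should produce a burst of packets on port 80 in the tcpdump session.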

Now, if you go to the ephemeral container terminal, you will find the dump of TCP packets is printed to the output:

That completes our first demo: we can now capture the pod’s network packets from the ephemeral container.

Let’s go to the second use case.

Tracing/profiling processes using ephemeral containers

Our next use case for ephemeral containers is tracing a process running in a container from another container.

To achieve this, we will need:

  • The two containers must share the same Linux process namespace.
  • The ephemeral container must have the Linux capability SYS_PTRACE.

Sharing a Linux process namespace is easy: when creating the ephemeral container, add the argument --target=<container-name>.

kubectl debug -it <pod-name> --image=nicolaka/netshoot --target=<container-name> -- bash

Example:

As you can see from the previous screenshot:

  1. To share the process namespace, we just add the command argument --target=<container-name>.
  2. From the ephemeral container, we can see all processes running in the nginx container.
  3. We can’t trace the nginx process because the ephemeral container doesn’t have permission to make the ptrace system call. The strace command uses this system call to pause the process and record each system call that nginx sends to the kernel.

How can we fix this? Unfortunately, I didn’t find a way to pass extra permissions to the ephemeral container from the kubectl command. So we will construct an HTTP request and send it to the kube API server without using the kubectl debug command.
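A minimal sketch of such a request, not necessarily the exact one used originally. It assumes kubectl proxy --port=8001 is running in another terminal, the pod lives in the default namespace, and the target container is named nginx. The container name debug-strace and both helper names are made up for illustration. The patch goes to the pod’s ephemeralcontainers subresource as a strategic merge patch.

```shell
# Print a strategic merge patch that adds one ephemeral container
# holding the SYS_PTRACE capability.
# $1: new container name, $2: image, $3: target container name
ptrace_patch() {
  cat <<EOF
{
  "spec": {
    "ephemeralContainers": [
      {
        "name": "$1",
        "image": "$2",
        "command": ["/bin/sh"],
        "stdin": true,
        "tty": true,
        "targetContainerName": "$3",
        "securityContext": {
          "capabilities": { "add": ["SYS_PTRACE"] }
        }
      }
    ]
  }
}
EOF
}

# Send the patch through a local "kubectl proxy" (assumed on port 8001).
# $1: pod name
add_ptrace_container() {
  ptrace_patch debug-strace nicolaka/netshoot nginx |
    curl -sS -X PATCH \
      -H 'Content-Type: application/strategic-merge-patch+json' \
      --data-binary @- \
      "http://127.0.0.1:8001/api/v1/namespaces/default/pods/$1/ephemeralcontainers"
}
```

After add_ptrace_container <pod-name> succeeds, kubectl attach -it <pod-name> -c debug-strace opens a shell in the new container.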

Now you can run strace without getting a permission denied error.

For this example, I added the SYS_PTRACE capability. Which capabilities you need depends on the debugger you are using. Alternatively, you can give the ephemeral container privileged access so that you don’t need to worry about which capabilities to allow.

Another thing to mention: you can access the filesystem of the nginx container from the ephemeral container. Its root file system is under /proc/<process-id>/root.

Let’s see if we can access nginx config from the ephemeral container.

Example:
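A minimal sketch of reading a file this way, assuming the ephemeral container shares the process namespace with the nginx container and that pgrep is available in the debugging image. The helper name read_target_file is hypothetical.

```shell
# Print a file from another container's root filesystem via /proc.
# $1: PID of a process in the target container
# $2: absolute path of the file inside that container
read_target_file() {
  cat "/proc/$1/root$2"
}
```

For example, read_target_file "$(pgrep -o nginx)" /etc/nginx/nginx.conf prints the nginx configuration, where pgrep -o picks the oldest matching process (the nginx master).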

OK, let’s now move on to the last use case for ephemeral containers.

Debugging via a shell on the node

Sometimes you need access to a k8s node, but you don’t have SSH or console access to it.

You can access the node by using an ephemeral container:

kubectl debug node/<node-name> -it --image=<image-name>

When creating a debugging session on a node, keep in mind that:

  • kubectl debug automatically generates the name of the new Pod based on the name of the Node.
  • The container runs in the host IPC, Network, and PID namespaces.
  • The root filesystem of the Node will be mounted at /host.

If you want the root file system of the ephemeral container to be the same as the node’s, you just need to chroot to /host.

Example:
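A minimal sketch that combines both steps, assuming the busybox image is acceptable for the debugging pod and that the node’s root filesystem provides /bin/sh. The helper name node_shell is hypothetical.

```shell
# Hypothetical helper: start a debugging pod on a node and chroot into the
# node's root filesystem, which kubectl debug mounts at /host.
node_shell() {
  # $1: node name, as shown by "kubectl get nodes"
  kubectl debug "node/$1" -it --image=busybox -- chroot /host /bin/sh
}
```

On a kind cluster with the default name, the single node is called kind-control-plane, so the call would be node_shell kind-control-plane.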

SRE/DevOps Engineer, I write about k8s, monitoring, and microservices.