How to Troubleshoot Kubernetes Pods Stuck in Pending State

Learn how to troubleshoot Kubernetes Pods stuck in a 'Pending' state. Follow our guide to resolve the issue and get your Pods running smoothly.

Patrick Londa

Jan 19, 2022


So you’re using Kubernetes to manage your containerized services, but you’ve run into a snag: your application isn’t loading, and its Pods are stuck in a pending state. Fortunately, Kubernetes has built-in debugging tools that help you find the cause. Use this step-by-step guide to troubleshoot Kubernetes Pods stuck in a pending state.

What Does “Pending” Mean?

Kubernetes Pods are left pending if they can’t be scheduled to a node. The “kubectl describe pods” command displays messages from the scheduler explaining why a Pod can’t be scheduled.
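As a rough illustration, the same scheduler message can also be read from the JSON that “kubectl get pods -o json” returns. The sketch below assumes that output has already been loaded into a Python dictionary; the function name and the sample data are hypothetical, not taken from a real cluster.

```python
def unscheduled_pods(pod_list):
    """Return (pod name, scheduler message) for Pending Pods that were not scheduled."""
    results = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") != "Pending":
            continue
        for cond in status.get("conditions", []):
            # PodScheduled=False means the scheduler has not found a node yet.
            if cond.get("type") == "PodScheduled" and cond.get("status") == "False":
                results.append((pod["metadata"]["name"], cond.get("message", "")))
    return results


# Hypothetical sample shaped like `kubectl get pods -o json` output.
sample = {
    "items": [
        {"metadata": {"name": "api-0"}, "status": {"phase": "Running"}},
        {
            "metadata": {"name": "web-1"},
            "status": {
                "phase": "Pending",
                "conditions": [
                    {
                        "type": "PodScheduled",
                        "status": "False",
                        "reason": "Unschedulable",
                        "message": "0/3 nodes are available: 3 Insufficient cpu.",
                    }
                ],
            },
        },
    ]
}

for name, message in unscheduled_pods(sample):
    print(f"{name}: {message}")
```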

How Does a Pod Become Stuck in a “Pending” State?

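There are two common reasons for a Pod to fail to be scheduled to a node. First, it may be bound to a hostPort: each hostPort can be used by only one Pod per node, which limits the nodes where the Pod can run. Second, no node may have enough free resources (usually memory or CPU) to satisfy the Pod’s requests.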
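To make these two causes concrete, here is a deliberately simplified model of the check, not the scheduler’s actual algorithm. The Node and PodRequest classes, their fields, and the numbers are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_cpu_millicores: int
    free_memory_mib: int
    used_host_ports: set[int] = field(default_factory=set)


@dataclass
class PodRequest:
    cpu_millicores: int
    memory_mib: int
    host_ports: set[int] = field(default_factory=set)


def blocking_reasons(pod: PodRequest, node: Node) -> list[str]:
    """List why this Pod cannot be placed on this node (empty list means it fits)."""
    reasons = []
    if pod.host_ports & node.used_host_ports:
        reasons.append("hostPort already in use")
    if pod.cpu_millicores > node.free_cpu_millicores:
        reasons.append("insufficient cpu")
    if pod.memory_mib > node.free_memory_mib:
        reasons.append("insufficient memory")
    return reasons


# Hypothetical two-node cluster and one Pod that wants hostPort 8080.
nodes = [Node("node-a", 500, 1024, {8080}), Node("node-b", 100, 4096)]
pod = PodRequest(cpu_millicores=250, memory_mib=512, host_ports={8080})

report = {node.name: blocking_reasons(pod, node) for node in nodes}
# The Pod stays pending only if every node has at least one blocking reason.
stays_pending = all(report.values())
print(report, stays_pending)
```

In this made-up cluster the Pod stays pending: one node already uses the hostPort and the other lacks CPU.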


Manually Troubleshooting Pods Stuck in “Pending”

Now that you understand more about “stuck Pods”, follow these steps to manually troubleshoot a Kubernetes pod stuck in a pending state.

Step 1: Diagnosing the Issue

The first step in any kind of Kubernetes troubleshooting is to run the command:

kubectl describe pods

This command returns a detailed description of each Pod in the current namespace, including its state. In the Events section of the output, you’ll see messages from the scheduler that tell you whether the nodes have run out of CPU or memory, or whether a requested hostPort is unavailable. Insufficient resources are one of the most likely reasons for a Pod remaining in the “pending” state.
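If you need to triage many Pods, the scheduler messages can be matched against a few known fragments. This is a minimal sketch: the fragments below appear in common scheduler messages, but the exact wording can vary between Kubernetes versions, and the sample message and the function name are illustrative only.

```python
# Message fragment -> short label for the likely cause.
KNOWN_CAUSES = {
    "Insufficient cpu": "cpu",
    "Insufficient memory": "memory",
    "didn't have free ports": "hostPort",
}


def diagnose(message: str) -> list[str]:
    """Return labels for every known cause mentioned in a scheduler message."""
    return [label for fragment, label in KNOWN_CAUSES.items() if fragment in message]


# Illustrative message, similar to what appears under Events in `kubectl describe pods`.
sample_message = (
    "0/3 nodes are available: 1 node(s) didn't have free ports "
    "for the requested pod ports, 2 Insufficient cpu."
)
causes = diagnose(sample_message)
print(causes)
```

An empty result means the message did not match any of these fragments, and you should read the full event text.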

Step 2: Scale Out or Scale Up

If you have reached resource limits, then you can increase capacity by scaling out or scaling up.

You scale out by adding more worker nodes to the cluster. You can do this in a variety of ways depending on which cloud infrastructure you are using. As a starting point, here is a basic Kubernetes guide on how to add nodes to an existing cluster.

To scale up, you instead increase the memory or CPU of your existing nodes.
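Before choosing between the two, it can help to estimate how much capacity is actually missing. The sketch below converts Kubernetes-style quantities (such as “500m” CPU or “512Mi” memory) and compares a node’s allocatable amount with the existing requests plus the new Pod. It handles only a subset of suffixes, and all function names and figures are hypothetical.

```python
from __future__ import annotations

# Binary suffixes only; Kubernetes also accepts decimal ones (k, M, G), omitted here.
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '500m' or '2' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def memory_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '512Mi' or '4Gi' to bytes."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # a plain number is already in bytes


def shortfall(allocatable: int, existing: list[int], new_request: int) -> int:
    """How much is missing to fit the new request (0 if it already fits)."""
    return max(0, sum(existing) + new_request - allocatable)


# Made-up node: 2 CPUs and 4Gi allocatable, with some requests already placed.
cpu_gap = shortfall(
    cpu_millicores("2"),
    [cpu_millicores(q) for q in ["500m", "1", "250m"]],
    cpu_millicores("500m"),
)
mem_gap = shortfall(
    memory_bytes("4Gi"),
    [memory_bytes(q) for q in ["1Gi", "2Gi"]],
    memory_bytes("512Mi"),
)
print(f"CPU shortfall: {cpu_gap}m, memory shortfall: {mem_gap} bytes")
```

In this example the node is 250 millicores short on CPU while memory still fits, so either a larger node or an additional node would unblock the Pod.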

Step 3: Reduce Your Resource Requests

If you don’t want to add capacity by scaling out or scaling up, another option is to reduce your existing resource requests. You can make this change by editing the following fields in your manifest YAML file.

spec.containers[].resources.requests.cpu
spec.containers[].resources.requests.memory
spec.containers[].resources.requests.hugepages-<size>

After you apply these changes, the Pod requests fewer resources, so the scheduler is more likely to find a node where it fits.
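As one way to picture the edit, the sketch below lowers the CPU and memory requests of every container in a Pod manifest held as a Python dictionary and prints the result as JSON, which kubectl accepts as well as YAML. The manifest, the image name, and the request values are placeholders; choose values based on what your workload really needs.

```python
import copy
import json


def with_requests(manifest: dict, cpu: str, memory: str) -> dict:
    """Return a copy of a Pod manifest with new CPU and memory requests."""
    updated = copy.deepcopy(manifest)
    for container in updated["spec"]["containers"]:
        # Corresponds to spec.containers[].resources.requests.cpu / .memory
        requests = container.setdefault("resources", {}).setdefault("requests", {})
        requests["cpu"] = cpu
        requests["memory"] = memory
    return updated


# Hypothetical Pod manifest with requests that are too large for the cluster.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web-1"},
    "spec": {
        "containers": [
            {
                "name": "web",
                "image": "example/web:1.0",
                "resources": {"requests": {"cpu": "1", "memory": "2Gi"}},
            }
        ]
    },
}

reduced = with_requests(pod, cpu="250m", memory="256Mi")
print(json.dumps(reduced, indent=2))
```

If a container also sets limits, keep each request at or below its limit, or the API server will reject the manifest.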

Another option that has a similar effect is to remove unneeded deployments and resources to free up space. Cleaning up your resources is a good regular practice regardless of running into errors like this, since it can reduce costs.

Troubleshoot Faster with Blink:

You might have solved your problem quickly or ended up down a research rabbit hole. With Blink, you can manage your Kubernetes troubleshooting in one place with the common error causes listed and your next steps just a click away.

Blink Automation: Troubleshoot a Kubernetes Pod

The automation above is in the Blink library. When it runs, it gets the key details you need to troubleshoot a given Pod in a namespace.

It does the following steps:

Gets the Pod status.
Gets the Pod details.
Gets the container logs.
Gets events related to the Pod.

With one automation, you skip the kubectl commands and get the information you need to correct the error.

Get started with Blink and troubleshoot Kubernetes errors faster today.
