FailedCreatePodContainer - unable to ensure pod container exists: failed to create container
| Revision | Date | Description |
|---|---|---|
| | 24.07.2024 | Init Changelog |
Problem
During pod creation on Kubernetes, a FailedCreatePodContainer warning event appears. The event message looks like this:
unable to ensure pod container exists: failed to create container for [kubepods burstable podbb4b05d1-1506-49df-9321-cfa434373319] : mkdir /sys/fs/cgroup/memory/kubepods/burstable/podbb4b05d1-1506-49df-9321-cfa434373319: cannot allocate memory
This completely blocks pod creation on the affected Node and is related to the kubelet or the CRI service running on that Kubernetes Node.
Requirements
To fix the problem you will need:
Workstation:
kubectl installed.
Kubernetes:
Read-only access to Events on the cluster.
Permissions to run drain.
Node:
Permissions to check running services on the virtual machine.
Permissions to stop and start the kubelet service.
Permissions to stop and start the CRI service (docker, containerd, etc.).
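The Kubernetes-side requirements can be checked from the workstation before starting. The sketch below is only an illustration: `check_access` is a hypothetical helper name, and the verb/resource pairs it tests are assumptions about what reading Events and running drain need, not a list taken from this article. It relies on `kubectl auth can-i`, which prints `yes` or `no`.

```shell
# Sketch: verify the Kubernetes-side permissions listed above.
# check_access is a hypothetical helper; the verb/resource pairs below
# are assumptions and may need adjusting for your cluster's RBAC setup.
check_access() {
  missing=0
  # One "kubectl auth can-i" argument list per line of the here-document.
  while read -r args; do
    [ -n "$args" ] || continue
    # Word splitting of $args is intentional here.
    answer=$(kubectl auth can-i $args 2>/dev/null </dev/null || true)
    if [ "$answer" = "yes" ]; then
      echo "OK:      $args"
    else
      echo "MISSING: $args"
      missing=1
    fi
  done <<EOF
list events --all-namespaces
list pods --all-namespaces
create pods --subresource=eviction --all-namespaces
patch nodes
EOF
  # Non-zero exit status when at least one permission is missing.
  return "$missing"
}
```

After sourcing the snippet, running `check_access` prints one line per check and returns a non-zero status if any permission is missing. Node-side permissions (service management over SSH) are not covered by this check.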
Solution
To solve the problem you need to:
Get the corrupted Node name:
kubectl get events -A --field-selector reason=FailedCreatePodContainer -o=custom-columns=KIND:.involvedObject.kind,NAMESPACE:.involvedObject.namespace,NAME:.involvedObject.name,NODE:.source.host,REASON:.reason,MESSAGE:.message
Get all pods running on the Node (remember to set the Node name in the --field-selector option):
kubectl get pods -A --field-selector spec.nodeName=<node_name>,status.phase=Running -o custom-columns=NODE:.spec.nodeName,NAMESPACE:.metadata.namespace,NAME:.metadata.name
Copy the pod list and send it on Teams together with information about the Node drain.
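To make the pod list easier to copy, it can be written to a file first. The following is a minimal sketch, assuming a hypothetical helper `save_pod_list` and a hypothetical default file name. It uses a single `--field-selector` with comma-separated conditions, which kubectl accepts for chaining several field selectors.

```shell
# Sketch: save the list of Running pods on one Node to a file.
# save_pod_list and the default file name are hypothetical.
save_pod_list() {
  node="$1"
  out_file="${2:-pods-on-$node.txt}"
  # Both conditions go into one comma-separated field selector.
  kubectl get pods -A \
    --field-selector "spec.nodeName=$node,status.phase=Running" \
    -o custom-columns=NODE:.spec.nodeName,NAMESPACE:.metadata.namespace,NAME:.metadata.name \
    > "$out_file" || return 1
  # Print the file name so the caller knows where the list was saved.
  echo "$out_file"
}
```

For example, `save_pod_list <node_name>` writes the list to `pods-on-<node_name>.txt` in the current directory; the file content can then be pasted into the Teams message.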
Drain the corrupted Node (remember to set the Node name in the command):
kubectl drain --grace-period=-1 --force --ignore-daemonsets --delete-emptydir-data <node_name>
Log into the corrupted Node:
ssh <node_name>
Check which CRI service is running on the Node (it will be docker or containerd):
sudo systemctl list-units --type service
Stop and start the kubelet and CRI services, keeping the command order exactly as shown:
If the Node uses docker:
sudo service kubelet stop && sudo service docker stop
sudo service docker start && sudo service kubelet start
If the Node uses containerd:
sudo service kubelet stop && sudo service containerd stop
sudo service containerd start && sudo service kubelet start
Uncordon the Node (remember to set the Node name in the command):
kubectl uncordon <node_name>
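The service restart performed on the Node (before uncordoning) can also be scripted so that the stop and start order is always preserved. This is a sketch under assumptions: `detect_cri` and `restart_node_services` are hypothetical helper names, and the detection assumes the CRI is managed by systemd under the unit name `docker` or `containerd`. Docker is checked first because a Docker-based Node usually also runs a containerd unit.

```shell
# Sketch: detect the CRI service and restart it together with kubelet.
# detect_cri and restart_node_services are hypothetical helper names.
detect_cri() {
  # Check docker first: Docker-based Nodes typically run containerd too.
  for svc in docker containerd; do
    if systemctl is-active --quiet "$svc"; then
      echo "$svc"
      return 0
    fi
  done
  # Neither service is active; the caller must decide what to do.
  return 1
}

restart_node_services() {
  cri="$1"
  # Order matters: kubelet stops before the CRI, the CRI starts before kubelet.
  sudo service kubelet stop && sudo service "$cri" stop || return 1
  sudo service "$cri" start && sudo service kubelet start
}
```

On the Node this could be run as `cri=$(detect_cri) && restart_node_services "$cri"`. If `detect_cri` finds neither service, nothing is restarted and the CRI should be identified manually from the `systemctl list-units` output.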