Whenever pods that are part of a StatefulSet are gracefully terminated, commonly during node reboots, it is possible for them to be left in a Completed state, which causes other applications in the cluster that rely on those services to fail to start.
This appears to be a regression in Kubernetes 1.27; the issue is tracked upstream at StatefulSet pod ends up in state "Completed" · Issue #124065 · kubernetes/kubernetes · GitHub.
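To check whether your cluster is currently affected, you can list pods that have fully terminated (phase Succeeded, shown as Completed in the STATUS column); StatefulSet pods should normally never appear in this list:

```
# Completed pods have phase "Succeeded"; a StatefulSet pod showing up here
# instead of being restarted is the symptom described above.
kubectl get pods --all-namespaces --field-selector=status.phase=Succeeded
```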
To recover, delete the Completed pod so that the StatefulSet controller creates a replacement:
kubectl delete pods --field-selector=status.phase=Succeeded
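Note that this field selector matches every pod in the Succeeded phase in the current namespace, including pods left behind by completed Jobs. If you would rather target only pods owned by a StatefulSet, a rough sketch (assuming jq is installed; adjust it to your environment) is:

```
# Find Succeeded ("Completed") pods in all namespaces that are owned by a
# StatefulSet, then delete them so the StatefulSet controller recreates them.
kubectl get pods --all-namespaces --field-selector=status.phase=Succeeded -o json \
  | jq -r '.items[]
           | select(any(.metadata.ownerReferences[]?; .kind == "StatefulSet"))
           | "\(.metadata.namespace) \(.metadata.name)"' \
  | while read -r ns pod; do
      kubectl delete pod "$pod" -n "$ns"
    done
```

Either way, the StatefulSet controller recreates each deleted pod with the same name and reattaches its existing PersistentVolumeClaims.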
In the upstream issue it is mentioned that this regression has been fixed in recent patch releases; for the 1.27 line it was fixed in 1.27.9.
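Whether your cluster already carries the fix depends on the patch version of both the control plane and the nodes; a quick way to check both:

```
# Control plane (server) version:
kubectl version
# Kubelet version per node appears in the VERSION column; for the 1.27 line
# you want v1.27.9 or newer:
kubectl get nodes
```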
The upstream issue report (opened 26 Mar 2024, labeled kind/bug, sig/apps, needs-triage) is reproduced below:
### What happened?
A `StatefulSet` pod ends up "`Completed`":
```
NAME … READY STATUS RESTARTS AGE
an-sts-pod-0 0/1 Completed 0 4d12h
```
This pod has a `restartPolicy: Always`.
```
State: Terminated
Reason: Completed
Exit Code: 0
Started: Fri, 22 Mar 2024 05:12:42 -0400
Finished: Sun, 24 Mar 2024 12:11:32 -0400
```
The pod terminated due to something external to it (presumably the kubelet) terminating it:
```
[2024-03-24T16:11:29Z INFO actix_server::server] SIGTERM received; starting graceful shutdown
[2024-03-24T16:11:29Z DEBUG actix_server::accept] paused accepting connections on 0.0.0.0:8080
[2024-03-24T16:11:29Z INFO actix_server::accept] accept thread stopped
[2024-03-24T16:11:29Z INFO actix_server::worker] shutting down idle worker
[2024-03-24T16:11:29Z INFO actix_server::worker] shutting down idle worker
[2024-03-24T16:11:29Z INFO actix_server::worker] shutting down idle worker
[2024-03-24T16:11:30Z DEBUG tower::buffer::worker] buffer closing; waking pending tasks
```
Shortly prior to this, the `kubelet` is restarted:
```
Mar 24 16:10:41 gke-[snip]-bbzn systemd[1]: kubelet.service: Sent signal SIGTERM to main process 1874 (kubelet) on client request.
```
It then seems to restart, and presumably `SIGTERM`s everything? Its logs are chock full of errors after the restart.
```
Mar 24 16:11:28 gke-[snip]-bbzn kubelet[4021733]: I0324 16:11:28.577436 4021733 eviction_manager.go:174] "Failed to admit pod to node" pod="ns/an-sts-pod-0" nodeCondition=[MemoryPressure]
```
(there are lots of these, presumably one for each pod on the node)
Then it seems like the pod dies and a restart is attempted?
```
Mar 24 16:11:28 gke-[snip]-bbzn kubelet[4021733]: I0324 16:11:28.588544 4021733 kubelet.go:2375] "SyncLoop (PLEG): event for pod" pod="ns/an-sts-pod-0" event=&{ID:44a879b9-bd14-43ec-8d23-730abcb24f60 Type:ContainerDied Data:73c54c77f4bdc8f227d7ff45e662d05f7aeb2f5374712f338c66feab2dcd0c38}
Mar 24 16:11:28 gke-[snip]-bbzn kubelet[4021733]: I0324 16:11:28.588564 4021733 kubelet.go:2375] "SyncLoop (PLEG): event for pod" pod="ns/an-sts-pod-0" event=&{ID:44a879b9-bd14-43ec-8d23-730abcb24f60 Type:ContainerStarted Data:50348bfeb6c694e1504af763602f94786c47b2cb5299747cb0e6da30b01ddc68}
Mar 24 16:11:28 gke-[snip]-bbzn kubelet[4021733]: I0324 16:11:28.588580 4021733 kubelet.go:2375] "SyncLoop (PLEG): event for pod" pod="ns/an-sts-pod-0" event=&{ID:44a879b9-bd14-43ec-8d23-730abcb24f60 Type:ContainerStarted Data:aa9e60de9dd97a3f282c4df133656fa7b6e5cea40fede3c1422b654cb9f438ea}
```
but it never happens:
```
Mar 24 16:11:32 gke-[snip]-bbzn kubelet[4021733]: E0324 16:11:32.666674 4021733 secret.go:194] Couldn't get secret ns/a-secret-this-pod-needs: object "ns"/"a-secret-this-pod-needs" not registered
```
There are *lots* of these, for numerous secrets. These persist for at least the next 8 seconds.
The next time the pod's name shows up in the log is several minutes later (so presumably the "object not registered" error abates?)
```
Mar 24 16:16:37 gke-[snip]-bbzn kubelet[4021733]: E0324 16:16:37.047670 4021733 cpu_manager.go:395] "RemoveStaleState: removing container" podUID="44a879b9-bd14-43ec-8d23-730abcb24f60" containerName="an-sts-pod"
Mar 24 16:16:37 gke-[snip]-bbzn kubelet[4021733]: I0324 16:16:37.047682 4021733 state_mem.go:107] "Deleted CPUSet assignment" podUID="44a879b9-bd14-43ec-8d23-730abcb24f60" containerName="an-sts-pod"
```
After that, it seems wedged. The `kubelet` appears to never attempt to restart the pod, despite the restart policy.
### What did you expect to happen?
The `StatefulSet` controller to restart the pod, or the `kubelet` to restart the pod; I'm not clear which of these two would be responsible.
### How can we reproduce it (as minimally and precisely as possible)?
I'm not clear on what gets me into this situation.
### Anything else we need to know?
There's an old SO thread out there that claims that Docker restarts cause this. That bug seems to be in an utterly ancient version (1.x) of Docker; I am past the fixed version for that bug.
Regardless, I checked the logs for `docker`; the Docker daemon does not appear to have restarted when the pod died.
Shortly prior to the `Finished` timestamp, Cloud Logs indicate that the node that pod was on was under Memory Pressure. Presumably, it terminated that pod due to that, but nonetheless, I would expect it to then get evicted.
### Kubernetes version
```console
$ kubectl version
Server Version: v1.27.7-gke.1121002
```
### Cloud provider
GCP, GKE
### OS version
<details>
```console
# On Linux:
# cat /etc/os-release
NAME="Container-Optimized OS"
ID=cos
PRETTY_NAME="Container-Optimized OS from Google"
HOME_URL="https://cloud.google.com/container-optimized-os/docs"
BUG_REPORT_URL="https://cloud.google.com/container-optimized-os/docs/resources/support-policy#contact_us"
GOOGLE_CRASH_ID=Lakitu
GOOGLE_METRICS_PRODUCT_ID=26
KERNEL_COMMIT_ID=37c79a2c2008543e2c9a5dc749faa91fb0d806b5
VERSION=105
VERSION_ID=105
BUILD_ID=17412.226.62
# uname -a
Linux gke-[snip] 5.15.133+ #1 SMP Sat Dec 30 13:01:38 UTC 2023 x86_64 Intel(R) Xeon(R) CPU @ 2.20GHz GenuineIntel GNU/Linux
```
</details>
### Install tools
N/A
### Container runtime (CRI) and version (if applicable)
`containerd://1.7.10`
### Related plugins (CNI, CSI, ...) and versions (if applicable)
(Dunno.)