Whenever pods that are part of a StatefulSet are gracefully terminated, commonly during node reboots, it is possible for them to be left in a Completed state, which causes other applications in the cluster that rely on those services to fail to start.
This appears to be a regression in Kubernetes 1.27; the issue is tracked upstream at StatefulSet pod ends up in state "Completed" · Issue #124065 · kubernetes/kubernetes · GitHub.
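To check whether your cluster is currently affected, you can list pods that have fully terminated (phase Succeeded, shown as Completed in the STATUS column); StatefulSet pods should normally never appear in this list:

```
# Completed pods have phase "Succeeded"; a StatefulSet pod showing up here
# instead of being restarted is the symptom described above.
kubectl get pods --all-namespaces --field-selector=status.phase=Succeeded
```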
To recover, delete the Completed pod so that the StatefulSet controller creates a replacement:
kubectl delete pods --field-selector=status.phase=Succeeded
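Note that this field selector matches every pod in the Succeeded phase in the current namespace, including pods left behind by completed Jobs. If you would rather target only pods owned by a StatefulSet, a rough sketch (assuming jq is installed; adjust it to your environment) is:

```
# Find Succeeded ("Completed") pods in all namespaces that are owned by a
# StatefulSet, then delete them so the StatefulSet controller recreates them.
kubectl get pods --all-namespaces --field-selector=status.phase=Succeeded -o json \
  | jq -r '.items[]
           | select(any(.metadata.ownerReferences[]?; .kind == "StatefulSet"))
           | "\(.metadata.namespace) \(.metadata.name)"' \
  | while read -r ns pod; do
      kubectl delete pod "$pod" -n "$ns"
    done
```

Either way, the StatefulSet controller recreates each deleted pod with the same name and reattaches its existing PersistentVolumeClaims.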
In the upstream issue it is mentioned that this regression has been fixed in recent patch releases; for the 1.27 line it was fixed in 1.27.9.
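Whether your cluster already carries the fix depends on the patch version of both the control plane and the nodes; a quick way to check both:

```
# Control plane (server) version:
kubectl version
# Kubelet version per node appears in the VERSION column; for the 1.27 line
# you want v1.27.9 or newer:
kubectl get nodes
```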
The upstream issue report (opened 26 Mar 2024, labeled kind/bug, sig/apps, needs-triage) is reproduced below:
### What happened?
A `StatefulSet` pod ends up "`Completed`":
```
NAME … READY STATUS RESTARTS AGE
an-sts-pod-0 0/1 Completed 0 4d12h
```
This pod has a `restartPolicy: Always`.
```
State: Terminated
Reason: Completed
Exit Code: 0
Started: Fri, 22 Mar 2024 05:12:42 -0400
Finished: Sun, 24 Mar 2024 12:11:32 -0400
```
The pod terminated due to something external to it (presumably the kubelet) terminating it:
```
[2024-03-24T16:11:29Z INFO actix_server::server] SIGTERM received; starting graceful shutdown
[2024-03-24T16:11:29Z DEBUG actix_server::accept] paused accepting connections on 0.0.0.0:8080
[2024-03-24T16:11:29Z INFO actix_server::accept] accept thread stopped
[2024-03-24T16:11:29Z INFO actix_server::worker] shutting down idle worker
[2024-03-24T16:11:29Z INFO actix_server::worker] shutting down idle worker
[2024-03-24T16:11:29Z INFO actix_server::worker] shutting down idle worker
[2024-03-24T16:11:30Z DEBUG tower::buffer::worker] buffer closing; waking pending tasks
```
Shortly prior to this, the `kubelet` is restarted:
```
Mar 24 16:10:41 gke-[snip]-bbzn systemd[1]: kubelet.service: Sent signal SIGTERM to main process 1874 (kubelet) on client request.
```
It then seems to restart, and presumably `SIGTERM`s everything? Its logs are chock full of errors after the restart.
```
Mar 24 16:11:28 gke-[snip]-bbzn kubelet[4021733]: I0324 16:11:28.577436 4021733 eviction_manager.go:174] "Failed to admit pod to node" pod="ns/an-sts-pod-0" nodeCondition=[MemoryPressure]
```
(there are lots of these, presumably one for each pod on the node)
Then it seems like the pod dies and a restart is attempted?
```
Mar 24 16:11:28 gke-[snip]-bbzn kubelet[4021733]: I0324 16:11:28.588544 4021733 kubelet.go:2375] "SyncLoop (PLEG): event for pod" pod="ns/an-sts-pod-0" event=&{ID:44a879b9-bd14-43ec-8d23-730abcb24f60 Type:ContainerDied Data:73c54c77f4bdc8f227d7ff45e662d05f7aeb2f5374712f338c66feab2dcd0c38}
Mar 24 16:11:28 gke-[snip]-bbzn kubelet[4021733]: I0324 16:11:28.588564 4021733 kubelet.go:2375] "SyncLoop (PLEG): event for pod" pod="ns/an-sts-pod-0" event=&{ID:44a879b9-bd14-43ec-8d23-730abcb24f60 Type:ContainerStarted Data:50348bfeb6c694e1504af763602f94786c47b2cb5299747cb0e6da30b01ddc68}
Mar 24 16:11:28 gke-[snip]-bbzn kubelet[4021733]: I0324 16:11:28.588580 4021733 kubelet.go:2375] "SyncLoop (PLEG): event for pod" pod="ns/an-sts-pod-0" event=&{ID:44a879b9-bd14-43ec-8d23-730abcb24f60 Type:ContainerStarted Data:aa9e60de9dd97a3f282c4df133656fa7b6e5cea40fede3c1422b654cb9f438ea}
```
but it never happens:
```
Mar 24 16:11:32 gke-[snip]-bbzn kubelet[4021733]: E0324 16:11:32.666674 4021733 secret.go:194] Couldn't get secret ns/a-secret-this-pod-needs: object "ns"/"a-secret-this-pod-needs" not registered
```
There are *lots* of these, for numerous secrets. These persist for at least the next 8 seconds.
The next time the pod's name shows up in the log is several minutes later (so presumably the "object not registered" error abates?)
```
Mar 24 16:16:37 gke-[snip]-bbzn kubelet[4021733]: E0324 16:16:37.047670 4021733 cpu_manager.go:395] "RemoveStaleState: removing container" podUID="44a879b9-bd14-43ec-8d23-730abcb24f60" containerName="an-sts-pod"
Mar 24 16:16:37 gke-[snip]-bbzn kubelet[4021733]: I0324 16:16:37.047682 4021733 state_mem.go:107] "Deleted CPUSet assignment" podUID="44a879b9-bd14-43ec-8d23-730abcb24f60" containerName="an-sts-pod"
```
After that, it seems wedged. The `kubelet` appears to never attempt to restart the pod, despite the restart policy.
### What did you expect to happen?
The `StatefulSet` controller to restart the pod, or the `kubelet` to restart the pod; I'm not clear which of these two would be responsible.
### How can we reproduce it (as minimally and precisely as possible)?
I'm not clear on what gets me into this situation.
### Anything else we need to know?
There's an old SO thread out there that claims that Docker restarts cause this. That bug seems to be in an utterly ancient version (1.x) of Docker; I am past the fixed version for that bug.
Regardless, I checked the logs for `docker`; the Docker daemon does not appear to have restarted when the pod died.
Shortly prior to the `Finished` timestamp, Cloud Logs indicate that the node that pod was on was under Memory Pressure. Presumably, it terminated that pod due to that, but nonetheless, I would expect it to then get evicted.
### Kubernetes version
```console
$ kubectl version
Server Version: v1.27.7-gke.1121002
```
### Cloud provider
GCP, GKE
### OS version
<details>
```console
# On Linux:
# cat /etc/os-release
NAME="Container-Optimized OS"
ID=cos
PRETTY_NAME="Container-Optimized OS from Google"
HOME_URL="https://cloud.google.com/container-optimized-os/docs"
BUG_REPORT_URL="https://cloud.google.com/container-optimized-os/docs/resources/support-policy#contact_us"
GOOGLE_CRASH_ID=Lakitu
GOOGLE_METRICS_PRODUCT_ID=26
KERNEL_COMMIT_ID=37c79a2c2008543e2c9a5dc749faa91fb0d806b5
VERSION=105
VERSION_ID=105
BUILD_ID=17412.226.62
# uname -a
Linux gke-[snip] 5.15.133+ #1 SMP Sat Dec 30 13:01:38 UTC 2023 x86_64 Intel(R) Xeon(R) CPU @ 2.20GHz GenuineIntel GNU/Linux
```
</details>
### Install tools
N/A
### Container runtime (CRI) and version (if applicable)
`containerd://1.7.10`
### Related plugins (CNI, CSI, ...) and versions (if applicable)
(Dunno.)