Replicated shared snapshotter doesn't come up on node restart

MANI_M · August 21, 2020, 5:04pm

Whenever the primary node is restarted, the replicated-shared-fs-snapshotter-* pod doesn't come up and goes into Init:CrashLoopBackOff. We need to manually force delete the pod to fix it. Because of this, the other pods that depend on the shared filesystem also go into Init:CrashLoopBackOff, due to a race condition as mentioned in this.
We are running on a DigitalOcean machine with 4 vCPUs and 8 GB of RAM. The cluster has one primary and two worker nodes. The issue is also observed with a single primary node.

We also have a single primary node running on AWS, where it works fine. How can we solve this? Screenshot attached for reference.
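
For readers who hit the same problem, the manual workaround described above (force deleting the stuck pod) could be scripted roughly as follows. This is only a sketch, not an official fix: the pod name prefix and the Init:CrashLoopBackOff status come from the post, while the `default` namespace and the helper names `stuck_pods`, `force_delete_command`, and `main` are assumptions made for illustration. It relies on the standard `kubectl get pods --no-headers` column layout and on `kubectl delete pod --grace-period=0 --force`.

```python
import subprocess

# Pod name prefix taken from the post above.
POD_PREFIX = "replicated-shared-fs-snapshotter-"
# Assumption: the namespace is not stated in the post; adjust as needed.
NAMESPACE = "default"


def stuck_pods(get_pods_output, prefix=POD_PREFIX):
    """Return names of pods with the prefix whose STATUS is Init:CrashLoopBackOff.

    Expects the text printed by `kubectl get pods --no-headers`
    (columns: NAME READY STATUS RESTARTS AGE).
    """
    names = []
    for line in get_pods_output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, status = fields[0], fields[2]
        if name.startswith(prefix) and status == "Init:CrashLoopBackOff":
            names.append(name)
    return names


def force_delete_command(pod, namespace=NAMESPACE):
    """Build the kubectl command that force deletes a single pod."""
    return [
        "kubectl", "delete", "pod", pod,
        "--namespace", namespace,
        "--grace-period=0", "--force",
    ]


def main():
    """List pods, then force delete the ones that are stuck."""
    listing = subprocess.run(
        ["kubectl", "get", "pods", "--namespace", NAMESPACE, "--no-headers"],
        check=True, capture_output=True, text=True,
    ).stdout
    for pod in stuck_pods(listing):
        subprocess.run(force_delete_command(pod), check=True)
```

Calling `main()` on a machine where `kubectl` is already configured for the cluster would delete only the matching stuck pods, so that their controller can recreate them. It automates the workaround and does not address the underlying race condition.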

salahalsaleh · August 21, 2020, 5:58pm

Hello, we are tracking this and we will fix it in the next release, which is due next month.
