FlexVolume creates a deadlock and Deployment enters CrashLoopBackOff on node reboot

Hi there,
We have a Deployment with an init container that checks whether the flexVolume is ready to be mounted. Here are the details:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: test
  namespace: replicated-namespace
spec:
  progressDeadlineSeconds: 600
  replicas: 2
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: xyz
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: xyz
        tier: backend
    spec:
      initContainers:
      - command:
        - /bin/sh
        - -c
        - df $MOUNT_PATH | grep ":6789"
        env:
        - name: MOUNT_PATH
          value: /sharedfs
        image: docker.io/replicated/replicated-operator:stable-2.46.2
        imagePullPolicy: IfNotPresent
        name: check-mount
        resources: {}
        securityContext:
          seLinuxOptions:
            type: spc_t
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /sharedfs
          name: shared-mount
          readOnly: true
      containers:
      ...
      ...
      ...
      restartPolicy: Always
      volumes:
      - flexVolume:
          driver: ceph.rook.io/rook
          fsType: ceph
          options:
            clusterNamespace: rook-ceph
            fsName: rook-shared-fs
        name: shared-mount

Now, whenever we restart the node, the Deployment goes into CrashLoopBackOff. Even though the pod is recreated by Kubernetes due to the restart policy, it never recovers.
It works only if we manually delete and recreate the Deployment after the node restart. Is there any way to get this working?

After a reboot there is a race condition in which mounting the Ceph shared filesystem usually fails. The initContainer is meant to protect against running with an unsuccessful mount. You can force delete the individual pod and allow it to be re-created by Kubernetes.

https://help.replicated.com/docs/kubernetes/packaging-an-application/volumes/#shared-filesystem-initcontainer
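For reference, force deleting just the affected pod (rather than the whole Deployment) looks something like this; the pod name is a placeholder, and the namespace and label come from the manifest above:

# find the pod that is stuck with a failing init container
kubectl -n replicated-namespace get pods -l app=xyz
# force delete it; the Deployment controller re-creates it and the flexVolume mount is retried
kubectl -n replicated-namespace delete pod <pod-name> --grace-period=0 --force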

How can this be avoided in production? Manual deletion and re-deployment may not be feasible.

If you’re running a clustered setup, those pods would be scheduled on other nodes when one goes down for a reboot. They would not be automatically scheduled on the rebooted node (unless they are part of a DaemonSet), so there would not be a race condition.

We’re also looking into ways to prevent or automatically fix the issue for upcoming releases.
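Until then, one way to work around it on a single-node cluster is to automate the manual pod deletion described above. This is just a sketch, not an official fix; it assumes the namespace and app=xyz label from the posted manifest, and that kubectl reports the stuck pods with an Init:CrashLoopBackOff status:

#!/bin/sh
# Hypothetical watchdog (e.g. run periodically from cron on a dev node):
# force delete pods of this Deployment that are stuck in Init:CrashLoopBackOff
# so the Deployment controller re-creates them and the flexVolume mount is retried.
NAMESPACE=replicated-namespace
kubectl -n "$NAMESPACE" get pods -l app=xyz --no-headers \
  | awk '$3 == "Init:CrashLoopBackOff" { print $1 }' \
  | xargs -r kubectl -n "$NAMESPACE" delete pod --grace-period=0 --force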

Thanks, @areed, for the quick response. Yeah, we are facing this issue in our dev environment because we are running a single-node cluster. In production it should not be a problem, since it is a multi-node cluster.