How to resolve Longhorn volume attachment errors after a node reboot?

diamon_w · February 28, 2023, 5:48pm

A known issue in Longhorn can occur after a node in the cluster is restarted: Pods that mount a Longhorn-backed PVC fail with volume attachment errors. These errors will be similar to:

volume pvc-xxxxxx has GET error for volume attachment csi-xxxxx: volumeattachments.storage.k8s.io "csi-xxxxxx" not found

or

MountVolume.WaitForAttach failed for volume "pvc-xxxxx" : volume pvc-xxxxx has GET error for volume attachment csi-xxxxx: volumeattachments.storage.k8s.io "csi-xxxxx" is forbidden: User "system:node:ip-xxxxx" cannot get resource "volumeattachments" in API group "storage.k8s.io" at the cluster scope: no relationship found between node 'ip-xxxxx' and this object

When this happens, the workaround is to scale the Deployment or StatefulSet that mounts the volume down to 0 replicas, wait for its pods to fully terminate, and then scale the workload back up.
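
For anyone who wants to script this, here is a rough sketch of the scale-down, wait, scale-up sequence as a small Python wrapper around kubectl. It is not an official Replicated tool. The function names, the namespace, the replica count, and the pod label selector are all assumptions for illustration; substitute the values from your own cluster. Depending on your kubectl version, the wait step may return an error if the pods are already gone by the time it runs.

```python
import subprocess
from typing import Callable, List


def build_restart_commands(
    kind: str,          # "deployment" or "statefulset"
    name: str,          # workload name (hypothetical in the example below)
    namespace: str,     # namespace of the workload (assumed)
    replicas: int,      # replica count to restore after the scale-down
    pod_selector: str,  # label selector matching the workload's pods (assumed)
    timeout: str = "300s",
) -> List[List[str]]:
    """Return the kubectl invocations for the workaround, in order."""
    target = f"{kind}/{name}"
    return [
        # 1. Scale the workload down to 0 replicas.
        ["kubectl", "-n", namespace, "scale", target, "--replicas=0"],
        # 2. Block until the matching pods are fully deleted.
        ["kubectl", "-n", namespace, "wait", "--for=delete", "pod",
         "-l", pod_selector, f"--timeout={timeout}"],
        # 3. Scale the workload back up.
        ["kubectl", "-n", namespace, "scale", target, f"--replicas={replicas}"],
    ]


def restart_workload(commands: List[List[str]],
                     run: Callable = subprocess.run) -> None:
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        run(argv, check=True)
```

For example, `restart_workload(build_restart_commands("statefulset", "kotsadm-rqlite", "default", 1, "app=kotsadm-rqlite"))` would cycle a StatefulSet named kotsadm-rqlite, assuming it runs in the default namespace with a single replica and that its pods carry the label app=kotsadm-rqlite. Check the actual labels with `kubectl get pods --show-labels` first.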

For all users of Replicated’s kURL installer, Replicated recommends that you move away from Longhorn. For more details on why we’ve made this decision, see our blog post on the subject: “Why Replicated has moved away from recommending Longhorn for kURL storage”.

Vitaliy · March 10, 2023, 3:13pm

I ran into this problem. After rebooting the EC2 instance, the kotsadm, kotsadm-rqlite, and application pods can’t attach their volumes. Scaling the StatefulSet down and back up doesn’t help.
Events from the kotsadm pod:

  Warning  FailedMount         5m44s                 kubelet                  Unable to attach or mount volumes: unmounted volumes=[kotsadmdata], unattached volumes=[kotsadmdata backup host-cacerts kotsadm-web-scripts kubelet-client-cert kurl-proxy-kotsadm-tls-cert migrations kube-api-access-qld22]: timed out waiting for the condition
  Warning  FailedMount         3m27s                 kubelet                  Unable to attach or mount volumes: unmounted volumes=[kotsadmdata], unattached volumes=[kubelet-client-cert kurl-proxy-kotsadm-tls-cert migrations kube-api-access-qld22 kotsadmdata backup host-cacerts kotsadm-web-scripts]: timed out waiting for the condition
  Warning  FailedAttachVolume  90s (x11 over 7m47s)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-f44c135f-b394-4c3d-b2a9-a0bd957e8a29" : rpc error: code = Aborted desc = volume pvc-f44c135f-b394-4c3d-b2a9-a0bd957e8a29 is not ready for workloads
  Warning  FailedMount         73s                   kubelet                  Unable to attach or mount volumes: unmounted volumes=[kotsadmdata], unattached volumes=[backup host-cacerts kotsadm-web-scripts kubelet-client-cert kurl-proxy-kotsadm-tls-cert migrations kube-api-access-qld22 kotsadmdata]: timed out waiting for the condition

How can we resolve this issue?

Host OS is Ubuntu 20.04.5 LTS
Version of Longhorn is v1.2.4

diamon_w · April 28, 2023, 5:46pm

@Vitaliy Deeply sorry that your question here was missed. If you’re still seeing these issues after you’ve scaled the workload down to 0 and then back up, then further investigation may be required, in which case it would be best to open a support issue with Replicated and include a support bundle from the affected environment.

Vitaliy · May 2, 2023, 9:39am

The problem could be caused by the Longhorn provisioner. We are in the process of moving to OpenEBS.
