DISCLAIMER
This procedure is unsupported by Replicated Support and is only to be followed by knowledgeable experts. It solely serves as advice on how to perform a specific Linux/Kubernetes base-technology task.
If you find that your Kubernetes cluster's volumes have run out of disk space, and pods may even be crashlooping and causing an outage because of it, then resizing your Ceph volumes is the way to fix this.
How-To
A note of importance before starting with tasks like the following: be sure you have a proper backup or snapshot of the environment, should anything fail or become unavailable.
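If no cluster-level backup is available, one low-level option is an RBD snapshot of the affected image, taken from the direct mount pod once the pool and image names are known (both are identified later in this how-to; the snapshot name before-resize is only an example). This is a sketch and not a substitute for a proper backup of the environment:
rbd snap create replicapool/csi-vol-aec05168-409f-11ed-b64d-666f68140e1c@before-resize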
Once you have determined which volume(s) need a resize and what their desired new size should be, make a note of them in your favorite editor and continue with the steps below:
- To determine which volume(s) need to be resized, you can run the following command. This will also allow you to take notes of the volumes for further use in this how-to.
Get an overview of the current persistent volumes by running:
cat /etc/mtab | grep pvc
# cat /etc/mtab | grep pvc
/dev/rbd0 /var/lib/kubelet/pods/844f4c70-3d8d-4688-872c-c0220c5de54a/volumes/kubernetes.io~csi/pvc-8f9199e9-539e-483e-b2e3-6ebf2098055d/mount ext4 rw,relatime,stripe=16 0 0
/dev/rbd0 /var/lib/kubelet/pods/844f4c70-3d8d-4688-872c-c0220c5de54a/volume-subpaths/pvc-8f9199e9-539e-483e-b2e3-6ebf2098055d/prometheus/2 ext4 rw,relatime,stripe=16 0 0
As this persistent volume is mounted at /dev/rbd0, check with df to see how it is used:
df -h /dev/rbd0
# df -h /dev/rbd0
Filesystem Size Used Avail Use% Mounted on
/dev/rbd0 10G 7.7G 2.3G 4% /var/lib/kubelet/pods/844f4c70-3d8d-4688-872c-c0220c5de54a/volume-subpaths/pvc-8f9199e9-539e-483e-b2e3-6ebf2098055d/prometheus/2
This identifies that this volume needs to be resized to accommodate future growth.
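If you also want to confirm which PVC and namespace this PV backs, the PV's claimRef holds that information. A quick check, using the PV name taken from the mount path above:
kubectl get pv pvc-8f9199e9-539e-483e-b2e3-6ebf2098055d -o jsonpath='{.spec.claimRef.namespace}/{.spec.claimRef.name}'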
- Next, create a direct mount deployment by running:
kubectl apply -f https://raw.githubusercontent.com/rook/rook/v1.5.12/cluster/examples/kubernetes/ceph/direct-mount.yaml
If you are running an airgapped environment, you can achieve the same by performing the following steps:
1a. wget https://raw.githubusercontent.com/rook/rook/v1.5.12/cluster/examples/kubernetes/ceph/direct-mount.yaml
1b. kubectl apply -f direct-mount.yaml
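In an airgapped environment, the container image referenced by direct-mount.yaml must also be reachable from your nodes. If you mirror images to an internal registry, you can point the manifest at it before applying; the image name below is what the v1.5.12 example manifest typically references (verify against your copy of the file), and registry.example.internal is a placeholder:
sed -i 's|image: rook/ceph:v1.5.12|image: registry.example.internal/rook/ceph:v1.5.12|' direct-mount.yaml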
- Then, we need to find the volume that needs to be mapped by retrieving the CSI volume information. This can be done by describing the PV as follows:
kubectl describe pv <pv-name> | grep csi-vol
which will then output:
# kubectl describe pv pvc-8f9199e9-539e-483e-b2e3-6ebf2098055d | grep csi-vol
imageName=csi-vol-aec05168-409f-11ed-b64d-666f68140e1c
- So, in this example we have determined that the CSI volume is:
csi-vol-aec05168-409f-11ed-b64d-666f68140e1c
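The commands below assume the default pool name replicapool. If you are unsure of the pool name in your cluster, it is usually also listed in the PV's CSI volume attributes and can be checked with:
kubectl describe pv pvc-8f9199e9-539e-483e-b2e3-6ebf2098055d | grep -i pool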
- Next, find out the pod name for the direct mount pod by running:
# kubectl get pods -n rook-ceph | grep rook-direct-mount
rook-direct-mount-797978954-n9mb5
- Next, exec into the direct mount pod by running:
kubectl exec -it rook-direct-mount-797978954-n9mb5 -n rook-ceph -- bash
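Once inside the pod, it can be worth confirming the image's current size before expanding it. A quick check, assuming the replicapool pool and the image name from the example above:
rbd --pool=replicapool info csi-vol-aec05168-409f-11ed-b64d-666f68140e1c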
- Then, we expand the underlying RBD image of the desired PVC with:
rbd --pool=replicapool resize csi-vol-aec05168-409f-11ed-b64d-666f68140e1c --size=20G
- The next step is to map the RBD image in the direct mount pod:
rbd map replicapool/csi-vol-aec05168-409f-11ed-b64d-666f68140e1c
- This will also print the new value for /dev/rbd<N> that can be used in the next step(s).
- Following up, the RBD device now needs to be mounted so we can resize the filesystem:
mount /dev/rbd<N> /mnt
If your volumes have run out of space and resizing them is a follow-up task, you will need to run fsck with:
fsck /dev/rbd<N>
NOTE: This will allow you to recover, but if fsck is needed, potential data loss may have already occurred.
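If you are unsure which filesystem is on the device, you can check before choosing a resize tool, for example with blkid (assuming it is available in the direct mount pod):
blkid /dev/rbd<N>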
- Next up is the operation to grow, or resize, the filesystem. Please note that you cannot shrink the filesystem to anything less than its current size.
Depending on the type of your filesystem (XFS or an ext filesystem), use either xfs_growfs for XFS or resize2fs for an ext filesystem.
Run the following command for XFS:
xfs_growfs /mnt
or, for an ext type filesystem:
resize2fs /dev/rbd<N>
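To verify the result while the device is still mounted, the new size should now be visible with:
df -h /mnt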
- Then, unmount the volume by running:
umount /mnt
and unmap it by running:
rbd unmap /dev/rbd<N>
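To confirm the device is no longer mapped inside the direct mount pod, you can list the current mappings with:
rbd showmapped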
You may also need to grow the filesystem from the host mounting the Pod. Run:
cat /etc/mtab | grep pvc-8f9199e9-539e-483e-b2e3-6ebf2098055d
to get the correct rbd information. Then run either xfs_growfs /dev/rbd<N> for XFS, or resize2fs /dev/rbd<N> if it is an ext filesystem.
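To verify from the host, you can run df against the kubelet mount path found via /etc/mtab earlier, for example:
df -h /var/lib/kubelet/pods/844f4c70-3d8d-4688-872c-c0220c5de54a/volume-subpaths/pvc-8f9199e9-539e-483e-b2e3-6ebf2098055d/prometheus/2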
- As a final step in this process, we will need to let the active PV know that it has been resized (it will currently still reflect the old size), which can be seen when running:
kubectl describe pv pvc-8f9199e9-539e-483e-b2e3-6ebf2098055d | grep Capacity
This will show you that it is indeed still at the previous value of 10Gi:
# kubectl describe pv pvc-8f9199e9-539e-483e-b2e3-6ebf2098055d | grep Capacity
Capacity: 10Gi
By editing the PV and changing the value you will perform the last step in the resize process:
kubectl edit pv pvc-8f9199e9-539e-483e-b2e3-6ebf2098055d
And change the value at:
spec:
accessModes:
- ReadWriteOnce
capacity:
storage: 10Gi
to:
spec:
accessModes:
- ReadWriteOnce
capacity:
storage: 20Gi
Saving and exiting the editor shows you the message: persistentvolume/pvc-8f9199e9-539e-483e-b2e3-6ebf2098055d edited
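If you prefer a non-interactive equivalent to kubectl edit, the same change can be made with kubectl patch, using the same PV name and target size as above:
kubectl patch pv pvc-8f9199e9-539e-483e-b2e3-6ebf2098055d -p '{"spec":{"capacity":{"storage":"20Gi"}}}'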
Upon checking the new value with kubectl, you will see the new size is now also reflected:
kubectl get pv -A | grep pvc-8f9199e9-539e-483e-b2e3-6ebf2098055d
pvc-8f9199e9-539e-483e-b2e3-6ebf2098055d 20Gi RWO Delete Bound monitoring/prometheus-k8s-db-prometheus-k8s-0 default 3d3h
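Optionally, you can also confirm from inside the workload that the extra space is visible. In this example the PVC is used by the Prometheus pod in the monitoring namespace; the container name and in-pod mount path below are assumptions and should be adjusted to your workload:
kubectl -n monitoring exec prometheus-k8s-0 -c prometheus -- df -h /prometheus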
As always, please be mindful when running operations like this and, if you are unsure about what to run or how to run it, please consult Replicated via a support ticket.
DISCLAIMER
This procedure is unsupported by Replicated Support and is only to be followed by knowledgeable experts. It solely serves as advice on how to perform a specific Linux/Kubernetes base-technology task.