How to clean cephfs residual data
Attention: If you hit a capacity issue in a production environment, contact Support first to avoid data loss. This document is provided as a reference for test/development purposes only.
Background:
ODF provides unified storage; the cephfs storage is provisioned by the storageclass ocs-storagecluster-cephfs. CephFS PV space is normally reclaimed automatically, but in some error cases (such as a snapshot failure or a pending clone), cephfs can be left with residual data, and this residual data occupies Ceph capacity. In the worst case the capacity runs out, and to recover PV access the user must add more storage. For test purposes, the user can instead delete the residual data to free capacity.
Summary:
1. This document uses busybox as the sample application.
2. This document has 3 parts: cleaning up Kubernetes resources, cleaning up cephfs data, and verification.
Part 1: Clean up k8s resources.
1.1 If you hit this issue with snapshot/clone operations, check the PVC status first. In most backup/restore scenarios, the clone action is the most likely cause of the capacity running out. If there are pending PVCs, delete them first, because the clone operation keeps consuming capacity once the full-ratio is lifted. Here is a sample showing a PVC cloned from a snapshot.
spec:
accessModes:
- ReadWriteMany
dataSource:
apiGroup: snapshot.storage.k8s.io
kind: VolumeSnapshot
name: busybox-pvc-snapshot1
dataSourceRef:
apiGroup: snapshot.storage.k8s.io
kind: VolumeSnapshot
name: busybox-pvc-snapshot1
Confirm the PVCs are not in use before deleting them.
oc delete pvc -n busybox busybox-restore-pvc-1 busybox-restore-pvc-2
persistentvolumeclaim "busybox-restore-pvc-1" deleted
persistentvolumeclaim "busybox-restore-pvc-2" deleted
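A quick way to spot stuck clone PVCs is to filter on the Pending status. A minimal sketch, assuming the sample busybox namespace:

```shell
# List PVCs still in Pending in the sample namespace; stuck clone PVCs
# keep consuming capacity until they are deleted.
oc get pvc -n busybox --no-headers | awk '$2 == "Pending" {print $1}'
```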
1.2 Delete all the unused cephfs volume snapshots.
oc get volumesnapshot -n <app-ns> | grep ocs-storagecluster-cephfsplugin-snapclass
Example output
[f07@mc32 ~]$ oc get volumesnapshot -n busybox | grep ocs-storagecluster-cephfsplugin-snapclass
busybox busybox-pvc-snapshot-2 true busybox-pvc 200Gi ocs-storagecluster-cephfsplugin-snapclass snapcontent-c3bcc81b-1caa-4b8e-9b8c-8bce433e76d1 9d 9d
Then delete the volume snapshot with:
oc delete volumesnapshot -n <app-ns> <vol-snapshot-name>
If the delete action gets stuck on a finalizer, patch the snapshot instance to remove it.
oc patch volumesnapshot -n <app-ns> <vol-snapshot-name> --patch '{ "metadata": { "finalizers": [] } }' --type=merge
1.3 Remove finalizers from stuck volumesnapshotcontents.
If a volumesnapshotcontent is not deleted after its volumesnapshot is patched, remove its finalizers manually, then delete it.
oc patch volumesnapshotcontents <vol-snapshot-content-name> --patch '{ "metadata": { "finalizers": [] } }' --type=merge
oc delete volumesnapshotcontents <vol-snapshot-content-name>
1.4 Find and delete the remaining residual volumesnapshotcontents.
oc get volumesnapshotcontents | grep cephfs
oc delete volumesnapshotcontents <contents name>
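Steps 1.3 and 1.4 can be combined into a loop over all residual cephfs volumesnapshotcontents. A sketch for test clusters only, assuming the default oc output columns (NAME first, cephfs driver name in the row):

```shell
# Patch away finalizers and delete every residual cephfs
# VolumeSnapshotContent (destructive; test clusters only).
for vsc in $(oc get volumesnapshotcontents --no-headers | awk '/cephfs/ {print $1}'); do
  oc patch volumesnapshotcontents "$vsc" --type=merge \
    --patch '{ "metadata": { "finalizers": [] } }'
  oc delete volumesnapshotcontents "$vsc"
done
```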
1.5 Take note of the original source PV details; avoid deleting this PV because it stores useful data.
oc get pv <pv-name> -o yaml
Example output
csi:
volumeAttributes:
clusterID: openshift-storage
fsName: ocs-storagecluster-cephfilesystem
storage.kubernetes.io/csiProvisionerIdentity: 1694767350102-3975-openshift-storage.cephfs.csi.ceph.com
subvolumeName: csi-vol-5514e5bc-e257-4584-8f47-849f2144a47e
subvolumePath: /volumes/csi/csi-vol-5514e5bc-e257-4584-8f47-849f2144a47e/606d933c-f2c5-44e8-990f-2ca863847395
volumeHandle: 0001-0011-openshift-storage-0000000000000001-5514e5bc-e257-4584-8f47-849f2144a47e
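The subvolumeName in the PV spec is the name that Part 2 operates on; it can be extracted directly with jsonpath. A sketch; the PV name below is hypothetical:

```shell
# Pull the backing cephfs subvolume name out of a PV; this is the
# <subvolume-name> used by the 'ceph fs subvolume' commands in Part 2.
PV=pvc-5514e5bc-e257-4584-8f47-849f2144a47e   # hypothetical PV name
oc get pv "$PV" -o jsonpath='{.spec.csi.volumeAttributes.subvolumeName}'
```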
Part 2: Clean up cephfs data
2.1 Enable the ceph tool.
oc patch OCSInitialization ocsinit -n openshift-storage --type json --patch '[{ "op": "replace", "path": "/spec/enableCephTools", "value": true }]'
2.2 Increase the capacity full-ratio to 88% (Default full-ratio is 85%)
oc rsh -n openshift-storage $(oc get pods -n openshift-storage -o name -l app=rook-ceph-operator) ceph osd set-full-ratio 0.88 -c /var/lib/rook/openshift-storage/openshift-storage.config
Example output
osd set-full-ratio 0.88
2.3 Log in to the ceph tools pod.
oc rsh -n openshift-storage $(oc get pods -n openshift-storage -l app=rook-ceph-tools -o name)
2.4 Check the subvolumes' clone status. If a clone status is in-progress, cancel it first.
Example output.
This output shows the related subvolume name and snapshot name.
…
Subvolume : csi-vol-e5ce8e38-78bd-4918-a58f-87311fd82149
{
"status": {
"state": "in-progress",
"source": {
"volume": "ocs-storagecluster-cephfilesystem",
"subvolume": "csi-vol-fc655f9d-dac0-487d-8abe-5713eb846c10",
"snapshot": "csi-snap-fc242fac-bb56-475a-aaf7-d20c6b6ed5fb",
"group": "csi"
}
}
}
Execute the command below to cancel an in-progress clone.
ceph fs clone cancel ocs-storagecluster-cephfilesystem <subvolume-name> csi
Alternatively, this command cancels all the clones:
for i in `ceph fs subvolume ls ocs-storagecluster-cephfilesystem csi --format json | jq '.[] | .name' | cut -f 2 -d '"'`; do echo "Subvolume : $i"; ceph fs clone cancel ocs-storagecluster-cephfilesystem $i csi; done
Query the clones' details. Only clones in failed or canceled state can be removed.
for i in `ceph fs subvolume ls ocs-storagecluster-cephfilesystem csi --format json | jq '.[] | .name' | cut -f 2 -d '"'`; do echo "Subvolume : $i"; ceph fs clone status ocs-storagecluster-cephfilesystem $i csi; done
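To narrow the output to subvolumes that still need a cancel, the state field can be filtered with jq. A sketch; it assumes ceph fs clone status honors the global --format json flag and returns the JSON shape shown in step 2.4:

```shell
# Print only the subvolumes whose clone is still in progress
# (run inside the ceph tools pod).
for i in $(ceph fs subvolume ls ocs-storagecluster-cephfilesystem csi --format json | jq -r '.[].name'); do
  state=$(ceph fs clone status ocs-storagecluster-cephfilesystem "$i" csi --format json 2>/dev/null | jq -r '.status.state')
  [ "$state" = "in-progress" ] && echo "still cloning: $i"
done
```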
Delete the unused clones. The subvolume name comes from the clone status output in the previous step. A failed or canceled clone is removed with the subvolume rm command plus the --force flag.
ceph fs subvolume rm ocs-storagecluster-cephfilesystem <subvolume-name> csi --force
Example
ceph fs subvolume rm ocs-storagecluster-cephfilesystem csi-vol-cac7a079-4cf4-4f7d-9940-ecc84a32742c csi --force
2.5 If a subvolume has snapshots, they must be deleted first. Get the subvolume name from the previous step, then list its snapshots with the command below.
sh-5.1$ ceph fs subvolume snapshot ls ocs-storagecluster-cephfilesystem csi-vol-fc655f9d-dac0-487d-8abe-5713eb846c10 csi
[
{
"name": "csi-snap-fc242fac-bb56-475a-aaf7-d20c6b6ed5fb"
}
]
2.6 Delete the snapshot associated with this subvolume.
ceph fs subvolume snapshot rm ocs-storagecluster-cephfilesystem <subvolume-name> <snapshot-name> csi
Example
sh-5.1$ ceph fs subvolume snapshot rm ocs-storagecluster-cephfilesystem csi-vol-fc655f9d-dac0-487d-8abe-5713eb846c10 csi-snap-fc242fac-bb56-475a-aaf7-d20c6b6ed5fb csi
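When a subvolume carries several snapshots, steps 2.5 and 2.6 can be looped. A sketch run inside the tools pod, assuming snapshot ls honors --format json; the subvolume name is the one from the example above:

```shell
# Delete every snapshot of one subvolume (run inside the tools pod).
SUBVOL=csi-vol-fc655f9d-dac0-487d-8abe-5713eb846c10
for snap in $(ceph fs subvolume snapshot ls ocs-storagecluster-cephfilesystem "$SUBVOL" csi --format json | jq -r '.[].name'); do
  ceph fs subvolume snapshot rm ocs-storagecluster-cephfilesystem "$SUBVOL" "$snap" csi
done
```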
2.7 After the snapshot is deleted, retry deleting the clone.
Part 3: Verification
3.1 All the delete operations are asynchronous; the capacity will be reclaimed after a short time. Use ceph df to watch the capacity usage.
Example output
sh-5.1$ ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
ssd 600 GiB 571 GiB 29 GiB 29 GiB 4.86
TOTAL 600 GiB 571 GiB 29 GiB 29 GiB 4.86
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
ocs-storagecluster-cephblockpool 1 32 209 MiB 112 628 MiB 0.12 166 GiB
.mgr 2 1 577 KiB 2 1.7 MiB 0 166 GiB
ocs-storagecluster-cephobjectstore.rgw.log 3 8 32 KiB 340 1.9 MiB 0 166 GiB
ocs-storagecluster-cephobjectstore.rgw.buckets.non-ec 4 8 0 B 0 0 B 0 166 GiB
ocs-storagecluster-cephobjectstore.rgw.meta 5 8 3.8 KiB 14 124 KiB 0 166 GiB
ocs-storagecluster-cephobjectstore.rgw.otp 6 8 0 B 0 0 B 0 166 GiB
ocs-storagecluster-cephobjectstore.rgw.control 7 8 0 B 8 0 B 0 166 GiB
.rgw.root 8 8 5.8 KiB 16 180 KiB 0 166 GiB
ocs-storagecluster-cephobjectstore.rgw.buckets.index 9 8 0 B 11 0 B 0 166 GiB
ocs-storagecluster-cephfilesystem-metadata 10 16 383 MiB 126 1.1 GiB 0.22 166 GiB
ocs-storagecluster-cephobjectstore.rgw.buckets.data 11 32 1 KiB 1 12 KiB 0 166 GiB
ocs-storagecluster-cephfilesystem-data0 12 32 7.8 GiB 3.93k 24 GiB 4.50 166 GiB
3.2 Verify the capacity usage from the OCP console.
3.3 After the space is reclaimed, change the full-ratio back to 85%.
oc rsh -n openshift-storage $(oc get pods -n openshift-storage -o name -l app=rook-ceph-operator) ceph osd set-full-ratio 0.85 -c /var/lib/rook/openshift-storage/openshift-storage.config
Example output
osd set-full-ratio 0.85
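To confirm the ratio is back to the default, the OSD map can be inspected; a sketch run inside the tools pod:

```shell
# Show the configured full ratios from the OSD map.
ceph osd dump | grep full_ratio
```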