Cloud environments—especially those leveraging Kubernetes—are dynamic. Resources are provisioned and de-provisioned continuously. While this agility is a significant advantage, it can lead to resource sprawl, specifically orphaned storage volumes that are no longer in use but continue to incur costs and create management overhead. This article details an automated approach using Bash scripts to identify and manage orphaned storage resources within IBM Cloud, focusing on IBM Cloud Kubernetes Service (IKS) and its underlying classic (SoftLayer) and VPC storage.
1. The Persistent Problem: Orphaned Storage and Its Impact
When a PersistentVolumeClaim (PVC) in Kubernetes is deleted, what happens next depends on the reclaim policy of the bound PersistentVolume (PV). With the Delete policy, Kubernetes removes the PV and asks the provisioner to remove the backing storage. With the Retain policy, the PV moves to the Released state and the underlying physical storage is kept until someone removes it manually. Similarly, storage volumes might be provisioned directly in IBM Cloud for testing or temporary use and never formally attached to a Kubernetes cluster, or their associated clusters might be decommissioned without proper storage cleanup.
These orphaned volumes can lead to:
- Unnecessary Costs: Paying for storage that provides no value.
- Security Risks: Unmanaged storage might contain sensitive data.
- Management Overhead: A cluttered inventory makes it harder to manage active resources.
- Quota Issues: Consuming storage quotas that could be used for active workloads.
2. The Solution: A Two-Pronged Automation Strategy
We’ll explore two complementary automation scripts:
- PV Cleaner: Targets Kubernetes PVs in a Released state and attempts to delete both the PV object and its corresponding physical backing storage in IBM Cloud.
- Orphaned Volume Manager: Scans for IBM Cloud storage volumes (Classic Block/File, VPC Block) that are not actively referenced by any PV in any of your IKS clusters. This script can be used for identification and, with caution, for automated deletion.
3. Prerequisites
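- IBM Cloud CLI (ibmcloud): Installed and configured, including the plug-ins that provide the ks, is, and sl commands used below (container-service, vpc-infrastructure, and sl).
- Kubernetes CLI (kubectl): Installed and configured.
- jq: Install via sudo apt-get install jq or brew install jq.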
- Permissions:
  - IBM Cloud IAM roles to list clusters and manage storage.
  - Kubernetes RBAC to get, list, and delete PersistentVolumes.
- Environment Variables (optional):
  - DRY_RUN: true to simulate or false to execute.
  - DELETE_OLDER_THAN_DAYS: Age threshold for deletion (e.g., 7 days).
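A preflight check along the following lines can catch a missing tool or a mistyped setting before any cluster is touched. This is only a sketch: the helper names require_commands and validate_settings are hypothetical, and the defaults (DRY_RUN=true, DELETE_OLDER_THAN_DAYS=7) are assumptions taken from the examples in this list.

```bash
#!/usr/bin/env bash
# Preflight sketch: helper names and default values are illustrative assumptions.

# Return non-zero (and name the tool) if any required command is missing.
require_commands() {
  local cmd missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "Missing required command: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Apply safe defaults, then reject values the scripts cannot interpret.
validate_settings() {
  DRY_RUN="${DRY_RUN:-true}"
  DELETE_OLDER_THAN_DAYS="${DELETE_OLDER_THAN_DAYS:-7}"
  if [[ "$DRY_RUN" != "true" && "$DRY_RUN" != "false" ]]; then
    echo "DRY_RUN must be 'true' or 'false', got: $DRY_RUN" >&2
    return 1
  fi
  if [[ ! "$DELETE_OLDER_THAN_DAYS" =~ ^[0-9]+$ ]]; then
    echo "DELETE_OLDER_THAN_DAYS must be a non-negative integer" >&2
    return 1
  fi
}
```

A script could start with require_commands ibmcloud kubectl jq followed by validate_settings, and exit if either returns non-zero.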
4. Part 0: Logging into IBM Cloud
Interactive Login
ibmcloud login
# or with SSO
ibmcloud login --sso
API Key Login (CI/CD)
export IBMCLOUD_API_KEY="YOUR_API_KEY"
export IBMCLOUD_REGION="YOUR_REGION"
export IBMCLOUD_RESOURCE_GROUP="YOUR_RESOURCE_GROUP"
ibmcloud login --apikey "${IBMCLOUD_API_KEY}" \
  -r "${IBMCLOUD_REGION}" \
  -g "${IBMCLOUD_RESOURCE_GROUP}"
5. Part 1: Cleaning Up Released PersistentVolumes
Core Logic
- Identify Target Clusters
clusters=$(ibmcloud ks cluster ls -q 2>/dev/null | awk 'NR>1 && NF>0 {print $1}' || true)
- Iterate and Configure
for cluster in $clusters; do
  ibmcloud ks cluster config --cluster "$cluster"
  # kubectl context set...
done
- Find Released PVs
released_pvs=$(kubectl get pv --no-headers 2>/dev/null | awk '$5=="Released" {print $1}' || true)
- Extract Volume Details
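# Classic Block Storage (FlexVolume driver)
volume_id=$(kubectl get pv "$pv" -o jsonpath='{.spec.flexVolume.options.VolumeID}')
# VPC Block Storage (CSI driver): only look here if the classic lookup came back empty
if [[ -z "$volume_id" ]]; then
  volume_id=$(kubectl get pv "$pv" -o jsonpath='{.spec.csi.volumeHandle}')
fi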
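Because the later deletion step differs per storage type, it can help to record which field the ID came from. The sketch below is one way to do that; resolve_volume and the type labels classic-block and vpc-block are hypothetical names, and a production script would probably also check .spec.csi.driver before treating a CSI handle as a VPC block volume.

```bash
# Sketch: decide which storage backend a PV uses from the two candidate fields.
# The labels "classic-block" and "vpc-block" are hypothetical names.
resolve_volume() {
  local flex_id="$1" csi_handle="$2"
  if [[ -n "$flex_id" ]]; then
    echo "classic-block $flex_id"
  elif [[ -n "$csi_handle" ]]; then
    echo "vpc-block $csi_handle"
  else
    return 1   # neither field is set: not a volume this sketch understands
  fi
}

# Example wiring (requires a configured kubectl context):
#   flex_id=$(kubectl get pv "$pv" -o jsonpath='{.spec.flexVolume.options.VolumeID}')
#   csi_handle=$(kubectl get pv "$pv" -o jsonpath='{.spec.csi.volumeHandle}')
#   if info=$(resolve_volume "$flex_id" "$csi_handle"); then
#     read -r volume_type volume_id <<<"$info"
#   fi
```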
- Delete Physical Volume
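DRY_RUN="${DRY_RUN:-true}"
if [[ "$DRY_RUN" == "false" ]]; then
  # Classic block storage; VPC block volumes are removed with: ibmcloud is volume-delete "$volume_id" -f
  ibmcloud sl block volume-cancel "$volume_id" --immediate -f
else
  echo "DRY_RUN: Would delete physical volume $volume_id"
fi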
- Delete PV Object
if [[ "$DRY_RUN" == "false" ]]; then
  kubectl delete pv "$pv"
else
  echo "DRY_RUN: Would delete Kubernetes PV $pv"
fi
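The two deletion steps can be tied together so that the PV object is removed only after the backing volume has been cancelled. This ordering is a suggestion rather than something the steps above mandate, and cleanup_released_pv is a hypothetical helper that handles classic block storage only.

```bash
# Sketch: remove the backing volume first and the PV object only on success,
# so a failed cancellation leaves the PV behind as a record of the volume ID.
# cleanup_released_pv is a hypothetical helper (classic block storage only).
cleanup_released_pv() {
  local pv="$1" volume_id="$2"
  # Anything other than an explicit "false" is treated as a dry run.
  if [[ "${DRY_RUN:-true}" != "false" ]]; then
    echo "DRY_RUN: Would delete physical volume $volume_id and Kubernetes PV $pv"
    return 0
  fi
  if ! ibmcloud sl block volume-cancel "$volume_id" --immediate -f; then
    echo "ERROR: could not cancel volume $volume_id; keeping PV $pv" >&2
    return 1
  fi
  kubectl delete pv "$pv"
}
```

Keeping the PV when the cancellation fails means the next run still sees it as Released and can try again.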
- Logging to pv_cleanup_report.csv and pv_cleanup.log.
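The logging step could be implemented with two small helpers like the ones sketched here. The CSV column layout and the names log_msg and report_row are assumptions; the article does not specify the report format.

```bash
# Sketch of the two outputs: a timestamped log and a CSV report.
# The column layout and helper names are assumptions, not the original format.
LOG_FILE="${LOG_FILE:-pv_cleanup.log}"
REPORT_FILE="${REPORT_FILE:-pv_cleanup_report.csv}"

# Append a timestamped line to the log and echo it to the terminal.
log_msg() {
  printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" | tee -a "$LOG_FILE"
}

# Append one CSV row, writing the header first if the report is new or empty.
report_row() {
  local cluster="$1" pv="$2" volume_id="$3" status="$4"
  if [[ ! -s "$REPORT_FILE" ]]; then
    echo "cluster,pv,volume_id,status" > "$REPORT_FILE"
  fi
  echo "$cluster,$pv,$volume_id,$status" >> "$REPORT_FILE"
}
```

For example, report_row "$cluster" "$pv" "$volume_id" DELETED after a successful deletion, and log_msg "Skipping $pv: no volume ID found" when a PV cannot be resolved.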
6. Part 2: Identifying & Managing Orphaned Cloud Volumes
Core Logic
- Collect Known Kubernetes Volume IDs
declare -A all_volume_ids
for cluster in $clusters; do
  ibmcloud ks cluster config --cluster "$cluster"
  # The // fallback is jq syntax (kubectl's jsonpath does not support it), so parse the JSON with jq:
  # take the CSI handle (VPC) if present, otherwise the FlexVolume ID (classic).
  while IFS=$'\t' read -r pv vid; do
    [[ -n "$vid" ]] && all_volume_ids["$vid"]="$cluster:$pv"
  done < <(kubectl get pv -o json | jq -r '
    .items[] | [.metadata.name,
      (.spec.csi.volumeHandle // .spec.flexVolume.options.VolumeID // "")] | @tsv')
done
- List Physical Cloud Volumes
cb_vols=$(ibmcloud sl block volume-list --output json)
cf_vols=$(ibmcloud sl file volume-list --output json)
vpc_vols=$(ibmcloud is volumes --output json)
- Identify Potential Orphans by checking IDs not in all_volume_ids.
- Secondary Verification across all clusters with kubectl & jq.
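The first-pass comparison against all_volume_ids might look like the sketch below. find_orphans is a hypothetical helper; it expects one cloud volume ID per line on standard input and needs Bash 4 or later for associative arrays.

```bash
# Sketch: print every cloud volume ID (one per line on stdin) that is not a
# key of the all_volume_ids associative array built in the collection step.
# Requires Bash 4+; find_orphans is a hypothetical name.
declare -A all_volume_ids

find_orphans() {
  local vid
  while IFS= read -r vid; do
    [[ -z "$vid" ]] && continue          # ignore blank lines
    if [[ -z "${all_volume_ids[$vid]+set}" ]]; then
      echo "$vid"                        # no PV references this volume
    fi
  done
}

# Example wiring, assuming the JSON output is an array of objects with an "id" field:
#   ibmcloud is volumes --output json | jq -r '.[].id' | find_orphans
```

The output is only a list of candidates; the secondary verification and age check still apply before anything is deleted.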
- Calculate Volume Age
# date -d is a GNU extension; on macOS, install coreutils and use gdate instead
creation_epoch=$(date -d "$creation_date" +%s)
now_epoch=$(date +%s)
age_days=$(( (now_epoch - creation_epoch) / 86400 ))
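Separating timestamp parsing from the arithmetic makes the age calculation easier to test and to port. In this sketch, to_epoch and age_in_days are hypothetical helpers, and to_epoch assumes either GNU date or gdate (from Homebrew coreutils) is available.

```bash
# Sketch: age arithmetic kept separate from timestamp parsing.
# to_epoch assumes GNU date, falling back to gdate (Homebrew coreutils) on macOS.
to_epoch() {
  if date -d "$1" +%s 2>/dev/null; then
    return 0
  fi
  gdate -d "$1" +%s
}

# Whole days between two epoch values; a creation time in the future yields 0.
age_in_days() {
  local created="$1" now="$2"
  local days=$(( (now - created) / 86400 ))
  (( days < 0 )) && days=0
  echo "$days"
}

# Example wiring:
#   age_days=$(age_in_days "$(to_epoch "$creation_date")" "$(date +%s)")
```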
- Automated Deletion with safeguards:
if [[ "$pv_check" == "ORPHANED" ]]; then
  # Check age first so dry runs report the same decision a real run would make
  if (( age_days < ${DELETE_OLDER_THAN_DAYS:-7} )); then
    delete_status="TOO_RECENT"
  elif [[ "${DRY_RUN:-true}" != "false" ]]; then
    delete_status="DRY_RUN_WOULD_DELETE"
  elif delete_volume "$volume_type" "$volume_id"; then
    delete_status="DELETED"
  else
    delete_status="DELETE_FAILED"
  fi
  echo "$volume_id,$delete_status,$age_days" >> orphaned_report.csv
fi
- Logging to orphaned_report.csv with detailed fields.
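The delete_volume helper called in the deletion step is not shown in the article. A minimal sketch, assuming the script labels volumes as classic-block, classic-file, or vpc-block (hypothetical labels), could dispatch to the matching IBM Cloud CLI command:

```bash
# Sketch of a delete_volume helper. The type labels (classic-block,
# classic-file, vpc-block) are hypothetical; use whatever values your
# script assigns to volume_type.
delete_volume() {
  local volume_type="$1" volume_id="$2"
  case "$volume_type" in
    classic-block) ibmcloud sl block volume-cancel "$volume_id" --immediate -f ;;
    classic-file)  ibmcloud sl file volume-cancel "$volume_id" --immediate -f ;;
    vpc-block)     ibmcloud is volume-delete "$volume_id" -f ;;
    *)
      echo "Unknown volume type: $volume_type" >&2
      return 2
      ;;
  esac
}
```

The function returns the CLI's exit status, so the caller can tell a successful deletion from a failed one before writing the report row.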
7. Use Case Scenarios
- Regular cost optimization (nightly/weekly runs).
- Post-migration cleanup.
- Environment decommissioning.
- Storage auditing in DRY_RUN mode.
- Pre-prod/dev environment hygiene.
8. Automating with CI/CD Pipelines
GitLab CI Example
# .gitlab-ci.yml
stages:
  - cleanup
variables:
  DRY_RUN_MODE: "true"
  DELETE_DAYS: "7"
orphaned_volume_check:
  stage: cleanup
  image: alpine
  before_script:
    - apk add --no-cache bash curl jq
    - curl -fsSL https://clis.cloud.ibm.com/install/linux | sh
    # The ks, is, and sl commands ship as CLI plug-ins
    - ibmcloud plugin install container-service -f
    - ibmcloud plugin install vpc-infrastructure -f
    - ibmcloud plugin install sl -f
    - curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
    - chmod +x kubectl && mv kubectl /usr/local/bin/
    - ibmcloud login --apikey "${IBMCLOUD_API_KEY}" -r "us-south"
  script:
    - export DRY_RUN=${DRY_RUN_MODE}
    - export DELETE_OLDER_THAN_DAYS=${DELETE_DAYS}
    - bash ./find-orphaned-volumes.sh
  artifacts:
    paths:
      - orphaned_report.csv
      - orphaned_volumes.log
    when: always
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"
9. Important Considerations & Best Practices
- Dry Runs First: Always test with DRY_RUN=true.
- Incremental Rollout: Start with non-critical clusters.
- Least Privilege: Grant minimal permissions.
- Robust Logging & Alerting: Enhance scripts with notifications.
- Human Review: Verify orphan lists before deletion.
- Grace Periods: Use sensible thresholds (7–30 days).
- Understand Reclaim Policies: Retain vs. Delete behaviors.
- Error Handling: Use set -euo pipefail and retries.
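As an illustration of the error-handling point, a generic retry wrapper can sit in front of CLI calls that fail intermittently. This is a sketch with a fixed delay between attempts; the name retry and its argument order are assumptions.

```bash
#!/usr/bin/env bash
set -euo pipefail

# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if (( attempt >= max_attempts )); then
      echo "Command failed after $attempt attempts: $*" >&2
      return 1
    fi
    echo "Attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$(( attempt + 1 ))
  done
}
```

For example, retry 3 10 ibmcloud ks cluster config --cluster "$cluster" would try the command up to three times, waiting ten seconds between attempts.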
10. Conclusion
Automating the cleanup of Released Kubernetes PVs and truly orphaned cloud storage volumes is vital for maintaining a cost-effective, secure, and manageable IBM Cloud environment. By leveraging these Bash scripts—customized to your needs and integrated into CI/CD workflows—you can significantly reduce storage sprawl and optimize your cloud resources. Always prioritize safety through dry runs, least privilege permissions, and thorough review processes.