DevOps Automation


Mastering Cloud Storage: Automated Cleanup of Orphaned Resources in IBM Cloud Kubernetes Service

By Hiren Dave posted 20 days ago

  

Cloud environments—especially those leveraging Kubernetes—are dynamic. Resources are provisioned and de-provisioned continuously. While this agility is a significant advantage, it can lead to resource sprawl, specifically orphaned storage volumes that are no longer in use but continue to incur costs and create management overhead. This article details an automated approach using Bash scripts to identify and manage orphaned storage resources within IBM Cloud, focusing on IBM Cloud Kubernetes Service (IKS) and its underlying classic (SoftLayer) and VPC storage.

 

1. The Persistent Problem: Orphaned Storage and Its Impact

 

When a PersistentVolumeClaim (PVC) in Kubernetes is deleted, the corresponding PersistentVolume (PV) might enter a Released state. What happens next depends on the PV's reclaim policy: with Retain, the underlying physical storage is kept and is not deleted automatically. Similarly, storage volumes might be provisioned directly in IBM Cloud for testing or temporary use and never formally attached to a Kubernetes cluster, or their associated clusters might be decommissioned without proper storage cleanup.
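
To see which PVs are in this situation, one option is to list each PV's phase next to its reclaim policy. The sketch below assumes three-column input of the kind kubectl can produce with custom-columns; the helper name list_released_pvs and the sample rows are illustrative, not taken from the original scripts.

```shell
#!/usr/bin/env bash
# Sketch: report Released PVs together with their reclaim policy.
# Input: lines of "NAME POLICY PHASE", for example from
#   kubectl get pv --no-headers \
#     -o custom-columns=NAME:.metadata.name,POLICY:.spec.persistentVolumeReclaimPolicy,PHASE:.status.phase
# list_released_pvs is a hypothetical helper name, not part of kubectl.
list_released_pvs() {
  awk '$3 == "Released" { printf "%s reclaimPolicy=%s\n", $1, $2 }'
}

# Sample data standing in for real cluster output.
sample_pvs='pvc-aaa Retain Released
pvc-bbb Delete Bound
pvc-ccc Delete Released'

printf '%s\n' "$sample_pvs" | list_released_pvs
```

A Released PV with the Retain policy is the typical case where the backing volume keeps existing, and keeps being billed, until someone removes it.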

 

These orphaned volumes can lead to:

  • Unnecessary Costs: Paying for storage that provides no value.
  • Security Risks: Unmanaged storage might contain sensitive data.
  • Management Overhead: A cluttered inventory makes it harder to manage active resources.
  • Quota Issues: Consuming storage quotas that could be used for active workloads.

 

2. The Solution: A Two-Pronged Automation Strategy

 

We’ll explore two complementary automation scripts:

  1. PV Cleaner: Targets Kubernetes PVs in a Released state and attempts to delete both the PV object and its corresponding physical backing storage in IBM Cloud.
  2. Orphaned Volume Manager: Scans for IBM Cloud storage volumes (Classic Block/File, VPC Block) that are not actively referenced by any PV in any of your IKS clusters. This script can be used for identification and, with caution, for automated deletion.

 

3. Prerequisites

  

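  • IBM Cloud CLI (ibmcloud): Installed and configured, including the Kubernetes Service and VPC infrastructure plugins that provide the ks and is commands used below.
  • Kubernetes CLI (kubectl): Installed and configured.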
  • jq: Install via sudo apt-get install jq or brew install jq.
  • Permissions:
    • IBM Cloud IAM roles to list clusters and manage storage.
    • Kubernetes RBAC to get and delete PersistentVolumes.
  • Environment Variables (optional):
    • DRY_RUN: true to simulate or false to execute.
    • DELETE_OLDER_THAN_DAYS: Age threshold for deletion (e.g., 7 days).
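
A script can check these prerequisites before it touches anything. The following is a minimal sketch, assuming hypothetical helper names (require_cmds, is_nonneg_int) and assuming that simulation is the default and 7 days is the default threshold when the variables are unset.

```shell
#!/usr/bin/env bash
# Sketch: preflight checks before running any cleanup logic.
# require_cmds and is_nonneg_int are hypothetical helper names.

# Return non-zero if any of the named tools is missing from PATH.
require_cmds() {
  local missing=0 cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "Missing required command: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# True only for strings made of digits (a valid day threshold).
is_nonneg_int() {
  [[ "$1" =~ ^[0-9]+$ ]]
}

# Safe defaults (assumed here): simulate unless explicitly told otherwise.
DRY_RUN="${DRY_RUN:-true}"
DELETE_OLDER_THAN_DAYS="${DELETE_OLDER_THAN_DAYS:-7}"

# In a real script, stop early when a check fails:
#   require_cmds ibmcloud kubectl jq || exit 1
#   is_nonneg_int "$DELETE_OLDER_THAN_DAYS" || exit 1
echo "DRY_RUN=$DRY_RUN DELETE_OLDER_THAN_DAYS=$DELETE_OLDER_THAN_DAYS"
```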

 

4. Part 0: Logging into IBM Cloud

  

Interactive Login

ibmcloud login
# or with SSO
ibmcloud login --sso

  

API Key Login (CI/CD)

export IBMCLOUD_API_KEY="YOUR_API_KEY"
export IBMCLOUD_REGION="YOUR_REGION"
export IBMCLOUD_RESOURCE_GROUP="YOUR_RESOURCE_GROUP"

ibmcloud login --apikey "${IBMCLOUD_API_KEY}" \
               -r "${IBMCLOUD_REGION}" \
               -g "${IBMCLOUD_RESOURCE_GROUP}"
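
In a pipeline it can help to fail fast when one of these variables is missing, rather than letting the login command fail with a less obvious error. A small sketch, where check_login_env is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Sketch: verify that the login settings exist before calling ibmcloud login.
# check_login_env is a hypothetical helper name.
check_login_env() {
  local var rc=0
  for var in IBMCLOUD_API_KEY IBMCLOUD_REGION IBMCLOUD_RESOURCE_GROUP; do
    # ${!var:-} is indirect expansion: the value of the variable named by $var.
    if [[ -z "${!var:-}" ]]; then
      echo "Environment variable $var is not set" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Typical use ahead of the API key login:
#   check_login_env || exit 1
#   ibmcloud login --apikey "${IBMCLOUD_API_KEY}" -r "${IBMCLOUD_REGION}" -g "${IBMCLOUD_RESOURCE_GROUP}"
```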

 

5. Part 1: Cleaning Up Released PersistentVolumes

   

Core Logic

  

  1. Identify Target Clusters
    clusters=$(ibmcloud ks cluster ls -q 2>/dev/null | awk 'NR>1 && NF>0 {print $1}' || true)
  2. Iterate and Configure
    for cluster in $clusters; do
      ibmcloud ks cluster config --cluster "$cluster"
      # kubectl context set...
    done
  3. Find Released PVs
    released_pvs=$(kubectl get pv --no-headers 2>/dev/null | awk '$5=="Released" {print $1}' || true)
  4. Extract Volume Details
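    # VPC Block Storage (CSI); empty for PVs that are not CSI-backed
    volume_id=$(kubectl get pv "$pv" -o jsonpath='{.spec.csi.volumeHandle}')
    # Classic Block Storage (FlexVolume); used only when no CSI handle was found,
    # so the second lookup never overwrites a valid ID with an empty string
    [[ -z "$volume_id" ]] && volume_id=$(kubectl get pv "$pv" -o jsonpath='{.spec.flexVolume.options.VolumeID}')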
  5. Delete Physical Volume
    DRY_RUN="${DRY_RUN:-true}"
    if [[ "$DRY_RUN" == "false" ]]; then
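      # Classic block volume shown here; a VPC block volume is removed with
      # "ibmcloud is volume-delete" instead
      ibmcloud sl block volume-cancel "$volume_id" --immediate -f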
    else
      echo "DRY_RUN: Would delete physical volume $volume_id"
    fi
  6. Delete PV Object
    if [[ "$DRY_RUN" == "false" ]]; then
      kubectl delete pv "$pv"
    else
      echo "DRY_RUN: Would delete Kubernetes PV $pv"
    fi
  7. Logging to pv_cleanup_report.csv and pv_cleanup.log.
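
The deletion step above shows the classic block command; a script that also handles classic file and VPC block volumes needs to pick the command by volume type. The sketch below is one way to do that while honoring DRY_RUN. The type labels (classic-block, classic-file, vpc-block), the helper name delete_volume, and the sample ID are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch: choose the delete command from the volume type, honoring DRY_RUN.
# The type labels and the helper name delete_volume are assumptions.
DRY_RUN="${DRY_RUN:-true}"

# delete_volume VOLUME_TYPE VOLUME_ID
delete_volume() {
  local volume_type="$1" volume_id="$2"
  local -a cmd
  case "$volume_type" in
    classic-block) cmd=(ibmcloud sl block volume-cancel "$volume_id" --immediate -f) ;;
    classic-file)  cmd=(ibmcloud sl file volume-cancel "$volume_id" --immediate -f) ;;
    vpc-block)     cmd=(ibmcloud is volume-delete "$volume_id" -f) ;;
    *) echo "Unknown volume type: $volume_type" >&2; return 2 ;;
  esac
  # Only the exact string "false" triggers a real deletion.
  if [[ "$DRY_RUN" == "false" ]]; then
    "${cmd[@]}"
  else
    echo "DRY_RUN: Would run: ${cmd[*]}"
  fi
}

# Demo in dry-run mode with a made-up ID; nothing is deleted.
DRY_RUN=true delete_volume vpc-block "r006-example-id"
```

Building the command as an array keeps the volume ID as a single argument even if it contains unexpected characters, and lets the dry-run message print exactly what a live run would execute.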

 

6. Part 2: Identifying & Managing Orphaned Cloud Volumes

   

Core Logic

   

  1. Collect Known Kubernetes Volume IDs
    declare -A all_volume_ids
    for cluster in $clusters; do
      ibmcloud ks cluster config --cluster "$cluster"
      for pv in $(kubectl get pv -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'); do
        # JSONPath has no fallback operator, so jq picks the first non-null ID
        vid=$(kubectl get pv "$pv" -o json | jq -r '.spec.csi.volumeHandle // .spec.flexVolume.options.VolumeID // empty')
        [[ -n "$vid" && "$vid" != "null" ]] && all_volume_ids["$vid"]="$cluster:$pv"
      done
    done
  2. List Physical Cloud Volumes
    cb_vols=$(ibmcloud sl block volume-list --output json)
    cf_vols=$(ibmcloud sl file volume-list --output json)
    vpc_vols=$(ibmcloud is volumes --output json)
  3. Identify Potential Orphans by checking IDs not in all_volume_ids.
  4. Secondary Verification across all clusters with kubectl & jq.
  5. Calculate Volume Age
    creation_epoch=$(date -d "$creation_date" +%s)
    now_epoch=$(date +%s)
    age_days=$(((now_epoch - creation_epoch) / 86400))
  6. Automated Deletion with safeguards:
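    if [[ "$pv_check" == "ORPHANED" ]]; then
      # Check age first so a dry run reports the same decision a live run would make
      if (( age_days < ${DELETE_OLDER_THAN_DAYS:-7} )); then
        delete_status="TOO_RECENT"
      # As in Part 1, only DRY_RUN=false performs a real deletion
      elif [[ "${DRY_RUN:-true}" != "false" ]]; then
        delete_status="DRY_RUN_WOULD_DELETE"
      elif delete_volume "$volume_type" "$volume_id"; then
        delete_status="DELETED"
      else
        delete_status="DELETE_FAILED"
      fi
      echo "$volume_id,$delete_status,$age_days" >> orphaned_report.csv
    fi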
  7. Logging to orphaned_report.csv with detailed fields.
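
The orphan check in step 3 is a set difference, and the safeguard in step 6 is a small decision table. The sketch below isolates both so they can be exercised with made-up data, without calling the IBM Cloud CLI. The helper names find_orphans and decide_action are hypothetical, and the age check is placed before the dry-run check so that a dry run reports what a live run would actually do. Associative arrays require bash 4 or later.

```shell
#!/usr/bin/env bash
# Sketch: set difference between cloud volume IDs and IDs referenced by PVs,
# followed by the age gate. Requires bash 4+ for associative arrays.
# find_orphans and decide_action are hypothetical helper names.

# find_orphans KNOWN_IDS CLOUD_IDS  (each a whitespace-separated list)
# Prints every cloud ID that no PV references.
find_orphans() {
  local -A known=()
  local id
  for id in $1; do known["$id"]=1; done
  for id in $2; do
    [[ -n "${known[$id]:-}" ]] || echo "$id"
  done
}

# decide_action AGE_DAYS THRESHOLD_DAYS DRY_RUN
# Prints the action a cleanup run should take for one orphaned volume.
decide_action() {
  local age_days="$1" threshold="$2" dry_run="$3"
  if (( age_days < threshold )); then
    echo "TOO_RECENT"
  elif [[ "$dry_run" != "false" ]]; then
    echo "DRY_RUN_WOULD_DELETE"
  else
    echo "DELETE"
  fi
}

# Demo with made-up IDs and a made-up age of 10 days.
known_ids="vol-1 vol-2"
cloud_ids="vol-1 vol-2 vol-3 vol-4"
for v in $(find_orphans "$known_ids" "$cloud_ids"); do
  echo "$v,$(decide_action 10 7 true)"
done
```

Keeping the decision in a function that only prints a status makes it straightforward to test the thresholds separately from the code that deletes anything.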

 

7. Use Case Scenarios

  

  • Regular cost optimization (nightly/weekly runs).
  • Post-migration cleanup.
  • Environment decommissioning.
  • Storage auditing in DRY_RUN mode.
  • Pre-prod/dev environment hygiene.

 

8. Automating with CI/CD Pipelines

  

GitLab CI Example

# .gitlab-ci.yml
stages:
  - cleanup

variables:
  DRY_RUN_MODE: "true"
  DELETE_DAYS: "7"

orphaned_volume_check:
  stage: cleanup
  image: alpine
  before_script:
    - apk add --no-cache bash curl jq
    - curl -fsSL https://clis.cloud.ibm.com/install/linux | sh
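    - curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
    - chmod +x kubectl && mv kubectl /usr/local/bin/
    # Plugins that provide the "ibmcloud ks" and "ibmcloud is" commands
    - ibmcloud plugin install container-service -f
    - ibmcloud plugin install vpc-infrastructure -f
    - ibmcloud login --apikey "${IBMCLOUD_API_KEY}" -r "us-south"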
  script:
    - export DRY_RUN=${DRY_RUN_MODE}
    - export DELETE_OLDER_THAN_DAYS=${DELETE_DAYS}
    - bash ./find-orphaned-volumes.sh
  artifacts:
    paths:
      - orphaned_report.csv
      - orphaned_volumes.log
    when: always
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"

 

9. Important Considerations & Best Practices

  

  • Dry Runs First: Always test with DRY_RUN=true.
  • Incremental Rollout: Start with non-critical clusters.
  • Least Privilege: Grant minimal permissions.
  • Robust Logging & Alerting: Enhance scripts with notifications.
  • Human Review: Verify orphan lists before deletion.
  • Grace Periods: Use sensible thresholds (7–30 days).
  • Understand Reclaim Policies: Know how Retain and Delete differ before removing a PV or its backing volume.
  • Error Handling: Use set -euo pipefail and retries.
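
As an illustration of the error-handling point, here is a minimal retry wrapper with a fixed delay between attempts. The helper name retry and the stand-in command flaky are hypothetical; the function avoids constructs that would abort under set -euo pipefail, which a real script would typically enable at the top.

```shell
#!/usr/bin/env bash
# Sketch: retry a flaky command a few times with a fixed delay.
# Written to be safe under `set -euo pipefail`, which a real script
# would typically enable at the top. retry is a hypothetical helper name.

# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if (( attempt >= max_attempts )); then
      echo "Giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "Attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Demo: a stand-in command that fails twice, then succeeds.
calls=0
flaky() { calls=$((calls + 1)); (( calls >= 3 )); }
retry 5 0 flaky && echo "succeeded after $calls calls"
```

In a cleanup script the wrapper would go around calls that depend on the network, for example retry 3 10 ibmcloud ks cluster config --cluster "$cluster".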

 

10. Conclusion

  

Automating the cleanup of Released Kubernetes PVs and truly orphaned cloud storage volumes is vital for maintaining a cost-effective, secure, and manageable IBM Cloud environment. By leveraging these Bash scripts—customized to your needs and integrated into CI/CD workflows—you can significantly reduce storage sprawl and optimize your cloud resources. Always prioritize safety through dry runs, least privilege permissions, and thorough review processes.
