The Hidden Challenge in Kubernetes Backup & Restore
Backup and restore in Kubernetes is often discussed in terms of tools (Velero, OADP, snapshots, object storage), but the hardest part is frequently overlooked: knowing exactly what to back up and what to restore.
In real-world clusters:
- Applications span dozens of resource types
- Operators create resources dynamically
- Some resources are cluster-scoped and others are namespace-scoped
- Labels are inconsistent or missing
If resources are missed during backup, restores appear “successful” but applications still fail.
This is where resource discovery becomes critical, and where tools like the get-resources kubectl plugin add significant value.
Why Resource Inventory Matters for Backup & Restore
A Kubernetes application is not just Deployments and Services. It may include:
- CustomResourceDefinitions (CRDs)
- Custom resources managed by operators
- RBAC objects
- Webhooks
- ConfigMaps and Secrets created at runtime
Without a complete inventory, backup tools can:
- Skip critical objects
- Restore incomplete application states
- Leave behind orphaned resources
The get-resources plugin helps solve this by building a precise inventory of application-related resources, regardless of how they were created.
Backup Use Case #1: Defining the True Backup Scope
Before you run a backup, you need answers to:
- What resources belong to this application?
- Are there cluster-scoped dependencies?
- Were resources created dynamically after installation?
Using get-resources, you can:
- Enumerate all namespace-scoped and cluster-scoped resources
To retrieve all resources, run:
oc get-resources >all_resources.csv
To count the cluster-scoped resources (the header line kept by NR==1 is included in the count):
awk -F',' 'NR==1 || $4==""' all_resources.csv | wc -l
To count the namespace-scoped resources (again including the header line):
awk -F',' 'NR==1 || $4!=""' all_resources.csv | wc -l
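The two counts above each include the header line that `NR==1` keeps. A minimal sketch that reports both counts without the header is shown below. It assumes, as the awk filters above do, that column 4 holds the namespace and is empty for cluster-scoped resources, and that no field contains an embedded comma. The function name `scope_counts` is hypothetical.

```shell
# scope_counts FILE: print cluster-scoped and namespace-scoped row counts.
# Assumption: FILE has a header row and column 4 is the namespace
# (empty means cluster-scoped), mirroring the awk filters above.
# Fields with embedded commas are not handled.
scope_counts() {
  awk -F',' '
    NR == 1  { next }                # skip the header row
    NF == 0  { next }                # skip blank lines
    $4 == "" { cluster++; next }     # empty namespace column
             { namespaced++ }
    END { printf "cluster=%d namespaced=%d\n", cluster, namespaced }
  ' "$1"
}

# Usage: scope_counts all_resources.csv
```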
- Filter by creation timestamp or namespace
Suppose you deployed an application in the “demo” namespace that includes both cluster-scoped and namespace-scoped resources. In this scenario, you can retrieve all application resources created since the namespace itself was created:
oc get-resources --namespace demo --after=<namespace creationTimestamp>
oc get-resources --namespace demo --after=$(oc get namespace demo -o jsonpath='{.metadata.creationTimestamp}')
To list all resources in a namespace:
oc get-resources --namespace demo
To query several namespaces while excluding cluster-scoped resources:
oc get-resources --namespace demo --namespace=default --exclude-cluster-resources=true
- Export the results in machine-readable formats
oc get-resources --namespace demo
OR
oc get-resources --namespace demo --start=<start_timestamp> --end=<end_timestamp>
OR
oc get-resources --namespace demo --after=<after timestamp> --resource-data=true
OR
oc get-resources --namespace demo --before=<before timestamp> --output=demo_resources
OR
oc get-resources --namespace demo --namespace=default --after=<namespace creationTimestamp> --output=demo_resources
- Check help for details
oc get-resources --help
This allows backup workflows to be:
- Data-driven instead of assumption-driven
- More predictable
- Easier to automate
Instead of guessing what Velero should include, you know exactly what exists.
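One way to act on the inventory is to derive a Velero include list from it. The sketch below is illustrative only. It assumes the exported CSV has a header row with a column named `resource` (a hypothetical name; check the plugin's actual output) whose values are resource names in a form Velero accepts, such as `deployments.apps`. The helper names `csv_column` and `velero_scope_cmd` are also hypothetical. The function prints the `velero backup create` command for review instead of running it.

```shell
# csv_column FILE HEADER: print the non-empty values of the column named HEADER.
csv_column() {
  awk -F',' -v want="$2" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == want) col = i; next }
    col && $col != "" { print $col }
  ' "$1"
}

# velero_scope_cmd FILE NAMESPACE BACKUP_NAME: print (do not run) a backup
# command whose --include-resources list comes from the inventory.
# Assumption: the inventory has a "resource" column (hypothetical name).
velero_scope_cmd() {
  resources=$(csv_column "$1" resource | sort -u | paste -s -d, -)
  printf 'velero backup create %s --include-namespaces %s --include-resources %s --include-cluster-resources=true\n' \
    "$3" "$2" "$resources"
}

# Usage: velero_scope_cmd demo_resources.csv demo demo-backup
```

Printing the command keeps a human in the loop; the output can be reviewed, committed, or piped to `sh` once the scope looks right.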
Backup Use Case #2: Supporting Label-Less and Legacy Applications
Many backup strategies rely on labels:
app=myapp
But in practice:
- Older applications may not use labels
- Operators may create unlabelled resources
- Third-party components may not follow conventions
The get-resources plugin does not rely solely on labels.
It identifies resources based on actual cluster state, making it ideal for:
- Legacy workloads
- Vendor operators
- Complex platforms like OpenShift add-ons
Once we know all the resources of an application, we can label them as needed. This reduces the risk of silent data loss during a label-based backup.
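The labelling step can be sketched as a generator of `oc label` commands driven by the inventory. This is a sketch under stated assumptions: the CSV is assumed to have header columns named `kind`, `name`, and `namespace` (hypothetical names; adjust to the plugin's real output), and `label_cmds` is a hypothetical helper. Commands are printed rather than executed so they can be reviewed first.

```shell
# label_cmds FILE LABEL: print one `oc label` command per inventory row.
# Assumption: FILE has a header row containing kind, name and namespace columns.
label_cmds() {
  awk -F',' -v label="$2" '
    NR == 1 {
      for (i = 1; i <= NF; i++) col[$i] = i      # header name -> column index
      if (!("kind" in col) || !("name" in col) || !("namespace" in col)) {
        print "label_cmds: expected kind,name,namespace columns" > "/dev/stderr"
        exit 1
      }
      next
    }
    {
      kind = $(col["kind"]); name = $(col["name"]); ns = $(col["namespace"])
      if (kind == "" || name == "") next
      scope = (ns == "") ? "" : " -n " ns        # cluster-scoped rows get no -n
      printf "oc label %s %s %s%s --overwrite\n", kind, name, label, scope
    }
  ' "$1"
}

# Usage: label_cmds demo_resources.csv app=myapp
```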
Restore Use Case #1: Restore Validation and Confidence
A restore is not complete just because the command finished successfully.
After restore, users need to know:
- Was every resource recreated?
- Are annotations and owner references intact?
- Are cluster-scoped resources present?
By capturing a pre-backup inventory and a post-restore inventory, get-resources enables:
- Side-by-side comparisons
- Automated diff checks
- Restore validation in CI pipelines
This turns restore testing from a manual process into a repeatable verification step.
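As a rough illustration, the comparison can be written as a shell function that exits non-zero when a baseline resource is missing, which makes it usable as a CI gate. Only an identifying key is compared, because UIDs and creation timestamps change when objects are recreated. The column names `kind`, `name`, and `namespace`, and the function names, are assumptions made for this sketch.

```shell
# inventory_keys FILE: print sorted "kind/namespace/name" keys, one per row.
# Assumption: FILE has a header row with kind, name and namespace columns.
inventory_keys() {
  awk -F',' '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    { print $(col["kind"]) "/" $(col["namespace"]) "/" $(col["name"]) }
  ' "$1" | LC_ALL=C sort -u
}

# missing_after_restore BASELINE RESTORED: list keys present only in BASELINE.
# Returns 0 if nothing is missing, 1 otherwise.
missing_after_restore() {
  before=$(mktemp) && after=$(mktemp) || return 2
  inventory_keys "$1" > "$before"
  inventory_keys "$2" > "$after"
  missing=$(LC_ALL=C comm -23 "$before" "$after")   # lines unique to baseline
  rm -f "$before" "$after"
  [ -z "$missing" ] && return 0
  printf '%s\n' "$missing"
  return 1
}

# Usage: missing_after_restore pre_backup.csv post_restore.csv || exit 1
```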
Restore Use Case #2: Disaster Recovery Drills
Disaster recovery exercises often fail due to:
- Missing CRDs
- Restored workloads without permissions
- Incomplete operator recovery
With get-resources, users can:
- Capture a baseline resource inventory
- Simulate failure (cluster or namespace deletion)
- Restore from backup
- Compare restored resources against baseline
This highlights exactly what the DR plan missed, long before a real outage occurs.
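For a drill report, a per-kind summary is often easier to read than a full diff. The following sketch counts resources per kind in the baseline and restored inventories and prints only the kinds whose count dropped. It assumes both CSV files have a header row containing a `kind` column (a hypothetical name), and `kind_gaps` is a hypothetical helper.

```shell
# kind_gaps BASELINE RESTORED: print "kind baseline_count restored_count"
# for every kind with fewer resources after the restore.
# Assumption: both files have a header row that includes a "kind" column.
kind_gaps() {
  awk -F',' '
    FNR == 1 {
      file++                                   # 1 = baseline, 2 = restored
      for (i = 1; i <= NF; i++) if ($i == "kind") k = i
      next
    }
    file == 1 { base[$k]++; next }
              { rest[$k]++ }
    END {
      for (kind in base)
        if (rest[kind] + 0 < base[kind])
          printf "%s %d %d\n", kind, base[kind], rest[kind]
    }
  ' "$1" "$2" | LC_ALL=C sort
}

# Usage: kind_gaps baseline.csv restored.csv
```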
Restore Use Case #3: Clean Rebuilds and Environment Migration
When restoring applications into:
- New clusters
- Different environments (dev to prod)
- Fresh OpenShift installations
it is critical to ensure:
- No stale resources remain
- Only required objects are restored
Resource inventories help users:
- Identify environment-specific objects
- Exclude or transform non-portable resources
- Confirm that restores align with target cluster policies
For instance, resources that need a domain transformation during recovery can be identified and handled as follows:
- Filter resources from the inventory that reference the domain in their specification
- Apply any required domain transformations before performing the restore
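The two steps above can be sketched as follows. This assumes the resource data has already been saved as one YAML manifest per resource in a flat directory, which is a hypothetical layout rather than something the plugin is known to produce. The function `rewrite_domain` is likewise hypothetical. It performs a plain substring replacement and writes transformed copies to a separate directory, leaving the originals untouched.

```shell
# rewrite_domain SRC_DIR OUT_DIR OLD_DOMAIN NEW_DOMAIN
# Prints each file that references OLD_DOMAIN and writes a rewritten copy to
# OUT_DIR. Assumptions: a flat directory of manifests, and NEW_DOMAIN contains
# no "/", "&" or backslash characters.
rewrite_domain() {
  src=$1 out=$2 old=$3 new=$4
  mkdir -p "$out" || return 1
  # Escape regex metacharacters in the old domain (in practice, the dots).
  old_re=$(printf '%s' "$old" | sed 's/[]\/$*.^[]/\\&/g')
  # Step 1: find manifests that mention the old domain (fixed-string match).
  grep -rlF -- "$old" "$src" | while IFS= read -r file; do
    # Step 2: write the transformed copy for the restore.
    sed "s/$old_re/$new/g" "$file" > "$out/$(basename "$file")"
    printf '%s\n' "$file"
  done
}

# Usage: rewrite_domain ./manifests ./manifests-prod apps.dev.example.com apps.prod.example.com
```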
Backup Validation Use Case: Detecting Orphaned Resources
Uninstalls and failed restores often leave behind:
- Unused RBAC objects
- Webhooks
- Finalizers that block deletion
Running get-resources before and after backup/restore operations helps:
- Detect resource leaks
- Validate cleanup logic
- Improve uninstall and rollback procedures
This improves cluster hygiene and long-term stability.
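A possible leak check, sketched under the assumption that both inventories are CSV files with header columns named `kind`, `name`, and `namespace` (hypothetical names), prints every row of the later inventory that has no counterpart in the earlier one. The function name `leaked_resources` is also hypothetical.

```shell
# leaked_resources BEFORE AFTER: print rows of AFTER whose
# kind/namespace/name key does not occur in BEFORE.
# Assumption: both files have a header row with kind, name, namespace columns.
leaked_resources() {
  awk -F',' '
    FNR == 1 {
      file++                                   # 1 = BEFORE, 2 = AFTER
      for (i = 1; i <= NF; i++) col[$i] = i    # re-read the header per file
      next
    }
    { key = $(col["kind"]) "/" $(col["namespace"]) "/" $(col["name"]) }
    file == 1 { seen[key] = 1; next }
    !(key in seen) { print }
  ' "$1" "$2"
}

# Usage: leaked_resources before_install.csv after_uninstall.csv
```

An empty result suggests the operation cleaned up after itself; any rows printed are candidates for manual review.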
Using Resource Inventories in Automation
Exported outputs (CSV/YAML/JSON) can be:
- Fed into backup scripts
- Used in GitOps workflows
- Analyzed offline for compliance or audit purposes
For platform users, this becomes a foundation layer for:
- Backup policy enforcement
- Restore validation pipelines
- Multi-cluster disaster recovery strategies
Conclusion
Kubernetes backup and restore failures are rarely the result of the backup tool itself; they usually stem from incomplete visibility into what actually exists in the cluster. By introducing a reliable resource discovery step using tools like the get-resources plugin, users gain a clear and comprehensive understanding of application dependencies and cluster state. This enables them to back up with confidence, verify restores accurately, conduct realistic disaster recovery testing, and eliminate blind spots in complex Kubernetes environments. In modern Kubernetes platforms, successful backup and restore is not just about protecting data; it is fundamentally about maintaining visibility and control.