Meet the Players: PVCs and PVs
Think of Kubernetes like a bustling city and your applications as the businesses operating within it. Persistent Volumes (PVs) are like storage warehouses scattered across the city, while Persistent Volume Claims (PVCs) are the rental agreements businesses use to secure space in those warehouses. Much like a business rents warehouse space without needing to build it themselves, applications running on Kubernetes can request storage without worrying about the underlying infrastructure. This setup allows applications to retain data even when pods restart or fail. However, just like warehouse issues can disrupt a business, problems with PVCs and PVs can disrupt workloads, making effective monitoring and troubleshooting critical.
Persistent Volumes (PVs): Storage resources provisioned at the cluster level, either statically or dynamically.
Persistent Volume Claims (PVCs): Requests for storage made by applications, specifying capacity, access modes, and storage class requirements.
The Stakes are High: Why Monitoring Matters
Building on that point about disrupted workloads, let's explore the most common reasons why monitoring PVCs and PVs should be a priority for any Kubernetes administrator, SRE, or application developer.
Optimizing Storage Performance and Scalability
The performance of the underlying storage can directly affect application responsiveness. Bottlenecks in storage backends, high latency, or insufficient IOPS can degrade the performance of applications relying on these PVCs.
It's useful to monitor storage performance metrics (e.g., latency, IOPS, throughput) to stay proactive.
It also allows for capacity planning, ensuring the storage backend can handle the workload demand effectively. As Kubernetes clusters grow, so do the demands on storage. Monitoring allows for:
- Tracking storage usage trends over time
- Planning for future capacity requirements
- Ensuring your storage architecture scales efficiently with the cluster
Without monitoring, scaling storage to meet demand becomes reactive, leading to potential outages or performance degradation.
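As a concrete example, the kubelet's volume stats make these trends easy to chart. A minimal PromQL sketch, assuming the metrics below are already scraped by Prometheus (the 80% threshold is just an illustration):

# Fraction of each PVC's capacity currently in use
kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes

# PVCs that are more than 80% full, as a starting point for capacity planning
(kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes) > 0.8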
Avoiding Resource Waste
Unmonitored PVCs and PVs can lead to resource inefficiencies in the cluster. Common scenarios include:
- Orphaned PVCs or PVs: When namespaces are deleted without properly cleaning up associated PVCs or PVs, these resources remain unused but continue to consume storage.
- Underutilized Storage: PVs that are partially filled or PVCs requesting more capacity than needed result in inefficient utilization.
By tracking PVC and PV utilization metrics, unused or underutilized resources can be identified and reclaimed, freeing up capacity.
Building on this, the reclaim policy of a PV (Retain, Recycle, or Delete) determines what happens to the data when a PVC is deleted. Without proper monitoring, PVs with a Retain policy may accumulate and create unnecessary overhead, especially in dynamic environments with high churn of pods or namespaces.
Monitoring reclaim policies and the state of released PVs allows you to clean up unused resources efficiently and avoid storage clutter.
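For example, here is a quick way to spot PVs that a Retain policy has left behind in the Released state, sketched with kubectl's built-in JSONPath filtering (adjust the output columns to taste):

# List Released PVs along with their reclaim policy
kubectl get pv -o jsonpath='{range .items[?(@.status.phase=="Released")]}{.metadata.name}{"\t"}{.spec.persistentVolumeReclaimPolicy}{"\n"}{end}'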
Ensuring Application Reliability
At the end of the day, keeping applications running and performant is everything.
Stateful applications, such as databases, rely on persistent storage for their functionality. If PVCs fail to bind to a PV, or PVs become inaccessible due to configuration issues or underlying storage failures, the application may experience downtime.
Monitoring PVC binding statuses and storage health metrics ensures applications continue to function seamlessly. Alerting on critical failures, such as the availability state of a PV, allows for quick intervention to keep applications available.
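As an illustration, a minimal Prometheus alerting rule for PVs that enter the Failed phase, assuming kube-state-metrics is being scraped (the group name, severity, and duration are placeholders):

groups:
  - name: pv-health
    rules:
      - alert: PersistentVolumeFailed
        expr: kube_persistentvolume_status_phase{phase="Failed"} == 1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "PV {{ $labels.persistentvolume }} is in the Failed phase"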
Metric Magic: The Top Indicators for PVCs and PVs
Kubernetes provides an expansive list of metrics to monitor every part of the cluster, and kube-state-metrics augments this set even further.
To effectively monitor PVCs and PVs, it's important to look at the following:
| Purpose/Use Case | Metric | Source |
| --- | --- | --- |
| PVC Binding Status: Track PVCs that are stuck in the Pending state | kube_persistentvolumeclaim_status_phase | kube-state-metrics |
| PV Capacity Usage: Monitor the used versus available storage in PVs | kubelet_volume_stats_used_bytes and kubelet_volume_stats_capacity_bytes | Kubernetes stable metrics |
| PVC Request vs. PV Provisioned Capacity: Identify mismatches between PVC requests and PV allocations | kube_persistentvolumeclaim_resource_requests_storage_bytes | kube-state-metrics |
| Storage Backend Performance: Monitor IOPS, throughput, and latency of the underlying storage system | Storage-specific metrics such as VolumeWriteOps (AWS EBS) | AWS EBS, GCP Persistent Disk metrics, etc. |
| Volume Mount Errors: Track errors when pods try to mount volumes | Pod events/logs by running kubectl describe pod <pod-name> | Pod events/logs |
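To put the first row of the table to work, here is a similar sketch of an alerting rule for PVCs stuck in Pending (again assuming kube-state-metrics; the 15-minute window is just an example):

groups:
  - name: pvc-binding
    rules:
      - alert: PersistentVolumeClaimPending
        expr: kube_persistentvolumeclaim_status_phase{phase="Pending"} == 1
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "PVC {{ $labels.namespace }}/{{ $labels.persistentvolumeclaim }} has been Pending for 15 minutes"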
War Stories: Common Troubleshooting Scenarios
Let's go through some of the most commonly seen troubleshooting scenarios when it comes to PVCs and PVs.
Scenario 1: PVC Stuck in Pending State
PVCs that are stuck in a Pending state indicate that there is no suitable PV that meets the claim’s requirements for them to bind to. This can happen for several reasons:
- The requested storage capacity exceeds available PVs
- Access modes or storage class requirements of the PVC do not match any existing PV
Example
Check the PVC’s details using kubectl describe pvc <pvc-name>:
Name: example-pvc
Namespace: default
StorageClass: standard
Status: Pending
Volume:
Capacity: 10Gi
Access Modes: ReadWriteOnce
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning ProvisioningFailed 2s (x5 over 10s) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
What to Look For:
- Status: Pending indicates the PVC is not yet bound.
- The event with no persistent volumes available for this claim highlights the lack of a matching PV.
Solution:
Review the available PVs with kubectl get pv and inspect candidates with kubectl describe pv <pv-name> to ensure they meet the PVC’s requirements:
Name: example-pv
Capacity: 5Gi
Access Modes: ReadWriteMany
ReclaimPolicy: Retain
Status: Available
StorageClass: fast-storage
Observations:
- The PVC requests 10Gi, but the PV has only 5Gi.
- The PVC requires ReadWriteOnce, but the PV supports only ReadWriteMany.
If no matching PV exists, create a new PV or adjust the PVC specifications (storage size or access mode).
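For instance, a minimal sketch of a PV that would satisfy the claim above; the CSI driver name and volume handle are placeholders, so substitute whatever your storage backend actually uses:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv-10g
spec:
  capacity:
    storage: 10Gi              # matches the PVC's 10Gi request
  accessModes:
    - ReadWriteOnce            # matches the PVC's access mode
  storageClassName: standard   # matches the PVC's storage class
  persistentVolumeReclaimPolicy: Delete
  csi:
    driver: example.csi.vendor.com    # placeholder CSI driver
    volumeHandle: example-volume-id   # placeholder volume ID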
Scenario 2: Read-Only Volume Mount
This happens when a pod successfully mounts a volume but encounters a "Read-only" file system error during write operations.
Oftentimes, this happens because:
- There's a mismatch in the access mode between the PVC and PV, where the PVC requires a different access mode than what the PV provides
- There's a limitation in the underlying storage backend
Example
Check the PVC’s details using kubectl describe pvc <pvc-name>:
Name: example-pvc
Namespace: default
StorageClass: standard
Status: Bound
Volume: example-pv
Capacity: 10Gi
Access Modes: ReadOnlyMany
What to Look For: the PVC is bound to a PV with the ReadOnlyMany access mode, which doesn't allow write operations.
For the sake of this example, let's also view a sample pod YAML extract:
volumes:
  - name: data-volume
    persistentVolumeClaim:
      claimName: example-pvc
volumeMounts:
  - mountPath: /data
    name: data-volume
Solution
Ensure the PV supports ReadWriteOnce or ReadWriteMany and update it if necessary. Run kubectl describe pv <pv-name> to confirm the current access modes, and adjust the PV manifest if needed:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  persistentVolumeReclaimPolicy: Retain
Finally, if the above doesn't work, confirm that the storage backend supports the requested access mode.
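Keep in mind that a PVC's access modes can't be changed in place, so fixing the claim side usually means deleting and recreating it. A minimal sketch of the corrected PVC, reusing the names from the example above:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce   # writable from a single node, instead of ReadOnlyMany
  storageClassName: standard
  resources:
    requests:
      storage: 10Gi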
Scenario 3: PV in Released State
This occurs when a PV remains in the Released state even after the associated PVC has been deleted, leaving the PV unusable.
This usually occurs because the PV’s reclaim policy is set to Retain, meaning the data persists after the PVC is deleted, but the PV isn’t automatically cleaned up or reused.
Example
Check the PV’s details using kubectl describe pv <pv-name>:
Name: example-pv
Capacity: 10Gi
Access Modes: ReadWriteOnce
Reclaim Policy: Retain
Status: Released
Claim: default/example-pvc
What to look for:
- Reclaim Policy: The PV is set to Retain, meaning the data persists and the PV isn't reused automatically.
- Status: Released indicates the PV isn’t available for new PVCs.
Solution
Manually delete the PV if it's no longer required by running: kubectl delete pv example-pv
For future PVs, set the reclaim policy to Delete (Recycle also exists but is deprecated):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  persistentVolumeReclaimPolicy: Delete
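If you would rather keep an existing PV than recreate it, the reclaim policy can be changed in place, and a Released PV can usually be made claimable again by clearing its stale claimRef. A sketch using kubectl patch, where example-pv is the PV from above (note that clearing claimRef exposes whatever data is still on the volume to the next claim):

# Switch the reclaim policy of an existing PV to Delete
kubectl patch pv example-pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'

# Remove the stale claim reference so the Released PV becomes Available again
kubectl patch pv example-pv -p '{"spec":{"claimRef":null}}'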
Scenario 4: Data Not Persisting
This occurs when an application appears to be functioning, but data is not being retained across pod restarts or deployments.
This can occur because of:
- Misconfigured volume mounts in the pod specification
- The PVC is not properly bound to the intended PV
Example
Let's take an example pod YAML extract:
spec:
  containers:
    - name: app-container
      volumeMounts:
        - mountPath: /app/data
          name: ephemeral-volume
  volumes:
    - name: ephemeral-volume
      emptyDir: {}
What to Look For: the emptyDir volume type means the volume is ephemeral and does not retain data beyond the pod’s lifecycle.
Solution
Update the pod to use a PVC-backed volume:
volumes:
  - name: persistent-data-volume
    persistentVolumeClaim:
      claimName: example-pvc
volumeMounts:
  - mountPath: /app/data
    name: persistent-data-volume
And ensure the PVC is bound by running kubectl describe pvc example-pvc:
Name: example-pvc
Namespace: default
Status: Bound
Volume: example-pv
Capacity: 10Gi
Access Modes: ReadWriteOnce
If the above doesn't work, confirm the PVC is bound to the correct PV by checking the Volume field in the output of kubectl describe pvc <pvc-name>.
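As a final sanity check, you can verify persistence end to end. A rough sketch, where the pod name app-pod and the manifest file pod.yaml are placeholders for your own:

# Write a marker file into the mounted volume
kubectl exec app-pod -- sh -c 'echo persisted > /app/data/canary.txt'

# Recreate the pod
kubectl delete pod app-pod
kubectl apply -f pod.yaml

# With a PVC-backed volume, the marker file survives the restart
kubectl exec app-pod -- cat /app/data/canary.txt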
The Final Takeaway
Monitoring PVCs and PVs in Kubernetes isn’t just about keeping an eye on storage—it’s about ensuring your applications run smoothly, efficiently, and without unnecessary surprises. So, the next time your PVCs or PVs act up, you’ll know exactly where to look and what to do. Observability isn’t just a luxury in Kubernetes — it’s the secret sauce for success.
What’s the trickiest storage issue you’ve faced in Kubernetes and how did you solve it? Let's discuss in the comments! 🚀