
Demystifying PVCs and PVs: A Guide to Kubernetes Storage Monitoring

By Jayanth Putta posted Thu November 21, 2024 10:11 PM

  

Meet the Players: PVCs and PVs

Think of Kubernetes like a bustling city and your applications as the businesses operating within it. Persistent Volumes (PVs) are like storage warehouses scattered across the city, while Persistent Volume Claims (PVCs) are the rental agreements businesses use to secure space in those warehouses. Much like a business rents warehouse space without needing to build it themselves, applications running on Kubernetes can request storage without worrying about the underlying infrastructure. This setup allows applications to retain data even when pods restart or fail. However, just like warehouse issues can disrupt a business, problems with PVCs and PVs can disrupt workloads, making effective monitoring and troubleshooting critical.

Persistent Volumes (PVs): Storage resources provisioned at the cluster level, either statically or dynamically. 

Persistent Volume Claims (PVCs): Requests for storage made by applications, specifying capacity, access modes, and storage class requirements.
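To make these definitions concrete, here's a minimal sketch of a PV and a matching PVC. The names, the hostPath backend, and the "standard" storage class are illustrative assumptions; real clusters typically provision PVs dynamically through a CSI driver.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  hostPath:
    path: /mnt/data/example      # illustrative backend only
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: standard

Once both objects exist, the control plane binds the claim to a PV that satisfies its capacity, access mode, and storage class, and pods then reference the claim by name.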

The Stakes are High: Why Monitoring Matters

Building on the point about disrupted workloads, let's explore the common failure modes and why monitoring PVCs and PVs should be a priority for any Kubernetes administrator, SRE, or application developer.

Optimizing Storage Performance and Scalability

The performance of the underlying storage can directly affect application responsiveness. Bottlenecks in storage backends, high latency, or insufficient IOPS can degrade the performance of applications relying on these PVCs. 

Monitoring storage performance metrics (e.g., latency, IOPS, throughput) lets you catch these issues before they reach users.

Monitoring also enables capacity planning, ensuring the storage backend can handle workload demand. As Kubernetes clusters grow, so do the demands on storage. Monitoring allows for:

  • Tracking storage usage trends over time
  • Planning for future capacity requirements
  • Ensuring your storage architecture scales efficiently with the cluster

Without monitoring, scaling storage to meet demand becomes reactive, leading to potential outages or performance degradation.
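As a rough sketch, assuming the kubelet volume stats are already scraped by Prometheus (the same metrics listed in the table below), a single query is enough to chart per-PVC usage over time:

# Percentage of each claim's capacity currently in use, per namespace and PVC
100 * kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes

Graphing this expression makes growth trends, and claims approaching their limits, easy to spot.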

Avoiding Resource Waste

Unmonitored PVCs and PVs can lead to resource inefficiencies in the cluster. Common scenarios include:

  • Orphaned PVCs or PVs: When namespaces are deleted without properly cleaning up associated PVCs or PVs, these resources remain unused but continue to consume storage.
  • Underutilized Storage: PVs that are partially filled or PVCs requesting more capacity than needed result in inefficient utilization.

By tracking PVC and PV utilization metrics, unused or underutilized resources can be identified and reclaimed, freeing up capacity.

Building on this, the reclaim policy of a PV (Retain, Recycle, or Delete) determines what happens to the underlying volume when its PVC is deleted. Without proper monitoring, PVs with a Retain policy may accumulate and create unnecessary overhead, especially in dynamic environments with a high churn of pods or namespaces.

Monitoring reclaim policies and the state of released PVs allows you to clean up unused resources efficiently and avoid storage clutter.
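As a quick illustration (the second variant assumes jq is installed), released PVs can be listed straight from the CLI:

# Spot PVs stuck in the Released phase
kubectl get pv

# Or print only the names of Released PVs
kubectl get pv -o json | jq -r '.items[] | select(.status.phase == "Released") | .metadata.name'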

Ensuring Application Reliability

At the end of the day, keeping applications running and performant is everything.

Stateful applications, such as databases, rely on persistent storage for their functionality. If a PVC fails to bind to a PV, or a PV becomes inaccessible due to configuration issues or underlying storage failures, the application may experience downtime.

Monitoring PVC binding statuses and storage health metrics ensures applications continue to function seamlessly. Alerting on critical failures, such as the availability state of a PV, allows for quick intervention to keep applications available.

Metric Magic: The Top Indicators for PVCs and PVs

Kubernetes exposes an expansive set of metrics covering every part of the cluster, and kube-state-metrics augments it with metrics about the state of cluster objects, including PVCs and PVs.

To effectively monitor PVCs and PVs, it's important to look at the following:

Each entry below lists the purpose/use case, the relevant metric, and its source:

  • PVC Binding Status: track PVCs that are stuck in the Pending state. Metric: kube_persistentvolumeclaim_status_phase (source: kube-state-metrics)
  • PV Capacity Usage: monitor the used versus available storage in PVs. Metrics: kubelet_volume_stats_used_bytes and kubelet_volume_stats_capacity_bytes (source: Kubernetes stable kubelet metrics)
  • PVC Request vs. PV Provisioned Capacity: identify mismatches between PVC requests and PV allocations. Metric: kube_persistentvolumeclaim_resource_requests_storage_bytes (source: kube-state-metrics)
  • Storage Backend Performance: monitor IOPS, throughput, and latency of the underlying storage system. Metrics: storage-specific metrics such as VolumeWriteOps (source: AWS EBS, GCP Persistent Disk, and similar backends)
  • Volume Mount Errors: track errors when pods try to mount volumes. Source: pod events and logs, e.g. kubectl describe pod <pod-name>
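As an illustration, a Prometheus alerting rule built on the binding-status metric above might look like the following. The rule name, 15-minute threshold, and labels are assumptions, not a prescribed configuration:

groups:
  - name: pvc-alerts
    rules:
      - alert: PVCStuckPending
        # Fires when a claim has reported the Pending phase for 15 minutes
        expr: kube_persistentvolumeclaim_status_phase{phase="Pending"} == 1
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "PVC {{ $labels.namespace }}/{{ $labels.persistentvolumeclaim }} is stuck in Pending"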

War Stories: Common Troubleshooting Scenarios

Let's go through some of the most commonly seen troubleshooting scenarios when it comes to PVCs and PVs.

Scenario 1: PVC Stuck in Pending State

PVCs that are stuck in a Pending state indicate that no available PV meets the claim's requirements, so there is nothing for the claim to bind to. This can happen for several reasons:

  • The requested storage capacity exceeds available PVs
  • Access modes or storage class requirements of the PVC do not match any existing PV
Example

Check the PVC’s details using kubectl describe pvc <pvc-name>:

Name:          example-pvc
Namespace:     default
StorageClass:  standard
Status:        Pending
Volume:
Capacity:      10Gi
Access Modes:  ReadWriteOnce
Events:
  Type     Reason              Age                  From                         Message
  ----     ------              ----                 ----                         -------
  Warning  ProvisioningFailed  2s (x5 over 10s)     persistentvolume-controller  no persistent volumes available for this claim and no storage class is set

What to Look For:

  • Status: Pending indicates the PVC is not yet bound.
  • The event with no persistent volumes available for this claim highlights the lack of a matching PV.
Solution: 

Review the available PVs with kubectl get pv, then inspect a candidate with kubectl describe pv <pv-name> and check whether it meets the PVC's requirements:

Name:          example-pv
Capacity:      5Gi
Access Modes:  ReadWriteMany
ReclaimPolicy: Retain
Status:        Available
StorageClass:  fast-storage

Observations:

  • The PVC requests 10Gi, but the PV has only 5Gi.
  • The PVC requires ReadWriteOnce, but the PV supports only ReadWriteMany.

If no matching PV exists, create a new PV or adjust the PVC specifications (storage size or access mode).
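Before creating a new PV, it can help to put the claim's request and the existing PVs side by side. A quick sketch using kubectl, with names taken from the example above and purely illustrative output columns:

# What the claim is asking for
kubectl get pvc example-pvc -o jsonpath='{.spec.resources.requests.storage} {.spec.accessModes}{"\n"}'

# What the cluster currently offers
kubectl get pv -o custom-columns=NAME:.metadata.name,CAPACITY:.spec.capacity.storage,ACCESS:.spec.accessModes,CLASS:.spec.storageClassName,STATUS:.status.phase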

Scenario 2: Read-Only Volume Mount

This happens when a pod successfully mounts a volume but encounters a "Read-only" file system error during write operations.

Oftentimes, this happens because:

  • The access modes of the PVC and PV don't match what the application needs (for example, the claim is bound as ReadOnlyMany while the workload needs to write)
  • There's a limitation in the underlying storage backend
Example

Check the PVC’s details using kubectl describe pvc <pvc-name>:

Name:          example-pvc
Namespace:     default
StorageClass:  standard
Status:        Bound
Volume:        example-pv
Capacity:      10Gi
Access Modes:  ReadOnlyMany

What to Look For: the PVC is bound to a PV with the ReadOnlyMany access mode, which doesn't allow write operations.

For the sake of this example, let's also view a sample pod YAML extract:

volumes:
  - name: data-volume
    persistentVolumeClaim:
      claimName: example-pvc
volumeMounts:
  - mountPath: /data
    name: data-volume

Solution

Ensure the PV supports ReadWriteOnce or ReadWriteMany and update it if necessary. Inspect the PV manifest with kubectl get pv <pv-name> -o yaml:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  persistentVolumeReclaimPolicy: Retain

Finally, if the above doesn't work, confirm that the storage backend supports the requested access mode.
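If it's unclear whether the claim or the mount itself is the culprit, a quick check from inside the running pod shows how the filesystem was actually mounted (the pod name and the /data path are placeholders from the example):

# An "ro" flag in the output indicates a read-only mount
kubectl exec <pod-name> -- sh -c 'mount | grep /data'

Keep in mind that a PVC's access modes cannot be changed after creation, so moving away from ReadOnlyMany usually means deleting and recreating the claim with the desired mode.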

Scenario 3: PV in Released State

This occurs when a PV remains in the Released state even after the associated PVC has been deleted, leaving the PV unusable.

This usually occurs because the PV’s reclaim policy is set to Retain, meaning the data persists after the PVC is deleted, but the PV isn’t automatically cleaned up or reused.

Example

Check the PV’s details using kubectl describe pv <pv-name>:

Name:            example-pv
Capacity:        10Gi
Access Modes:    ReadWriteOnce
Reclaim Policy:  Retain
Status:          Released
Claim:           default/example-pvc

What to look for: 

  • Reclaim Policy: The PV is set to Retain, meaning the data persists, and the PV isn't reused automatically.
  • Status: Released indicates the PV isn’t available for new PVCs.

Solution

Manually delete the PV if it's not required anymore by running: kubectl delete pv example-pv
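If the data on the volume should be reused rather than discarded, one common approach (sketched here with the example PV name) is to clear the stale claimRef so the PV returns to the Available phase:

# Remove the reference to the deleted PVC; the PV becomes Available for new claims
kubectl patch pv example-pv --type json -p '[{"op": "remove", "path": "/spec/claimRef"}]'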

For future PVs, set the reclaim policy to Delete (the Recycle policy is deprecated and only supported by a few legacy volume types):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  persistentVolumeReclaimPolicy: Delete
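For PVs that already exist, the reclaim policy can also be changed in place; a quick sketch using the example name:

# Switch an existing PV's reclaim policy from Retain to Delete
kubectl patch pv example-pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'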

Scenario 4: Data Not Persisting

This occurs when an application appears to be functioning, but data is not being retained across pod restarts or deployments.

This can occur because of:

  • Misconfigured volume mounts in the pod specification
  • The PVC is not properly bound to the intended PV
Example

Let's take an example pod YAML extract:

spec:
  containers:
    - name: app-container
      volumeMounts:
        - mountPath: /app/data
          name: ephemeral-volume
  volumes:
    - name: ephemeral-volume
      emptyDir: {}

What to Look For: the emptyDir volume type is ephemeral, so data does not survive beyond the pod's lifecycle.

Solution

Update the pod to use a PVC-backed volume:

volumes:
  - name: persistent-data-volume
    persistentVolumeClaim:
      claimName: example-pvc
volumeMounts:
  - mountPath: /app/data
    name: persistent-data-volume

And ensure the PVC is bound by running: kubectl describe pvc example-pvc:

Name:          example-pvc
Namespace:     default
Status:        Bound
Volume:        example-pv
Capacity:      10Gi
Access Modes:  ReadWriteOnce

If data still isn't persisting, re-run kubectl describe pvc <pvc-name> and confirm the Volume field points to the intended PV.
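To verify the fix end to end, a simple illustrative probe is to write a marker file, let the pod restart, and check that the file survives. The pod names are placeholders, and the check assumes a controller such as a Deployment recreates the pod:

# Write a marker file into the persistent mount
kubectl exec <pod-name> -- sh -c 'echo persisted > /app/data/marker.txt'

# Restart the pod and, once the replacement is running, confirm the file is still there
kubectl delete pod <pod-name>
kubectl exec <new-pod-name> -- cat /app/data/marker.txt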

The Final Takeaway

Monitoring PVCs and PVs in Kubernetes isn’t just about keeping an eye on storage—it’s about ensuring your applications run smoothly, efficiently, and without unnecessary surprises. So, the next time your PVCs or PVs act up, you’ll know exactly where to look and what to do. Observability isn’t just a luxury in Kubernetes — it’s the secret sauce for success.

What’s the trickiest storage issue you’ve faced in Kubernetes and how did you solve it? Let's discuss in the comments! 🚀
