Why Pod Size Matters
When running workloads in Kubernetes, defining Pod size (CPU and memory requests/limits) isn't just a good practice; it's essential for performance, stability, and cost control.
Getting pod sizing wrong can be just as bad as not setting it at all, if not worse:
- Over-Provisioning Wastes Money. If you request far more CPU or memory than your pod actually needs, Kubernetes reserves that capacity for you, meaning other workloads can't use it. In cloud environments, that idle reservation still costs you money.
- Under-Provisioning Hurts Performance. If your pod is starved for CPU or memory, it will suffer slow response times, frequent throttling, or even get OOMKilled (terminated because it ran out of memory).
- Cluster Scheduling Gets Skewed. Oversized pods make the scheduler think the cluster is full, while undersized pods can cause resource contention when they use more than they asked for. Both scenarios lead to inefficient node utilization.
- Operational Headaches Multiply. Incorrect sizing leads to unpredictable Horizontal Pod Autoscaler (HPA) scaling behavior, noisier alerts for SRE teams, and harder capacity planning.
Right-sizing ensures workloads run reliably, nodes are used efficiently, and costs stay under control — without starving other apps or overloading the cluster.
In this blog post, we'll discuss a method to determine the right size of a Pod using Kyverno.
Introducing Kyverno Policy Engine
Kyverno is a Kubernetes-native policy engine that helps you manage and secure workloads through declarative policies, written as YAML — the same way you define deployments, services, and other cluster resources.
At its core, Kyverno watches for changes in Kubernetes resources (like Pods, Deployments, and ConfigMaps) and applies your policies to:
- Validate: block or warn if something doesn't match your rules.
- Mutate: add or change fields automatically to enforce standards.
- Generate: create new resources or configurations as part of an admission workflow.
Kyverno supports three main kinds of policies:
- Validation Policies
  - Purpose: Enforce rules by blocking (or warning about) non-compliant resources.
  - Example: Reject Pods without CPU/memory requests or limits.
  - How it works: Uses validate rules with conditions and patterns.
  - Use case: Stop over-provisioned or under-provisioned workloads before they enter the cluster.
- Mutation Policies
  - Purpose: Automatically modify incoming resources to meet your standards.
  - Example: Add default CPU/memory requests to Pods that don't specify them (a minimal sketch follows this list).
  - How it works: Uses mutate rules with patchStrategicMerge or patchesJson6902 to inject or update fields.
  - Use case: Ensure workloads always have reasonable starting values without requiring developers to set them.
- Generation Policies
  - Purpose: Automatically create additional resources when a triggering resource appears.
  - Example: Generate a Vertical Pod Autoscaler (VPA) object whenever a new Deployment is created.
  - How it works: Uses generate rules that watch for resource creation and spawn new objects.
  - Use case: Automatically attach scaling helpers or config maps to workloads for better optimization.
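We will not use mutation later in this post, but for completeness, here is a minimal sketch of one, modeled on the community add-default-resources sample; the policy name and the 100m/128Mi defaults are illustrative assumptions:
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-requests   # hypothetical name
spec:
  rules:
    - name: add-default-requests
      match:
        any:
        - resources:
            kinds:
            - Pod
      mutate:
        patchStrategicMerge:
          spec:
            containers:
            # (name): "*" matches every container;
            # +(field) adds the field only when it is not already set
            - (name): "*"
              resources:
                requests:
                  +(cpu): "100m"
                  +(memory): "128Mi"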
Kyverno Policies for Pod Right-Sizing
For resource optimization, we will mainly focus on validation and generation policies. The solution leverages the following two Kyverno policies:
- Generate VPA: Auto-create VerticalPodAutoscalers. Kyverno's generation policies allow you to automatically create Kubernetes objects when related resources appear, which is perfect for adding Vertical Pod Autoscalers (VPAs) to your workloads without manual steps. A widely shared example is a ClusterPolicy that targets Deployments (excluding system namespaces) and generates a corresponding VPA resource per workload. This enables VPA to begin monitoring and recommending optimal resource sizes for each pod.
- Check Resources: Ensure Requests/Limits Evaluate to Sane Values. Equally important is the validation policy, which Kyverno uses to enforce resource configurations. These policies run during admission or as background scans, comparing specs against expectations and rejecting or flagging misconfigured pods. The Check Resources policy:
  - Checks actual resource settings
  - Compares them to VPA recommendations
  - Flags pods that are over- or under-provisioned, helping teams take corrective action.
Kyverno Policy Reports
When you set Kyverno to generate VerticalPodAutoscaler (VPA) objects in recommendation-only mode, the VPA’s Recommender component tracks usage and suggests resource allocations—without applying them. These recommendations, combined with Kyverno’s policy engine, feed into a structured feedback loop.
- VPA Recommender in Recommendation-Only Mode. Instead of making automatic changes, the VPA runs with updateMode "Off", which lets it observe pod behavior over time and emit recommendations for CPU and memory requests without disrupting workloads. Kyverno's Generate VPA policy automates the creation of VPAs for Deployments, StatefulSets, and other pod controllers, ensuring consistent coverage.
- Validation Against VPA Recommendations. A separate Check Resources (validation) policy compares actual pod resource settings with the VPA's upper and lower bounds. This policy allows a 20% margin to reduce noise, only flagging pods that are meaningfully over- or under-provisioned (a worked example follows this list).
- Kyverno Policy Reports Capture VPA Insights. These checks generate structured PolicyReports (or ClusterPolicyReports)—Kyverno’s CRDs that log pass/fail results for each rule evaluation. Reports store real-time compliance data and are accessible via native Kubernetes CLI tools or dashboards.
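To make the 20% margin concrete, using numbers that appear in the reports later in this post: a container requesting 2 CPU against a VPA upperBound of 66m fails the over-provisioning check, since 0.80 × 2000m = 1600m is still greater than 66m; a 5Gi memory request against a lowerBound of roughly 9.4Gi fails the under-provisioning check, since 1.20 × 5Gi = 6Gi is below that bound.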
Now that we understand the different components of Kyverno and how to leverage them for resource optimization, let us look at the solution architecture for Pod rightsizing.
Context Diagram
The reference architecture is shown below. The solution consists of the following components:
- VPA Recommender. Observes a Pod's traffic and resource utilization patterns and provides recommendations for optimal resource allocation.
- Metric Server. VPA uses the metrics server to observe resource utilization and produce recommendations.
- VPA Configuration. One is required for each resource under observation so that VPA recommendations can be tied to that resource.
- Policy Report. Generates a report when a resource fails the policies; based on the report, you can take action to increase or reduce the resources assigned to the Pod.
The solution requires two policies:
- Generate VPA. Whenever a workload (e.g., a Deployment or DaemonSet) is created in Kubernetes, an event is generated and Kyverno responds by generating a VPA configuration for the workload.
- Check Resources. This policy continuously checks a workload's resource requests against its VPA recommendations. Based on that, it generates reports that show whether the resource is over-provisioned or under-provisioned, along with recommendations to rightsize the workload by increasing or decreasing the resources allocated to it.
Implementation
Let us now see a working example of how to stitch the solution together using the above components.
Prerequisites
- Kubernetes Cluster. We'll use the IBM Cloud IKS service to provision a K8s cluster. Log in to the cluster using the ibmcloud CLI:
ibmcloud login --sso
ibmcloud ks clusters
ibmcloud target -g <resource group>
ibmcloud ks cluster config --cluster <cluster>
- Metric Server. The metrics server is required; it comes installed by default with IBM IKS. You can check whether it's installed using:
kubectl get deployment metrics-server -n kube-system
If installed and running, you should see something like:
NAME             READY   UP-TO-DATE   AVAILABLE   AGE
metrics-server   2/2     2            2           5y37d
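If it isn't present (for example, on a non-IKS cluster), one option is the upstream community manifest:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml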
- VPA. The Kubernetes VerticalPodAutoscaler has several components:
  - Recommender
  - Updater
  - Admission Plugin
For this solution, we only need the VPA Recommender. You can install it by executing:
kubectl apply -f https://raw.githubusercontent.com/nirmata/demo-resource-optimizer/main/config/vpa/install-vpa-recommender.yaml
- Kyverno. Execute the following commands to install Kyverno:
helm repo add kyverno https://kyverno.github.io/kyverno/
helm repo update
helm install kyverno kyverno/kyverno -n kyverno --create-namespace
- Configure Kyverno RBAC permissions. To generate a VerticalPodAutoscaler, Kyverno needs additional permissions for VPA resources. Execute the following commands to configure them:
kubectl apply -f https://raw.githubusercontent.com/nirmata/demo-resource-optimizer/main/config/kyverno/kyverno-vpa-rbac.yaml
kubectl apply -f https://raw.githubusercontent.com/nirmata/demo-resource-optimizer/main/config/kyverno/kyverno-vpa-rolebinding.yaml
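For reference, the grant defined in those files typically has the shape sketched below: a ClusterRole covering VPA resources that aggregates into Kyverno's background controller. This is an illustrative sketch, not the exact contents of the linked files, and the aggregation labels may vary by Kyverno version:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kyverno:create-vpas   # hypothetical name
  labels:
    # labels matched by the aggregationRule of Kyverno's background controller role
    app.kubernetes.io/component: background-controller
    app.kubernetes.io/instance: kyverno
    app.kubernetes.io/part-of: kyverno
rules:
- apiGroups: ["autoscaling.k8s.io"]
  resources: ["verticalpodautoscalers"]
  verbs: ["create", "update", "delete", "get", "list", "watch"]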
Install Kyverno Policies
We want to generate VPA configurations for existing Deployments in a specific namespace. To install the Generate VPA policy, apply the following YAML.
---
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: generate-vpa
  annotations:
    policies.kyverno.io/title: Generate VPA
    policies.kyverno.io/category: Resource Optimization
    policies.kyverno.io/severity: medium
    policies.kyverno.io/description: >-
      This policy generates Vertical Pod Autoscalers (VPAs)
      for Deployment and StatefulSet resources.
spec:
  # generate VPAs for resources that already exist, not just newly created ones
  generateExisting: true
  rules:
    - name: create-for-podcontrollers
      match:
        any:
        - resources:
            kinds:
            - Deployment
            names:
            - time-series-query
            - time-series-writer
            namespaces:
            - si-dev-001b
      generate:
        synchronize: true
        kind: VerticalPodAutoscaler
        apiVersion: autoscaling.k8s.io/v1
        name: "{{request.object.metadata.name}}-kyverno"
        namespace: "{{request.object.metadata.namespace}}"
        data:
          spec:
            targetRef:
              apiVersion: "{{request.object.apiVersion}}"
              kind: "{{request.object.kind}}"
              name: "{{request.object.metadata.name}}"
            updatePolicy:
              updateMode: "Off"
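Assuming the YAML above is saved as generate-vpa.yaml (the filename is arbitrary), applying and verifying it looks like this; the exact columns of the get output vary by Kyverno version:
kubectl apply -f generate-vpa.yaml
kubectl get clusterpolicy generate-vpa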
The second policy is Check Resources. Install it by applying the following YAML. We are targeting the same workloads as in the Generate VPA policy.
---
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: check-resources
  annotations:
    policies.kyverno.io/title: Check Resources
    policies.kyverno.io/category: Resource Optimization
    policies.kyverno.io/severity: medium
    policies.kyverno.io/description: >-
      This policy checks if the resources requested by a pod controller (Deployment or StatefulSet)
      are within the bounds recommended by the Vertical Pod Autoscaler (VPA).
spec:
  validationFailureAction: Audit
  background: true
  admission: false
  rules:
    - name: memory
      match: &matchDef
        any:
        - resources:
            kinds:
            - Deployment
            names:
            - time-series-query
            - time-series-writer
            namespaces:
            - si-dev-001b
      exclude:
        any:
        - resources:
            selector:
              matchLabels:
                nirmata.io/resource-optimizer: "false"
            namespaces:
            - kube-system
      context: &ruleContextDef
        - name: vpa
          # currently this will return errors if the VPA is not available,
          # so we fetch all VPAs and filter by the expected name
          # see: https://github.com/kyverno/kyverno/issues/9723
          # Note that this logic assumes that the VPA has the same name as the
          # pod controller (Deployment or StatefulSet) with the suffix "-kyverno".
          apiCall:
            urlPath: "/apis/autoscaling.k8s.io/v1/namespaces/{{request.namespace}}/verticalpodautoscalers/"
            jmesPath: "items[?metadata.name == '{{request.object.metadata.name}}-kyverno'] | @[0] || `{}`"
      preconditions: &pre
        all:
        - key: "{{ vpa.status || `{}` }}"
          operator: NotEquals
          value: {}
        - key: "{{ time_since('', '{{vpa.metadata.creationTimestamp}}', '') }}"
          operator: GreaterThan
          value: 0m # Set to >24h for production
      validate:
        foreach:
        - list: vpa.status.recommendation.containerRecommendations
          context:
          - name: ctnr
            variable: &ctnrVariable
              value: "{{ request.object.spec.template.spec.containers[?name == '{{element.containerName}}'] | @[0] }}"
          - name: memRequest
            variable:
              value: "{{ ctnr.resources.requests.memory || `0` }}"
          deny:
            conditions:
              any:
              - key: "{{ multiply(memRequest, `0.80`) }}"
                operator: GreaterThan
                value: "{{ element.upperBound.memory }}"
                message: "overprovisioned resources: reduce memory.request from {{memRequest}} to {{ divide(element.target.memory, `1048576`) }}Mi"
              - key: "{{ multiply(memRequest, `1.20`) }}"
                operator: LessThan
                value: "{{ element.lowerBound.memory }}"
                message: "underprovisioned resources: increase memory.request from {{memRequest}} to {{ divide(element.target.memory, `1048576`) }}Mi"
    # Using multiple rules to report each violation separately.
    # Otherwise, processing stops at the first violation.
    # See: https://github.com/kyverno/kyverno/issues/8792
    - name: cpu
      match: *matchDef
      context: *ruleContextDef
      preconditions: *pre
      validate:
        foreach:
        - list: vpa.status.recommendation.containerRecommendations
          context:
          - name: ctnr
            variable: *ctnrVariable
          - name: cpuRequest
            variable:
              value: "{{ ctnr.resources.requests.cpu || `0` }}"
          deny:
            conditions:
              any:
              - key: "{{ multiply(cpuRequest, `0.80`) }}"
                operator: GreaterThan
                value: "{{ element.upperBound.cpu }}"
                message: "overprovisioned resources: reduce cpu.request from {{cpuRequest}} to {{element.target.cpu}}"
              - key: "{{ multiply(cpuRequest, `1.20`) }}"
                operator: LessThan
                value: "{{ element.lowerBound.cpu }}"
                message: "underprovisioned resources: increase cpu.request from {{cpuRequest}} to over {{element.target.cpu}}"
Let's verify that all components are in place for us to start generating resource recommendations for our workloads.
% kubectl get deployment metrics-server -n kube-system
NAME             READY   UP-TO-DATE   AVAILABLE   AGE
metrics-server   2/2     2            2           5y37d
% kubectl get po -n vpa
NAME                              READY   STATUS    RESTARTS   AGE
vpa-recommender-6664f7678-b5xbm   1/1     Running   0          40d
% kubectl get po -n kyverno
NAME                                             READY   STATUS    RESTARTS      AGE
kyverno-admission-controller-787c6f5c89-x5gx6    1/1     Running   5 (9d ago)    33d
kyverno-background-controller-686df748c7-nt8d6   1/1     Running   6 (8d ago)    33d
kyverno-cleanup-controller-54cf7bddd6-lrxjb      1/1     Running   7 (8d ago)    33d
kyverno-reports-controller-69f56cf7bb-96mn4      1/1     Running   5 (9d ago)    33d
Let's check our targeted workloads.
% kubectl get po -n si-dev-001b --selector='app in (time-series-writer,time-series-query)'
NAME                                  READY   STATUS    RESTARTS   AGE
time-series-query-55c7c48494-kncq9    1/1     Running   0          10h
time-series-writer-848759f458-bqrsr   1/1     Running   0          12h
VPA configurations are generated automatically by the Generate VPA policy that we installed earlier. VPA observes Pod metrics via the metrics server and provides recommendations.
% kubectl describe vpa -n si-dev-001b
Name:         time-series-query-kyverno
Namespace:    si-dev-001b
Labels:       app.kubernetes.io/managed-by=kyverno
              generate.kyverno.io/policy-name=generate-vpa
              generate.kyverno.io/policy-namespace=
              generate.kyverno.io/rule-name=create-for-podcontrollers
              generate.kyverno.io/trigger-group=apps
              generate.kyverno.io/trigger-kind=Deployment
              generate.kyverno.io/trigger-namespace=si-dev-001b
              generate.kyverno.io/trigger-uid=3a9e4a43-57df-4fdd-b4d2-53725f190bd8
              generate.kyverno.io/trigger-version=v1
Annotations:  <none>
API Version:  autoscaling.k8s.io/v1
Kind:         VerticalPodAutoscaler
Metadata:
  Creation Timestamp:  2025-07-10T06:20:55Z
  Generation:          1
  Resource Version:    1116201938
  UID:                 1e473ff8-4f70-4e29-854d-ec6d80731511
Spec:
  Target Ref:
    API Version:  apps/v1
    Kind:         Deployment
    Name:         time-series-query
  Update Policy:
    Update Mode:  Off
Status:
  Conditions:
    Last Transition Time:  2025-07-10T06:21:14Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  time-series-query
      Lower Bound:
        Cpu:     34m
        Memory:  5526734463
      Target:
        Cpu:     49m
        Memory:  5815202783
      Uncapped Target:
        Cpu:     49m
        Memory:  5815202783
      Upper Bound:
        Cpu:     66m
        Memory:  6131652484
Events:  <none>

Name:         time-series-writer-kyverno
Namespace:    si-dev-001b
Labels:       app.kubernetes.io/managed-by=kyverno
              generate.kyverno.io/policy-name=generate-vpa
              generate.kyverno.io/policy-namespace=
              generate.kyverno.io/rule-name=create-for-podcontrollers
              generate.kyverno.io/trigger-group=apps
              generate.kyverno.io/trigger-kind=Deployment
              generate.kyverno.io/trigger-namespace=si-dev-001b
              generate.kyverno.io/trigger-uid=419ec486-5c05-4535-a14c-91507c01cb13
              generate.kyverno.io/trigger-version=v1
Annotations:  <none>
API Version:  autoscaling.k8s.io/v1
Kind:         VerticalPodAutoscaler
Metadata:
  Creation Timestamp:  2025-07-10T06:20:56Z
  Generation:          1
  Resource Version:    1116201939
  UID:                 cef38e2c-c4d1-4110-ab66-9494f27359aa
Spec:
  Target Ref:
    API Version:  apps/v1
    Kind:         Deployment
    Name:         time-series-writer
  Update Policy:
    Update Mode:  Off
Status:
  Conditions:
    Last Transition Time:  2025-07-10T06:21:14Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  time-series-writer
      Lower Bound:
        Cpu:     1642m
        Memory:  10108307283
      Target:
        Cpu:     4280m
        Memory:  10626315661
      Uncapped Target:
        Cpu:     4280m
        Memory:  10626315661
      Upper Bound:
        Cpu:     5810m
        Memory:  11173456739
Events:  <none>
Let's look at the policy reports generated by the Check Resources policy for our targeted workloads.
% kubectl get polr -n si-dev-001b
NAME                                   KIND         NAME                 PASS   FAIL   WARN   ERROR   SKIP   AGE
3a9e4a43-57df-4fdd-b4d2-53725f190bd8   Deployment   time-series-query    2      1      0      0       0      32d
419ec486-5c05-4535-a14c-91507c01cb13   Deployment   time-series-writer   2      1      0      0       0      32d
Both reports show a failure. Let's inspect the first one.
% kubectl get polr 3a9e4a43-57df-4fdd-b4d2-53725f190bd8 -n si-dev-001b -o yaml
apiVersion: wgpolicyk8s.io/v1alpha2
kind: PolicyReport
metadata:
  creationTimestamp: "2025-07-10T06:21:05Z"
  generation: 1187
  labels:
    app.kubernetes.io/managed-by: kyverno
  name: 3a9e4a43-57df-4fdd-b4d2-53725f190bd8
  namespace: si-dev-001b
  ownerReferences:
  - apiVersion: apps/v1
    kind: Deployment
    name: time-series-query
    uid: 3a9e4a43-57df-4fdd-b4d2-53725f190bd8
  resourceVersion: "1116190145"
  uid: 6852b84e-a0cd-49c0-a89d-0667d7aabb73
results:
- category: Resource Optimization
  message: 'validation failure: overprovisioned resources: reduce cpu.request from
    2 to 49m'
  policy: check-resources
  properties:
    process: background scan
  result: fail
  rule: cpu
  scored: true
  severity: medium
  source: kyverno
  timestamp:
    nanos: 0
    seconds: 1754958366
- category: Resource Optimization
  message: rule passed
  policy: check-resources
  properties:
    process: background scan
  result: pass
  rule: memory
  scored: true
  severity: medium
  source: kyverno
  timestamp:
    nanos: 0
    seconds: 1754958366
- category: Resource Optimization
  policy: generate-vpa
  properties:
    process: background scan
  result: pass
  rule: create-for-podcontrollers
  scored: true
  severity: medium
  source: kyverno
  timestamp:
    nanos: 0
    seconds: 1754617154
scope:
  apiVersion: apps/v1
  kind: Deployment
  name: time-series-query
  namespace: si-dev-001b
  uid: 3a9e4a43-57df-4fdd-b4d2-53725f190bd8
summary:
  error: 0
  fail: 1
  pass: 2
  skip: 0
  warn: 0
From the report, it looks like we over-provisioned CPU. We need to reduce the CPU assigned to this Pod. The relevant result:
- category: Resource Optimization
  message: 'validation failure: overprovisioned resources: reduce cpu.request from
    2 to 49m'
  policy: check-resources
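One way to act on the recommendation is a kubectl patch like the sketch below; it assumes the container sits at index 0 in the pod template, and picks 100m rather than the exact 49m target to leave some headroom:
kubectl -n si-dev-001b patch deployment time-series-query --type=json \
  -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/resources/requests/cpu", "value": "100m"}]'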
Let's check the failure in the other workload.
% kubectl get polr 419ec486-5c05-4535-a14c-91507c01cb13 -n si-dev-001b -o yaml
apiVersion: wgpolicyk8s.io/v1alpha2
kind: PolicyReport
metadata:
  creationTimestamp: "2025-07-10T06:21:06Z"
  generation: 1185
  labels:
    app.kubernetes.io/managed-by: kyverno
  name: 419ec486-5c05-4535-a14c-91507c01cb13
  namespace: si-dev-001b
  ownerReferences:
  - apiVersion: apps/v1
    kind: Deployment
    name: time-series-writer
    uid: 419ec486-5c05-4535-a14c-91507c01cb13
  resourceVersion: "1116190144"
  uid: 7f635c31-1dcb-4a68-8dd1-cec87ac18000
results:
- category: Resource Optimization
  message: rule passed
  policy: check-resources
  properties:
    process: background scan
  result: pass
  rule: cpu
  scored: true
  severity: medium
  source: kyverno
  timestamp:
    nanos: 0
    seconds: 1754958366
- category: Resource Optimization
  message: 'validation failure: underprovisioned resources: increase memory.request
    from 5Gi to 10134Mi'
  policy: check-resources
  properties:
    process: background scan
  result: fail
  rule: memory
  scored: true
  severity: medium
  source: kyverno
  timestamp:
    nanos: 0
    seconds: 1754958366
- category: Resource Optimization
  policy: generate-vpa
  properties:
    process: background scan
  result: pass
  rule: create-for-podcontrollers
  scored: true
  severity: medium
  source: kyverno
  timestamp:
    nanos: 0
    seconds: 1754617154
scope:
  apiVersion: apps/v1
  kind: Deployment
  name: time-series-writer
  namespace: si-dev-001b
  uid: 419ec486-5c05-4535-a14c-91507c01cb13
summary:
  error: 0
  fail: 1
  pass: 2
  skip: 0
  warn: 0
In this case, we under-provisioned memory. We need to increase the memory assigned to this Pod as per the recommendation. The relevant result:
- category: Resource Optimization
  message: 'validation failure: underprovisioned resources: increase memory.request
    from 5Gi to 10134Mi'
  policy: check-resources
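The corresponding change in the Deployment manifest might look like the excerpt below, assuming a single container named time-series-writer; 10134Mi is the target taken from the report message:
# illustrative excerpt of the time-series-writer Deployment
spec:
  template:
    spec:
      containers:
      - name: time-series-writer
        resources:
          requests:
            memory: 10134Mi   # raised from 5Gi toward the VPA target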
Step-by-Step Approach to Pod Rightsizing with Kyverno
- Initialize. Install the metrics server, the VPA Recommender, Kyverno, and the two policies described above.
- Deploy and Load Test. Run the workload under representative traffic so the VPA Recommender can observe real usage patterns.
- Evaluate Policy Reports. Review the generated PolicyReports for over- or under-provisioning failures.
- Adjust Resources. Update the workload's requests toward the VPA targets surfaced in the report messages.
- Iterate. Repeat the deploy, test, and evaluate cycle until the workload passes the Check Resources policy.
Conclusion
By following this iterative, policy-driven approach, you ensure workloads are sized accurately for performance and cost-efficiency. Kyverno’s validation and generation policies streamline the process, enabling continuous optimization without guesswork.