
Kubernetes memory metrics

By Leo Varghese posted 25 days ago

  

For a system to be reliable, monitoring is essential. If the different memory metrics on a Kubernetes platform confuse you, this article will help. It discusses the common container memory metrics and how analyzing them can help you improve the performance and resilience of your systems.

Memory requests and limits

Memory requests

A memory request is the minimum amount of memory that is reserved for a container at scheduling time. When a pod is scheduled, kube-scheduler uses the memory requests to find a node with enough allocatable memory for all containers in the pod. If no node can satisfy the requested amount of memory, the pod is not scheduled and remains in Pending status.

Memory limits

A memory limit is the maximum amount of memory that a container is allowed to consume. A container is not permitted to use more memory than the set limit. At runtime, Kubernetes enforces the limit and checks that the containers in the pod do not exceed it. When a container tries to consume more memory than the limit allows, the kernel terminates the process with an out-of-memory (OOM) error.
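
For example, a minimal resources fragment of a container spec, with illustrative values, might look like this:

resources:
  requests:
    memory: "256Mi"
  limits:
    memory: "512Mi"

With this configuration, the scheduler reserves 256Mi for the container, and the kernel OOM-kills the container process if its usage exceeds 512Mi.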

Measuring memory usage

Container memory

You can collect the resource usage of the containers on a node by using cAdvisor (Container Advisor), which is embedded in the kubelet.

The following cAdvisor metrics are commonly used to measure container memory usage:

  • container_memory_usage_bytes (total memory usage, including cache)

  • container_memory_working_set_bytes (working set size, WSS)

  • container_memory_rss (resident set size, RSS)
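
For example, a Prometheus query such as the following sums the working set memory per pod in a namespace (a sketch; my-namespace is a placeholder, and the label names depend on your setup):

sum(container_memory_working_set_bytes{namespace="my-namespace", container!=""}) by (pod)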

Node memory

For a Kubernetes cluster, you can collect the node metrics by using the node-exporter service. You can use the following formula to calculate node memory usage:

node_memory_MemTotal_bytes - node_memory_MemFree_bytes - (node_memory_Buffers_bytes + node_memory_Cached_bytes)

With this formula, you can measure the memory that applications and the system use on a node. The following parameters are used to measure the memory usage:

  • MemTotal: total amount of memory available on the node
  • MemFree: amount of unused memory on the node
  • Buffers: amount of physical memory used for file system buffers
  • Cached: amount of memory used for caching files
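
As a worked example, the same formula expressed as a percentage of total node memory in a Prometheus query:

100 * (node_memory_MemTotal_bytes - node_memory_MemFree_bytes - node_memory_Buffers_bytes - node_memory_Cached_bytes) / node_memory_MemTotal_bytes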

Container Advisor also exposes a metric called container_memory_cache. You can use this metric in the following formula to see how much node memory is used outside the containers:

node_memory_without_cache - sum(container_memory_usage_bytes - container_memory_cache)

Calculate the memory usage of each container without its cache, sum the results, and subtract that sum from the node memory usage without cache. The remainder is the memory that is actively used on the node outside the Kubernetes layer, which you cannot easily reclaim.
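
Expanded with the node-exporter formula from above, a cluster-wide Prometheus sketch might look like the following (a per-node breakdown requires joining on the node labels, which depend on your setup):

sum(node_memory_MemTotal_bytes - node_memory_MemFree_bytes - node_memory_Buffers_bytes - node_memory_Cached_bytes)
  - sum(container_memory_usage_bytes{container!=""} - container_memory_cache{container!=""})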

Resident set size and working set size

Inside a running container, the directory /sys/fs/cgroup/memory (with cgroup v1) holds the memory accounting details of the container. In this directory, you can find memory metrics such as usage, limits, cache, and so on.

The directory contains files such as memory.usage_in_bytes, memory.limit_in_bytes, and memory.stat.

The memory.stat file contains the metric values that cAdvisor uses in its formulas.
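
For example, assuming a pod named my-pod whose image includes cat, and a node that uses cgroup v1, you can read the file directly:

kubectl exec my-pod -- cat /sys/fs/cgroup/memory/memory.stat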

Resident set size (RSS)

RSS is the physical memory held in main memory that does not correspond to anything on disk. RSS includes stacks, heaps, and anonymous memory maps. In the cAdvisor code, memory RSS is defined as the amount of anonymous and swap cache memory (including transparent hugepages). This value equals total_rss from the memory.stat file.

Working set size (WSS)

WSS is the memory that a process needs to keep working over a period of time. In the cAdvisor code, memory WSS is defined as the amount of working set memory, which includes recently accessed memory, dirty memory, and kernel memory. The working set is <= usage. This value equals usage minus total_inactive_file from memory.stat.
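
In the style of the formulas above, and assuming cgroup v1 naming, the relationship can be written as:

container_memory_working_set_bytes = container_memory_usage_bytes - total_inactive_file (from memory.stat)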

You can view the relation between different memory metrics in the following image:

From the diagram, you can infer the following:

  • RSS < node_memory_without_cache :
    node_memory_without_cache includes memory that is not backed by a file but is still not counted as RSS, such as kernel memory.

  • node_memory_without_cache < WSS :
    WSS includes the active file cache, which node_memory_without_cache excludes.

  • WSS < usage_bytes :
    usage_bytes includes the whole cache, so it is always greater than WSS.

Which memory metric is important to monitor?

It is often unclear which memory metric (RSS or WSS) you need to monitor to ensure system reliability. If you set resource limits on your pods, monitor both to keep containers from running out of memory (OOM); see the query sketch after the following list.

An OOM kill can occur when either of the following metrics reaches the memory limit:

  • container_memory_rss

  • container_memory_working_set_bytes
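
As a sketch, assuming kube-state-metrics is installed (kube_pod_container_resource_limits comes from it, and the resource and unit label values depend on your version), the following Prometheus expression flags containers whose working set exceeds 90% of their memory limit:

container_memory_working_set_bytes{container!=""}
  / on(namespace, pod, container)
  kube_pod_container_resource_limits{resource="memory", unit="byte"}
> 0.9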

Kubelet eviction

In a Kubernetes environment, the kubelet tracks the memory consumption of every node. If memory usage exceeds a specified eviction threshold, the kubelet evicts one or more pods to preserve the stability and functionality of the node.

A container without a memory limit can consume an excessive, uncontrolled amount of page cache. This can inflate the WSS and trigger an unwanted eviction. Set an appropriate memory limit to restrict the memory usage of the container. When a container reaches its memory limit, the kernel starts reclaiming the reusable pages from its active list.
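
The memory eviction threshold is configurable on the kubelet. A minimal KubeletConfiguration fragment, with 500Mi as an illustrative value, might look like this:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  memory.available: "500Mi"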

Conclusion

In containerized systems, accurate memory monitoring is crucial. Understanding these metrics helps you determine the real memory needs of your workloads and spot potential issues early. Using them carefully enhances system reliability and ensures robust performance.
