WebSphere Application Server & Liberty

Lessons from the field #34: How much memory is 'free' on Linux?

By Kevin Grigorenko posted Tue October 17, 2023 09:00 AM

It seems like a simple question: how much memory (RAM) is "free" on Linux?

On the one hand, there's a simple answer:

Linux aggressively uses free RAM for various caches that, generally, can be quickly freed when programs need more memory, and therefore you should look at the "available" memory statistic instead of the free RAM statistic.

In the following Linux free command example output, use the available column value rather than the free column value. In this case, we can say that roughly 15465MB is "free" rather than 14881MB:

$ free -m
               total        used        free      shared  buff/cache   available
Mem:           15950         267       14881           9         802       15465
Swap:              0           0           0

Similarly, in the following Linux top command example output, use the avail Mem value on the Swap line (although it's not related to swap) instead of the free value on the Mem line. In this case, we can say that roughly 15451MB is "free" rather than 14867MB:

top - 15:02:35 up 1 min,  0 users,  load average: 0.54, 0.17, 0.06
Tasks:   2 total,   1 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  15950.9 total,  14867.3 free,    281.3 used,    802.4 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  15451.8 avail Mem 

This is also shown as MemAvailable in /proc/meminfo:

$ cat /proc/meminfo
MemTotal:       16333764 kB
MemFree:        15241968 kB
MemAvailable:   15840528 kB

On the other hand, answering the question is more complicated. First, we have to understand the two main categories of caches that use free RAM:

  1. File cache (also known as page cache)
  2. Reclaimable slab cache

File cache is pretty straightforward. As Linux reads/writes files, it uses free RAM as a pass-through cache to speed up reads/writes because RAM is much faster than disk. The amount in the filecache may be seen with:

$ grep "^Cached" /proc/meminfo
Cached:           772960 kB

In general, writes to cached files first land in RAM: the affected file pages are marked as "dirty" and an asynchronous kernel process later writes these changes to disk. This is why operating systems warn you to "safely remove" devices such as USB drives; all modern operating systems cache file writes this way, so you should ensure that all dirty pages have been written to disk before removing the device. On Linux, this can be done explicitly and synchronously by running the command:

$ sudo sync
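
To see how much cached file data is currently dirty or in the middle of being written back, check the Dirty and Writeback counters in /proc/meminfo (the values below are illustrative only):

$ grep -E "^(Dirty|Writeback):" /proc/meminfo
Dirty:               120 kB
Writeback:             0 kB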

In general, when programs need memory that is being used by file cache, Linux will push the file cache out of RAM (writing any dirty pages as needed) and then give the memory to the program; however, there is a subtlety. Unlike some other operating systems, by default Linux will sometimes prefer to swap out program memory rather than evict file cache. This is based on the idea that file I/O speed can be important enough to justify sacrificing some program memory demands. This behavior is driven by the vm.swappiness Linux kernel parameter. Its current value may be displayed with the following command; 60 is a common default:

$ sysctl vm.swappiness
vm.swappiness = 60

This option is described in the Linux documentation:

This control is used to define how aggressive the kernel will swap memory pages.  Higher values will increase aggressiveness, lower values decrease the amount of swap.  A value of 0 instructs the kernel not to initiate swap until the amount of free and file-backed pages is less than the high water mark in a zone.

 The default value is 60.

This option is a variable in a complicated algorithm, but the most relevant point is that a value of 0 tells the kernel to avoid paging program pages to disk as much as possible.

One way to check if swapping is occurring is to check if swap usage is greater than 0. In the above top example, the used column in the Swap row shows 0 which means nothing has been swapped out.
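
Swap activity may also be watched over time with the vmstat command; on a system that is not swapping, the si (swap in) and so (swap out) columns should stay at 0. For example, to print statistics every 5 seconds, 3 times:

$ vmstat 5 3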

For systems with low expected usage of file I/O, set vm.swappiness=0 to reduce the probability of file cache driving program memory swapping. If set to 0, this also means that the memory "available" statistic has a more intuitive meaning. This may be set at runtime:

$ sudo sysctl -w vm.swappiness=0
vm.swappiness = 0

To persist across reboots, add the configuration to a *.conf file in a well-known configuration directory such as /etc/sysctl.d/:

$ sudo sh -c "cat > /etc/sysctl.d/99-swappiness.conf" <<"EOF"
vm.swappiness=0
EOF
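
To apply settings from such files without rebooting, recent versions of the sysctl tool can reload all system configuration directories (a common approach; check your distribution's procps version):

$ sudo sysctl --system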

For immutable node configurations such as OpenShift, use MachineConfig objects, although such a change is riskier in a Kubernetes environment with heterogeneous workloads that often include file-heavy use cases.

It is also possible to clear the file cache with the following command:

$ sudo sysctl -w vm.drop_caches=1

However, this is generally not recommended except when you are running performance tests and want to normalize results as much as possible by clearing the file cache before each test run.

The other common user of RAM as cache is kernel slab. Slab memory allocations are made by the Linux kernel itself for many different functions. The current slab memory usage may be seen with:

$ cat /proc/slabinfo 
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
pid_2                192    192    128   32    1 : tunables    0    0    0 : slabdata      6      6      0 [...]

The slabtop program is a convenient way to review slab memory usage.
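
For example, assuming slabtop from the procps package is installed, the following prints a one-shot view sorted by cache size:

$ sudo slabtop -o -s c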

A subset of kernel slab is "reclaimable", meaning that, similar to the file cache, it can be pushed out if programs need memory. The reclaimable amount may be queried with:

$ grep SReclaimable /proc/meminfo
SReclaimable:      47192 kB

It is possible to clear reclaimable slab cache with the following command:

$ sudo sysctl -w vm.drop_caches=2

As with the file cache, this is generally not recommended except for performance test preparation. If there is excessively high reclaimable slab usage, it is likely that an application or shell script is driving a lot of file or directory accesses (inodes and dentries, respectively), and that should be investigated and eliminated if possible.
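
If you do need to clear both caches before a performance test, a common sequence is to sync first so that dirty pages can actually be dropped, and then drop both the file cache and the reclaimable slab (the value 3 combines 1 and 2):

$ sudo sync
$ sudo sysctl -w vm.drop_caches=3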

There are other kernel memory buffers (e.g. TCP socket memory and raw block device buffers) that may be trimmed under pressure; the block device buffers are shown as Buffers in /proc/meminfo:

$ grep Buffers /proc/meminfo
Buffers:            2056 kB

The value of the "available" statistic is that it saves you from manually calculating all of the above intricacies and quickly tells you the approximate amount of physical memory that is nearly instantly available for program demands (with the caveat of the impact of vm.swappiness).
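
As a rough illustration only (the kernel's actual MemAvailable calculation also accounts for zone watermarks and assumes only part of the file cache and reclaimable slab can be given up), a manual approximation from /proc/meminfo might look like:

$ awk '/^(MemFree|Cached|Buffers|SReclaimable):/ { sum += $2 } END { print sum " kB (rough approximation of available)" }' /proc/meminfo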

Containers

Memory accounting gets more complicated in containers (i.e. cgroups). The memory.stat file of a container's cgroup shows detailed accounting, and there are various other memory statistic files that report overall usage; however, the overall statistics have the same problem outlined at the beginning of this article of not subtracting file cache, reclaimable slab, buffers, etc., so be careful using them:

  • cgroup v1:
    • memory.usage_in_bytes
    • memory.max_usage_in_bytes
  • cgroup v2:
    • memory.current
    • memory.peak

The following values from memory.stat may be subtracted to get a better picture (see the sketch after this list):

  • cgroup v1:
    • cache
    • slab_reclaimable
  • cgroup v2:
    • file
    • slab_reclaimable
    • sock
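
As a minimal sketch for cgroup v2 (assuming the cgroup filesystem is mounted at the usual /sys/fs/cgroup path inside the container), the adjusted usage might be computed like this:

$ CG=/sys/fs/cgroup
$ CURRENT=$(cat $CG/memory.current)
$ RECLAIMABLE=$(awk '/^(file|slab_reclaimable|sock) / { sum += $2 } END { print sum }' $CG/memory.stat)
$ echo "$(( (CURRENT - RECLAIMABLE) / 1048576 )) MiB used excluding reclaimable caches"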

In addition, vm.swappiness may be set on a per-container basis.

Finally, the industry is largely moving towards memory Pressure Stall Information (PSI) statistics as a mechanism to make memory-based decisions (and these are also available on a per-container basis): https://www.kernel.org/doc/html/latest/accounting/psi.html
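
On kernels with PSI enabled, current system-wide memory pressure may be read from /proc/pressure/memory; the some and full lines report the percentage of time tasks were stalled waiting on memory over 10, 60, and 300 second windows (the output below is from an idle system and is illustrative only):

$ cat /proc/pressure/memory
some avg10=0.00 avg60=0.00 avg300=0.00 total=0
full avg10=0.00 avg60=0.00 avg300=0.00 total=0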
