Lessons from the Field #23: Gathering Linux core dumps

By Kevin Grigorenko posted 17 days ago

  
Linux core dumps are files produced by the Linux operating system that dump out all of the memory of a process at a point in time. They are most often used for OutOfMemoryErrors and crash analysis, but have other uses as well.

Gathering good core dumps is often a problem, so this post details common issues and solutions for gathering good core dumps on Linux, along with other important things to know.

For OutOfMemoryErrors (OOMEs), the key value of core dumps over other diagnostics such as PHD files is that PHDs do not provide full and accurate garbage collection root information, nor details on thread stack locals. This means that PHDs can only find the proximate cause of an OOME in about 80% of cases whereas core dumps do not suffer from these limitations and therefore can find the proximate cause of an OOME in about 99% of cases.

What to know before gathering

Before you gather a core dump, your team must review certain key implications of core dumps:

  1. Core dumps contain all memory contents of the process. Therefore, if, for example, a core dump is produced while a user is performing a sensitive operation, then there may be sensitive and/or personal information in memory about that user that may be seen by anyone with access to the core dump. You should review the implications of this with your security team. If needed, there are various steps that can be taken to secure core dumps such as ensuring file permissions are restricted, moving the dump to a more locked down machine, and/or encrypting the core dump. If the core dump cannot be transferred to IBM Support, you can install the free tooling to analyze core dumps yourself (for Java programs) and we can guide you through the analysis iteratively or through a screen share (if allowed).
  2. Core dumps may be big. The size of the core dump will be approximately the virtual address space size of the process (e.g. VSZ in ps). For Java processes, most often, this will be approximately -Xmx plus 512MB to 1GB or so. However, since large chunks of a virtual address space are often unused, the dump ends up being filled with many sections of zeros, so it compresses really well (a size reduction of 50% or more is common). Just remember that the file is big, so compression can still take dozens of seconds or more; if you compress in production, that will eat up a whole CPU for that time. If needed, you can transfer the dump to a non-production machine and compress it there.
  3. Core dumps may take a lot of time to produce. While a core dump is being produced, the process is completely paused. For larger core dumps, this may be up to dozens of seconds. If the dump is due to a crash, then this time doesn't really matter since the process will be killed right after the core dump is produced. However, for OutOfMemoryErrors or manually taken core dumps, the impact on your users should be evaluated. For OutOfMemoryErrors, in most cases, the process has already been garbage collection thrashing for minutes or hours before the JVM finally gave up and threw the OutOfMemoryError, so dozens of seconds doesn't matter. However, for manually collected dumps outside of an OutOfMemoryError, the JVM will be paused for some time. There are ways to reduce the pause, such as a tmpfs disk that writes the dump to memory only, if you have available RAM for that. Having plenty of free RAM also helps because Linux will write the core dump to the file cache in free RAM first and then asynchronously write it out to disk, allowing the process to be unpaused faster. Beyond that, faster disks are the main way to improve the performance of writing core dumps.

Finally, another thing to note is that Linux core dumps are often confused with IBM Java and Semeru javacore.txt thread dump files, since they both have "core" in their name. The latter are just thread dump text files and don't have any of the concerns above: they are small, generally don't contain sensitive information, and don't take much time to produce.
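
To get a feel for how big a core dump of a running process might be, the virtual size can be read from /proc. The following is a minimal sketch, not a definitive tool: the function name estimate_core_mib is made up for illustration, and it assumes the VmSize field of /proc/$PID/status (reported in kB, the same quantity as VSZ in ps) is a reasonable approximation of the dump size.

```shell
#!/bin/sh
# Sketch: print the virtual size of a process in MiB as a rough estimate
# of how large its core dump may be.
# Assumption: VmSize in /proc/<pid>/status approximates the dump size.
estimate_core_mib() {
    # $1 is the process ID; VmSize is reported in kB, so divide by 1024.
    awk '/^VmSize:/ { print int($2 / 1024) }' "/proc/$1/status"
}

# Usage: estimate_core_mib "$PID"
```

Comparing the result with the free space on the target disk (for example, from df) before requesting a dump helps avoid filling the disk.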

Producing good core dumps

Core dumps for Java processes are produced by the Linux kernel. Therefore, you must first review your Linux configuration to ensure good core dumps are produced. There are two critical configurations:

  1. Core and file ulimits on the process
  2. Linux kernel core_pattern configuration

As a side note, for IBM Java and Semeru, when a core dump is produced by the JVM, such as for an OutOfMemoryError or through a manual mechanism, the JVM forks itself because Linux doesn't provide an API to create a core dump. The fork makes a copy of the virtual memory of the process in another process, and the JVM then crashes that forked process to cause Linux to create a core dump of it. This is why loading a manually produced core dump or a core dump from an OOM into a native Linux core debugger like gdb shows only a single thread: threads aren't forked. However, tooling such as the Eclipse Memory Analyzer Tool with the IBM DTFJ extension has the smarts to figure out what all the other threads were doing by looking at the memory in the core dump.

Core and file ulimits on the process

Linux sets limits on what a process can do through its ulimit mechanism. These are most often set by system-wide configuration or on a per-process basis during the launch of the process. The two most commonly relevant ulimits for core dumps are the core and file ulimits.

The core ulimit sets the maximum size of core dump that can be produced. If this is 0, then no core dump is produced. If it's greater than 0 but less than the virtual size of the process, then the core dump will be truncated. Since thread information is usually at the top of the virtual address space (which translates into the end of the core dump file), if a core dump is truncated, critical information is lost which usually makes core dump analysis extremely limited.

The file ulimit sets the maximum file size of any file produced by the process. Since a core file is just a normal file, then this ulimit applies in the same way the core ulimit applies: If it's set to 0, then no core file will be produced; if it's less than the virtual size of the process, then the core file will be truncated.

Another aspect of ulimits is that there are soft and hard ulimits. The soft ulimit is the starting limit but a process may increase its ulimit up to the hard ulimit. IBM Java and Semeru have code that detects if the core soft ulimit is less than the core hard ulimit and attempts to increase the ulimit up to the hard ulimit.

In general, when ensuring ulimits are configured properly for core dumps, it's easiest to simply set both soft and hard ulimits to unlimited.

You may check the ulimits of a running process by printing the limits file for the process ID and searching for the "Max file size" (file) and "Max core file size" (core) ulimits:

# cat /proc/$PID/limits
Limit                     Soft Limit           Hard Limit           Units     
Max file size             unlimited            unlimited            bytes     
Max core file size        unlimited            unlimited            bytes
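
The check above can be scripted. This is a minimal sketch under the assumption that the limits file has the column layout shown above; the function name check_core_limits and its ok/WARN output format are invented for illustration. It reads the limits text on standard input and flags any soft limit that is not unlimited.

```shell
#!/bin/sh
# Sketch: read the text of /proc/<pid>/limits on stdin and report whether
# the soft file-size and core-size limits are unlimited.
check_core_limits() {
    awk '
        # "Max file size <soft> <hard> bytes": the soft limit is field 4.
        /^Max file size/      { name = "file"; soft = $4 }
        # "Max core file size <soft> <hard> bytes": the soft limit is field 5.
        /^Max core file size/ { name = "core"; soft = $5 }
        name != "" {
            status = (soft == "unlimited") ? "ok" : "WARN"
            printf "%s soft=%s %s\n", name, soft, status
            name = ""
        }
    '
}

# Usage: check_core_limits < /proc/$PID/limits
```

A WARN line means a core dump could be suppressed (limit of 0) or truncated (limit below the virtual size of the process).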

For IBM Java and Semeru, a javacore.txt thread dump also has this information in the RLIMIT_CORE (core) and RLIMIT_FSIZE (file) ulimit lines:

1CIUSERLIMITS  User Limits (in bytes except for NOFILE and NPROC)
NULL           ------------------------------------------------------------------------
NULL           type                            soft limit           hard limit
2CIUSERLIMIT   RLIMIT_CORE                      unlimited            unlimited
2CIUSERLIMIT   RLIMIT_FSIZE                     unlimited            unlimited

There are various ways to set global ulimit values. This will depend on your Linux distribution, so check with your Linux administrators. As one example, on RHEL, these may be set in /etc/security/limits.conf with the lines below; afterwards, exit the shell, start a new shell, and re-launch the process:

*               soft       core             unlimited
*               hard       core             unlimited
*               soft       fsize            unlimited
*               hard       fsize            unlimited

Ulimits may also be set explicitly in scripts that start a process through commands such as ulimit -c unlimited (core) and ulimit -f unlimited (file).
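
When the hard limits are already large enough but the soft limits are not, a start script can raise the soft limits up to the hard limits without administrator privileges. The sketch below assumes a bash-compatible shell; the function name raise_core_limits is hypothetical. It does not raise the hard limits themselves, which still requires system configuration such as limits.conf.

```shell
#!/bin/bash
# Sketch: raise the soft core and file-size ulimits to their hard limits
# for this shell and for any process launched from it afterwards.
raise_core_limits() {
    # "ulimit -H -c" prints the hard core limit ("unlimited" or a number);
    # "ulimit -S -c <value>" sets the soft core limit to that value.
    ulimit -S -c "$(ulimit -H -c)"
    # Same for the maximum file size.
    ulimit -S -f "$(ulimit -H -f)"
}

# Usage: call raise_core_limits in the start script, then launch the
# Java process from that same shell so it inherits the limits.
```

If the hard limits are themselves too low, this only gets you as far as the hard limit, so check the resulting values in /proc/$PID/limits after launch.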

Linux kernel core_pattern configuration

The kernel.core_pattern configuration tells the Linux kernel where to write the core dump. This is a global configuration that may be output through the core_pattern file. Administrator privileges are not needed to read this configuration:

$ cat /proc/sys/kernel/core_pattern
core

If core_pattern starts with neither / nor |, such as "core" in the above example, then the core dump will be written to the current working directory of the process. This directory location may be checked for a running process by searching for the cwd symbolic link:

# ls -l /proc/$PID/ | grep cwd
lrwxrwxrwx.  1 wsadmin wsadmin 0 Nov 14 18:58 cwd -> /opt/ol/wlp/output/defaultServer/


If core_pattern is specifically the word core (as in the above example) and the core dump is produced through the core dump mechanism of IBM Java or Semeru (such as an OutOfMemoryError or other manual mechanism), then the JVM will post-process the name of the dump file based on -Xdump configuration. By default, this will add a timestamp, process ID, sequence number, and .dmp file extension. The JVM will also append some information into the core dump about loaded native libraries, which is sometimes, though rarely, useful for some advanced native memory investigations. However, neither the file renaming nor the additional information is usually needed.

If core_pattern starts with a /, then the kernel will attempt to write the core dump to the specified path. Ensure that the directory exists.

If core_pattern starts with a |, then the core dump will be sent to the specified program. You must review the manual for that program to ensure it is configured so that it does not truncate core dumps, and to understand where it will write them.
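
The three cases above can be told apart from the first character of the pattern. Here is a minimal sketch; the function name classify_core_pattern and the labels it prints are made up for illustration.

```shell
#!/bin/sh
# Sketch: classify a core_pattern value by its first character.
classify_core_pattern() {
    case "$1" in
        "|"*) echo "pipe" ;;      # sent to a core dump processing program
        /*)   echo "absolute" ;;  # written to the specified absolute path
        *)    echo "cwd" ;;       # written to the process's current working directory
    esac
}

# Usage: classify_core_pattern "$(cat /proc/sys/kernel/core_pattern)"
```

For the "cwd" case, the directory to look in is the target of the /proc/$PID/cwd link of the process.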

Lately, the most common core dump processing program is systemd-coredump (e.g. |/usr/lib/systemd/systemd-coredump).

There have been various recommendations to disable such core dump processing programs and revert core_pattern to core; however, the industry is slowly but surely moving towards core dump processing programs and many administrators do not want to set core_pattern back to core. One key reason is that core dump processing programs manage the maximum disk space that core dumps may use, avoiding situations where core dumps fill up a disk and cause application issues.

Therefore, in general, if a core dump processing program such as systemd-coredump is configured, it is not advisable to unset this configuration. Instead, that program should simply be configured properly to make sure the core dump processing program creates the core dump and doesn't truncate it.

Unfortunately, truncation is a common problem with current versions of systemd-coredump because it defaults to a maximum core file size limit of 2GB for 64-bit programs, and Java programs often use more than 2GB of virtual memory. (In versions of systemd-coredump after v250, the default maximum core file size limit has been increased to 32GB for 64-bit programs.)

To properly configure systemd-coredump, edit /etc/systemd/coredump.conf and update the defaults. For example, start with something like the following, but adjust the values based on your available disk space:

ProcessSizeMax=100G
ExternalSizeMax=100G
MaxUse=500G
Compress=no


As of this writing, there is no way to specify an unlimited value for ProcessSizeMax, ExternalSizeMax or MaxUse which is why large values are used above. Compression is also disabled to reduce core dump production times.

Then run:

systemctl daemon-reload


Note that a Java process does not need to be restarted to take advantage of the new systemd-coredump settings.

When a core dump is handled by systemd-coredump, by default, it will be written to /var/lib/systemd/coredump/.

You may also use the coredumpctl utility to list handled core dumps and then use the info sub-command to get details; particularly, the Storage location:

# coredumpctl
TIME                             PID        UID   GID SIG COREFILE  EXE
Wed 2022-08-03 18:52:29 UTC  2923161 1000650000     0  11 present   /opt/java/openjdk/jre/bin/java
# coredumpctl info 2923161 | grep Storage:
    Storage: /var/lib/systemd/coredump/core.kernel-command-.1000650000.08b9e28f46b348f3aabdffc6896838e0.2923161.1659552745000000.lz4


In containerized environments, if core_pattern pipes to a core dump processing program, the core dump is written to the worker node, not the container, and must be retrieved from the worker node; however, if core_pattern is set to core or an absolute path, then the core dump is written within the container.

Manually requesting core dumps

Core dumps may be manually requested for IBM Java and Semeru. This may be useful for investigating memory leaks, gathering diagnostics through -Xdump or -Xtrace, and other situations.

The easiest way to manually create core dumps on Linux is through the gcore utility shipped as part of the gdb package. Unfortunately, this is not reliable for IBM Java and Semeru. If you happen to get the core dump during a sensitive operation such as a garbage collection, then it's likely the core dump will be essentially useless for Java heap analysis. The reason is that during such sensitive operations, the JVM is changing fundamental aspects of Java objects such as object pointers. Even if you get lucky and don't get the core dump during such a sensitive operation, threads may be in non-fully walkable states so some thread local information will be lost.

Therefore, to manually gather a core dump for IBM Java or Semeru, it is recommended instead to use the core dump mechanism that the JVM provides along with the request=exclusive+prepwalk options. This ensures that the core dump is not taken during a sensitive operation such as a garbage collection, and ensures threads are in a properly walkable state.

There are various ways to gather such dumps, but the most common is to use the Jcmd Dump.system function:

For IBM Java >= 8.0.6.25:

java -Xbootclasspath/a:$JAVA_HOME/lib/tools.jar openj9.tools.attach.diagnostics.tools.Jcmd $PID Dump.system


For Semeru:

$JAVA_HOME/bin/jcmd $PID Dump.system

Summary

In conclusion, Linux core dumps are a critical diagnostic tool for Java programs, particularly IBM Java and Semeru. Beyond the classic use case of figuring out the cause of a crash, core dumps may be used for more thorough OutOfMemoryError investigations, memory leak investigations, and other diagnostic situations.

To properly use core dumps, core and file ulimits must be set properly (most often to unlimited), and core_pattern must be configured properly. If core_pattern pipes to a core processing program, it's generally better to properly configure that program rather than reverting core_pattern back to core. For the most common core processing program of systemd-coredump, this means updating configuration in /etc/systemd/coredump.conf and then finding the core dump in /var/lib/systemd/coredump/.

