Author: @Sorna Sarathi N
Co-authored by: @GIREESH PUNATHIL
Java powers a vast range of applications, from mobile apps to enterprise-level systems. Monitoring is essential to ensure that workloads run efficiently and produce the intended outcomes. In this blog, we explain the essential aspects of Java application monitoring.
Monitoring is an integral part of large-scale applications. It involves observing and tracking various aspects of systems, networks, or applications to ensure they operate effectively and optimally. Proper monitoring systems help to identify issues, optimize performance, and ensure reliability.
Monitoring basics
A typical monitoring system comprises three parts:
- Monitoring agent: The agent attaches itself directly to the application or JVM and collects the traces, logs, and dumps that are required to visualize the application's progress. Later sections discuss the different mechanisms for capturing these metrics.
- Monitoring data processor: The data processor assembles all the metrics data collected by the agent and generates high-level insights such as aggregations, summaries, patterns, and trends.
- Monitoring data visualizer: The visualizer component renders the insights generated by the data processor as easy-to-read graphical diagrams, such as bar charts and histograms, to ease the viewer’s analysis.
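To make the three roles concrete, here is a deliberately tiny sketch in which each role is a single method. The class name MiniMonitor and its method names are hypothetical and chosen only for illustration. A real agent would gather far richer data than the heap samples taken here through java.lang.Runtime.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.LongSummaryStatistics;

// Hypothetical illustration of the agent / processor / visualizer split.
class MiniMonitor {
    // Agent: collects raw data points (used heap, in bytes) from the running JVM.
    static List<Long> collectHeapSamples(int count) {
        Runtime rt = Runtime.getRuntime();
        List<Long> samples = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            samples.add(rt.totalMemory() - rt.freeMemory());
        }
        return samples;
    }

    // Data processor: turns raw samples into a summary (min, max, average).
    static LongSummaryStatistics summarize(List<Long> samples) {
        return samples.stream().mapToLong(Long::longValue).summaryStatistics();
    }

    // Visualizer: renders a value as a text bar, one '#' per unit.
    static String bar(long value, long unit) {
        StringBuilder sb = new StringBuilder();
        for (long i = 0; i < value / unit; i++) {
            sb.append('#');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        LongSummaryStatistics stats = summarize(collectHeapSamples(5));
        long mib = 1024L * 1024L;
        long averageBytes = (long) stats.getAverage();
        System.out.println("avg heap used: " + (averageBytes / mib) + " MiB " + bar(averageBytes, mib));
    }
}
```

In production tooling these three parts usually run as separate components, often on different machines. The sketch only shows how data flows from one role to the next.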
With these three components, we can effectively monitor our Java applications. But what are we monitoring?
Java monitoring versus JVM monitoring
When people say Java monitoring, they usually mean monitoring the Java platform, that is, the Java Virtual Machine (JVM). The JVM bridges the gap between Java code and the underlying low-level execution sequence across different platforms. It manages program execution and the resources that execution requires, and it sits at the core of every Java application by translating Java bytecode into actions on the hosting platform. As a result, the execution performance and resource utilization of the JVM can significantly affect the functioning of an application. That is why monitoring the JVM is an integral part of a Java application monitoring strategy. The metrics collected from the JVM are generally grouped into profiles. A profile is a specific set or category of metrics that is collected and analyzed from the JVM, and each profile provides a focused view of a particular aspect of the JVM’s operations.
Typical JVM monitoring involves these profiles:
- CPU profile - Provides CPU resource utilization and helps to identify CPU-intensive tasks.
- Memory profile - Focuses on memory usage patterns (Java heap + native heap) and garbage collection statistics.
- Thread profile - Provides information about threads and their activities, and helps in diagnosing thread-related issues.
- I/O (disk or network) profile - Measures input/output operations, observes I/O performance, and identifies bottlenecks.
- Class profile - Monitors class loading dynamics and dependencies in Java applications.
- Environment profile - Provides context with Operating System (OS), JVM and hardware details influencing application performance.
The image below showcases the summary page from IBM’s Monitoring and Diagnostic Tool - Health Center, highlighting various JVM profiles captured during monitoring, as an example.
Events and Event Sources
The profiles listed in the JVM metrics list above are built by the data processor component from data points collected by the agent component from a running virtual machine. The agent gathers this information in two different ways:
- Synchronous metrics: Information collected by the agent inspecting the internal state of the JVM (e.g. current memory usage). These are collected through a direct call into the VM.
- Asynchronous metrics: Information supplied to the agent by the virtual machine when certain events occur in the process (e.g. a class load event). These are obtained through callbacks from the VM.
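The difference between the two collection styles can be sketched with the standard java.lang.management API. This is a minimal illustration, not how any particular agent is implemented. The class and method names are hypothetical, and the sketch assumes the platform MemoryMXBean can be cast to NotificationEmitter, as its documentation describes. Memory notifications are delivered only after a usage threshold has been configured on a memory pool, so the listener may stay silent in a short run.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.Notification;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

// Hypothetical illustration of synchronous versus asynchronous collection.
class SyncAsyncMetrics {
    // Counts notifications delivered by the JVM (asynchronous path).
    static final AtomicInteger eventsSeen = new AtomicInteger();

    // Synchronous metric: the caller asks the JVM for its current state.
    static long currentHeapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    // Asynchronous metric: the JVM calls the listener back when an event occurs.
    static NotificationListener registerMemoryListener() {
        NotificationListener listener = (Notification n, Object handback) -> {
            eventsSeen.incrementAndGet();
            System.out.println("Event from JVM: " + n.getType());
        };
        // Assumption: the platform MemoryMXBean is also a NotificationEmitter.
        NotificationEmitter emitter = (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        emitter.addNotificationListener(listener, null, null);
        return listener;
    }

    public static void main(String[] args) {
        System.out.println("Heap used now (bytes): " + currentHeapUsedBytes());
        registerMemoryListener();
        System.out.println("Listener registered; events so far: " + eventsSeen.get());
    }
}
```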
Let’s put ourselves in the shoes of a user. When the system or application encounters issues such as thread leaks and memory leaks, our immediate step is to investigate memory usage, thread details, and activities such as thread states (running, waiting, sleeping, blocked) and the memory allocated to threads. However, the JVM metrics discussed previously offer only a high-level overview and basic information about each profile, falling short of the detailed insights necessary for in-depth analysis.
To facilitate that, various distinct sub-metrics are generated from the captured metrics and data points, each offering detailed information about specific operations or activities within the system or application being monitored. Such sub-metrics are known as monitoring events. These events provide comprehensive insights into how different aspects of the system are functioning and performing over time.
Some examples of monitoring events are given below:
- CPU Load Event: Captures information about the CPU utilization of the process, with a per-thread breakdown.
- Thread End Event: Captures information about a thread that is terminated, along with the thread that is responsible for terminating it.
- Class Loading Statistics: Captures statistics about the classes loaded and unloaded within the JVM during a specified time interval.
- Thread Allocation Statistics: Captures the number of bytes allocated by each thread (both native and Java threads) since the thread started.
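As one illustration, a class loading statistic over an interval can be approximated by reading the counters of the standard ClassLoadingMXBean (part of the java.lang.management API covered under JMX later in this article) before and after the interval. The class name ClassLoadingStats and the measure helper are hypothetical. This is a sketch of the idea, not how any particular monitoring tool produces the event.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical illustration of an interval-based class loading statistic.
class ClassLoadingStats {
    // Returns {classes loaded, classes unloaded} observed while the task ran.
    static long[] measure(Runnable task) {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        long loadedBefore = bean.getTotalLoadedClassCount();
        long unloadedBefore = bean.getUnloadedClassCount();
        task.run();
        return new long[] {
            bean.getTotalLoadedClassCount() - loadedBefore,
            bean.getUnloadedClassCount() - unloadedBefore
        };
    }

    public static void main(String[] args) {
        // Touching a class that may not be loaded yet can trigger class loading.
        long[] delta = measure(() -> new java.util.concurrent.ConcurrentSkipListMap<String, String>());
        System.out.println("Loaded during interval:   " + delta[0]);
        System.out.println("Unloaded during interval: " + delta[1]);
    }
}
```

Both counters are cumulative since JVM start, so the differences are never negative. Other threads may load classes during the interval, which means the result describes the whole JVM rather than the task alone.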
Event Capturing Mechanism
So far, we’ve covered the different elements of a monitoring system architecture, looked at the agent (which starts the monitoring process), and explored the types of data collected for monitoring and how that data is captured. The ultimate goal of JVM monitoring is to capture various monitoring events. There are several mechanisms for capturing monitoring events, and some of the most commonly used ones are outlined below.
The JVM Tool Interface (JVMTI)
The JVM Tool Interface (JVMTI) is an interface that facilitates interaction between the VM and the agent. It offers essential capabilities for inspecting and extracting state and execution information from within the JVM. It provides a rich set of APIs to access the JVM state and allows third parties to develop monitoring and profiling tools.
For example, when class loading exceptions or problems occur, it is wise to investigate the number of classes loaded by the process. The JVMTI function GetLoadedClasses offers a way to get that specific information.
The example below shows how the GetLoadedClasses JVMTI function is called to monitor class loading:
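jvmtiEnv *jvmti;          // JVMTI environment, obtained when the agent is loaded
//...
jint class_count = 0;     // receives the number of loaded classes
jclass *classes = NULL;   // receives an array allocated by the JVM
jvmtiError err = jvmti->GetLoadedClasses(&class_count, &classes);
// When done, release the array with jvmti->Deallocate((unsigned char *)classes).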
To learn about more JVMTI functions, refer to JVM(TM) Tool Interface 1.2.3.
Java Management Extensions (JMX)
The JVM has built-in capabilities that enable it to monitor and manage itself, using Java Management Extensions (JMX) technology. JMX exposes these capabilities through platform MXBeans. A platform MXBean is a managed object that helps with monitoring and managing a component of the JVM. It works in both directions: a management client can read data from the component and can also control it. Each MXBean encapsulates a part of VM functionality, such as the thread system, operating system, memory system, garbage collector, and so on. Notable interfaces are:
- ThreadMXBean - offers several APIs that can track thread details such as thread state, peak thread count, and thread CPU time from the JVM.
- OperatingSystemMXBean - provides various APIs that can record details such as system load and the number of available processors.
- MemoryMXBean - provides APIs to inspect heap and non-heap memory usage.
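As a minimal sketch, the following reads a couple of values from each of the three beans listed above: the thread bean, the operating system bean (OperatingSystemMXBean in java.lang.management), and the memory bean. The class name MXBeanSnapshot and the metric key names are hypothetical. Only standard java.lang.management calls are used.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.OperatingSystemMXBean;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: one snapshot drawn from three platform MXBeans.
class MXBeanSnapshot {
    static Map<String, Number> take() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();

        Map<String, Number> snapshot = new LinkedHashMap<>();
        snapshot.put("threads.live", threads.getThreadCount());
        snapshot.put("threads.peak", threads.getPeakThreadCount());
        snapshot.put("os.availableProcessors", os.getAvailableProcessors());
        // The load average is negative when the platform cannot supply it.
        snapshot.put("os.systemLoadAverage", os.getSystemLoadAverage());
        snapshot.put("memory.heapUsedBytes", memory.getHeapMemoryUsage().getUsed());
        snapshot.put("memory.nonHeapUsedBytes", memory.getNonHeapMemoryUsage().getUsed());
        return snapshot;
    }

    public static void main(String[] args) {
        take().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```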
When we encounter performance issues, high CPU utilization (80% or higher) is one of the major causes. To troubleshoot it, monitoring CPU profile data such as total CPU usage and thread-specific CPU usage is a good option.
For example, ThreadMXBean provides a way to measure the total CPU time used by a thread. Below, we give an example that uses ThreadMXBean to extract thread CPU time:
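ThreadMXBean tmxb = ManagementFactory.getThreadMXBean();
// aThreadID is the ID of the thread of interest, e.g. Thread.currentThread().getId()
// The result is in nanoseconds, or -1 if the thread is not alive or CPU time measurement is disabled
long cpuTime = tmxb.getThreadCpuTime(aThreadID);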
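Extending the call above to every live thread gives a rough thread-specific CPU breakdown. The following is a self-contained sketch. The class name ThreadCpuReport and the "name#id" key format are hypothetical, and the sketch returns an empty result on JVMs where thread CPU time measurement is unsupported or disabled.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration: CPU time per live thread via ThreadMXBean.
class ThreadCpuReport {
    // Maps "threadName#id" to CPU time in nanoseconds; threads without data are skipped.
    static Map<String, Long> cpuTimeByThread() {
        ThreadMXBean tmxb = ManagementFactory.getThreadMXBean();
        Map<String, Long> result = new TreeMap<>();
        if (!tmxb.isThreadCpuTimeSupported() || !tmxb.isThreadCpuTimeEnabled()) {
            return result;
        }
        for (long id : tmxb.getAllThreadIds()) {
            ThreadInfo info = tmxb.getThreadInfo(id); // null if the thread has ended
            long cpuNanos = tmxb.getThreadCpuTime(id); // -1 if the thread has ended
            if (info != null && cpuNanos >= 0) {
                result.put(info.getThreadName() + "#" + id, cpuNanos);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        cpuTimeByThread().forEach((thread, nanos) ->
            System.out.println(thread + ": " + (nanos / 1_000_000) + " ms"));
    }
}
```

Threads can end between the call that lists the IDs and the calls that query each one, which is why both results are checked before an entry is recorded.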
To learn more about the JMX interface, refer to java.lang.management.
Tracing
The JVM's built-in tracing is a facility provided by the JVM itself, designed to aid in diagnosing issues with minimal performance impact. It captures the execution flow of the application with fine-grained control over what to trace, when to trace, and how to trace. For example, we can trace methods from a specific class or package, start or stop tracing when a specific event occurs in the JVM, or define the actions to be performed when a specific trace point is hit. With this feature, we can trace VM internal operations, Java applications and methods, or any combination of these. It is particularly useful when troubleshooting specific issues, identifying performance bottlenecks, or gaining insights into the interactions between various components. It provides granular details about method calls and system interactions, enabling precise root cause analysis. This level of detail is crucial for optimizing performance, identifying inefficient code paths, understanding complex workflows, and building a comprehensive view of an application’s runtime behavior.
An example of the default tracing option is given below:
The default trace option ensures that Java dumps always contain a record of the most recent memory management history.
“A Javadump is a diagnostic file generated by the JVM that captures a snapshot of all threads and their states at a specific moment, aiding in debugging and performance analysis.”
The ‘exception=j9mm{gclogger}’ clause of the default trace set specifies that a history of the garbage collection cycles that have occurred in the VM is continuously recorded.
Suppose we need to monitor detailed information about a thread end event, that is, information about the thread that is terminated along with the thread that is responsible for terminating it. Such details can be extracted using the tracing option.
To explore more about the -Xtrace option and tracing, refer to the -Xtrace - IBM Documentation.
Summary and next steps
In this article, we introduced Java application monitoring, the essential ingredients of JVM monitoring, metrics and their constituent events, and the mechanisms for capturing those events. If you’re interested in learning more about monitoring, check out the next article in this series, Java Monitoring 101: Tools, where we explore common monitoring tools that are used in the Java ecosystem.
Alternatively, take a look at this free, in-depth learning course on edX: [Monitoring and Observability for Application Developers](IBM:Monitoring and Observability for Application Developers). This course provides a comprehensive overview of monitoring and observability, and teaches you the hands-on skills to employ monitoring, observability, and logging for your application.