Why Your App Freezes at the Worst Possible Time
You've built a system optimized for speed and scale, handling thousands of API calls per second. But as traffic surges, maintaining SLA compliance becomes increasingly challenging. The logs show no anomalies and CPU usage appears healthy, yet the user experience is suffering. Why?
Behind the scenes, the JVM may be pausing your application threads to reclaim memory. Such stoppages are called garbage collection (GC) pauses, and while they are normal, they can wreak havoc in systems with strict response time service level agreements (SLAs). Even a 200 ms pause can cause hundreds of requests to stall, disrupting dependent transactions and causing poor user experiences.
For workloads where predictability is non-negotiable—financial transactions, healthcare systems, IoT—these pauses aren’t just inconvenient; they’re business-critical risks.
While the default Gencon GC policy in IBM Semeru Runtimes is great for throughput, its stop-the-world (STW) phases may introduce unpredictable latency spikes. Enter Pause-less GC mode: a solution designed to make pauses shorter and more predictable. GC pauses still occur, but they are significantly reduced, helping your app stay responsive and meet SLAs. The result? Lower latency, better predictability, and peace of mind for mission-critical systems.
Why Response Time Matters
Modern applications often operate under strict SLAs—such as 50 ms per transaction for payment gateways or retail systems. Missing these targets doesn’t just degrade performance; it can lead to:
- Lost revenue from abandoned transactions.
- Regulatory penalties in finance or healthcare.
- Customer churn due to poor experience.
In high-volume environments, even small latency spikes can cascade into massive backlogs. A single GC pause may stall thousands of requests.
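As a rough back-of-the-envelope sketch, the number of requests caught behind a pause is the arrival rate multiplied by the pause length. The class name, the steady-arrival assumption, and the 5,000 requests per second figure below are illustrative, not measurements from any system described here.

```java
// Back-of-the-envelope estimate; assumes requests arrive at a steady rate.
final class StallEstimate {
    /** Requests that arrive (and must wait) during one stop-the-world pause. */
    static long requestsStalled(long requestsPerSecond, long pauseMillis) {
        return requestsPerSecond * pauseMillis / 1000;
    }

    public static void main(String[] args) {
        // Hypothetical load: 5,000 requests/s hit by a single 200 ms pause.
        System.out.println(requestsStalled(5_000, 200)); // prints 1000
    }
}
```

Real traffic is burstier than this model, so the actual backlog after a pause can be larger than the estimate.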
The Challenge with Traditional GC
The Gencon GC policy in IBM Semeru Runtimes is designed to strike a balance between throughput and pause times, making it a great choice for many workloads.
However, Gencon relies on STW phases during collection cycles, which can introduce unpredictable latency spikes under heavy load. For applications with strict SLAs, even occasional pauses of 100–200 ms can cause cascading delays. This isn’t a flaw—it’s simply a trade-off that works well for throughput-focused systems but can be challenging for latency-sensitive environments.
A common approach to reducing GC pauses is to allocate much larger heaps, which lowers the frequency of collections. Another option is to add more hardware threads to improve GC parallelism. While these strategies can help, they come with higher infrastructure costs and complexity. Instead of scaling hardware or memory, a more strategic approach is to change how garbage collection operates—this is where Pause-less GC comes in. Please refer to IBM Semeru Runtimes GC Documentation for more information about Gencon and STW phases.
Introducing Pause-less GC Mode
Pause-less GC is a special mode of the Gencon GC policy in IBM Semeru Runtimes, designed to deliver more predictable pause times for applications that require consistent response times. It builds on Gencon's generational approach and introduces advanced techniques to further reduce pause duration, primarily by collecting short-lived objects in the Java heap concurrently with application threads.
Key features include:
- Concurrent work: GC tasks run alongside application threads, with short-lived object collection parallelized to minimize pause times.
- Incremental processing: Large operations are broken into smaller steps to avoid long pauses.
- Pause smoothing: Instead of rare but large pauses, Pause-less GC introduces smaller, more frequent pauses that are easier to absorb.

To put the two into perspective, picture driving on a highway. With Gencon GC, you hit STOP signs: every car halts completely while cleanup happens, causing long delays. Pause-less GC replaces those stops with YIELD signs: brief slowdowns instead of full stops. The result? A steady, predictable flow instead of frustrating traffic jams.
And on IBM Z, Pause-less GC gets an extra boost by leveraging the Guarded Storage Facility (GSF) introduced in IBM z14 hardware. GSF accelerates concurrent GC operations by providing hardware-assisted memory protection, reducing pathlength for reference checks and improving pause-time predictability even further. Please refer to IBM Semeru Runtimes GC documentation for further information on Pause-less GC.
Important: Pause-less GC eliminates most STW pause time but does not remove it entirely. While average pause times should improve with Pause-less GC, the worst-case pause time may not be lower than with the default Gencon GC.
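To see why many short pauses are easier to absorb than a few long ones, here is a deliberately simplified model. It assumes steady arrivals, a fixed service time, and the same total pause time in both scenarios. All of these are illustrative assumptions rather than properties of Pause-less GC, and the class name and numbers are hypothetical.

```java
// Simplified model: steady arrivals, fixed service time, and no queueing
// effects beyond the pause itself. All numbers are hypothetical.
final class PauseSmoothing {
    /**
     * Requests that miss the SLA because of one pause.
     * A request arriving t ms into a pause waits (pauseMs - t) ms, so it
     * misses the SLA when pauseMs - t + serviceMs > slaMs.
     */
    static double missedPerPause(double ratePerMs, double pauseMs,
                                 double serviceMs, double slaMs) {
        double slackMs = slaMs - serviceMs;  // wait a request can absorb
        double badWindowMs = Math.max(0.0, pauseMs - slackMs);
        return ratePerMs * badWindowMs;
    }

    public static void main(String[] args) {
        double rate = 5.0;     // 5 requests per millisecond
        double service = 5.0;  // 5 ms of work per request
        double sla = 50.0;     // 50 ms response time limit
        // The same 300 ms of total pause time, distributed differently.
        double rare = 1 * missedPerPause(rate, 300.0, service, sla);
        double smooth = 30 * missedPerPause(rate, 10.0, service, sla);
        System.out.println(rare);   // prints 1275.0
        System.out.println(smooth); // prints 0.0
    }
}
```

In this model, any pause shorter than the slack between the SLA limit and the service time causes no misses at all, which is the intuition behind pause smoothing.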
Is this right for your application?
Before jumping further into Pause-less GC, it’s important to know whether your application will actually benefit from it. A quick way to check is by analyzing your verbose GC logs. If you see high average pause times—especially if they exceed your SLA targets—your application could potentially benefit from Pause-less GC.
To illustrate this, we used a benchmark that mirrors real-world complexity. This multi-tier retail workload simulates in-store and online transactions, payment processing, inventory updates, and analytics like sales trend reporting—all running concurrently. The goal is to stress the JVM under realistic conditions and measure two critical factors: throughput and SLA compliance, where compliance means 99% of transactions must be completed within the specified time limit. The benchmark enforces three SLA tiers—10 ms for ultra-low latency, 50 ms for interactive workloads, and 100 ms for predictable high-volume operations.
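The compliance rule described above (99% of transactions within the limit) can be sketched as follows. This is a minimal illustration, not the benchmark's own code: the class name, method names, and sample latencies are hypothetical.

```java
import java.util.Arrays;

// Minimal sketch of a "99% within the limit" check; not the benchmark's code.
final class SlaCheck {
    /** True when at least 99% of the recorded response times are within the limit. */
    static boolean meetsSla(double[] latenciesMs, double limitMs) {
        if (latenciesMs.length == 0) {
            throw new IllegalArgumentException("no samples recorded");
        }
        long within = Arrays.stream(latenciesMs).filter(ms -> ms <= limitMs).count();
        // Integer arithmetic avoids floating-point rounding at the 99% boundary.
        return within * 100 >= 99L * latenciesMs.length;
    }

    public static void main(String[] args) {
        double[] samples = new double[200];
        Arrays.fill(samples, 8.0);
        samples[0] = 320.0; // one transaction caught behind a long pause
        System.out.println(meetsSla(samples, 10.0)); // prints true (199 of 200)
        samples[1] = 320.0;
        samples[2] = 320.0;
        System.out.println(meetsSla(samples, 10.0)); // prints false (197 of 200)
    }
}
```

With a 10 ms tier, only a handful of transactions delayed by a long pause is enough to fail the check, which is why average pause time matters so much for the tightest tier.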
To capture detailed garbage collection activity, enable verbose GC logging with -Xverbosegclog:<filename>, where <filename> specifies the path and file name of the output log. These insights are essential for determining whether Pause-less GC is the right choice for your application. Please refer to the IBM Semeru Runtimes documentation for further information on verbose GC logs.
The logs are generated in XML format and include a summary of GC configuration along with detailed information about each GC cycle—for example, start time, duration, cycle type, and the trigger for the cycle. To make analysis easier, you can use IBM Monitoring and Diagnostic Tool – Garbage Collection and Memory Visualizer (GCMV) to visualize and interpret the data:
- Load the verbose GC file into GCMV.
- Open the “Report” tab at the bottom of the GCMV window.
- In the “Report” tab, either click “Pause time” on the left or scroll down to the pause time section.
The pause times that GCMV reports for our multi-tier retail simulation benchmark running with Gencon GC highlight its limitations for latency-sensitive workloads. The application recorded an average pause time of 318 ms, with peaks reaching 333 ms. These long, unpredictable pauses made it impossible to meet SLA targets without throttling throughput. If your application exhibits similar characteristics (high pause times and strict SLA requirements), it could also benefit from Pause-less GC, which is designed to reduce and smooth pause duration for more predictable performance under load.
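If you would rather script a quick check than open GCMV, a sketch like the one below can summarize pause durations straight from the log. It assumes that each stop-the-world pause is closed by an <exclusive-end> element with a durationms attribute, as in OpenJ9-style verbose GC logs. Verify that against your own log before relying on it, and expect the figures to differ somewhat from GCMV's report. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.DoubleSummaryStatistics;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PauseStats {
    // Assumption: every stop-the-world pause ends with an <exclusive-end>
    // element that carries its length in a durationms attribute.
    private static final Pattern EXCLUSIVE_END =
            Pattern.compile("<exclusive-end\\b[^>]*\\bdurationms=\"([0-9.]+)\"");

    /** Collects count, average, min, and max of the pause durations in the log text. */
    static DoubleSummaryStatistics summarize(String verboseGcLog) {
        DoubleSummaryStatistics stats = new DoubleSummaryStatistics();
        Matcher m = EXCLUSIVE_END.matcher(verboseGcLog);
        while (m.find()) {
            stats.accept(Double.parseDouble(m.group(1)));
        }
        return stats;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the file that was passed to -Xverbosegclog.
        DoubleSummaryStatistics s = summarize(Files.readString(Path.of(args[0])));
        System.out.printf("pauses=%d avg=%.1f ms max=%.1f ms%n",
                s.getCount(), s.getAverage(), s.getMax());
    }
}
```

A regular expression is used instead of an XML parser so that the sketch also works on the log of a JVM that is still running, where the closing root tag may not have been written yet.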
Enabling Pause-less GC
Pause-less GC mode is not on by default, but it can be enabled by specifying the -Xgcpolicy:gencon -Xgc:concurrentScavenge JVM options. To verify that Pause-less GC is enabled correctly with hardware assistance, confirm that the concurrentScavenger attribute in the verbose GC log is set as shown below:
<attribute name="concurrentScavenger" value="enabled, with H/W assistance" />
Pause-less Garbage Collection is available in 64-bit IBM Semeru Runtimes and IBM Java SDKs for z/OS and LoZ. If any of the requirements are not met, the -Xgc:concurrentScavenge option is ignored. For the complete list of requirements, please refer to the IBM Semeru GC Documentation.
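Because the option is silently ignored when a requirement is not met, it can be worth automating the verification step. The sketch below looks for the concurrentScavenger attribute in the verbose GC log and compares it with the value quoted above. The class and method names are hypothetical, as are the file names in the comment. Other values the attribute may take are not covered here, so anything other than the quoted string is simply reported as-is.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical launch line (app.jar and gc.xml are placeholder names):
//   java -Xgcpolicy:gencon -Xgc:concurrentScavenge -Xverbosegclog:gc.xml -jar app.jar
final class ConcurrentScavengeCheck {
    private static final Pattern ATTRIBUTE = Pattern.compile(
            "<attribute\\s+name=\"concurrentScavenger\"\\s+value=\"([^\"]*)\"");

    /** Returns the value of the concurrentScavenger attribute, if the log contains one. */
    static Optional<String> concurrentScavengerValue(String verboseGcLog) {
        Matcher m = ATTRIBUTE.matcher(verboseGcLog);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** True only for the exact value quoted in this article. */
    static boolean enabledWithHardwareAssist(String verboseGcLog) {
        return concurrentScavengerValue(verboseGcLog)
                .map("enabled, with H/W assistance"::equals)
                .orElse(false);
    }

    public static void main(String[] args) throws IOException {
        String log = Files.readString(Path.of(args[0]));
        System.out.println("concurrentScavenger = "
                + concurrentScavengerValue(log).orElse("(attribute not found)"));
        System.out.println("H/W-assisted Pause-less GC: " + enabledWithHardwareAssist(log));
    }
}
```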
Pause-less GC in Action
When we first ran this benchmark using the default Gencon GC policy, we faced a tough trade-off: to meet strict SLA targets, we had to reduce application throughput. Gencon’s stop-the-world pauses introduced unpredictable latency spikes, making it impossible to sustain high transaction rates without violating SLAs.
Pause-less GC helped us overcome that limitation. By introducing concurrent and incremental collection, it allowed us to maintain SLA compliance without throttling performance, resulting in higher throughput and smoother response times.
On IBM z17 running IBM Semeru Runtimes Certified Edition 17.0.15, our benchmark achieved up to a 44x reduction in average GC pause times with Pause-less GC. Beyond lowering the average pause, it also slashed the total time spent in GC and significantly reduced the minimum and maximum pause duration per cycle.
Throughput improved by up to 3.1x because Pause-less GC allowed the application to sustain higher transaction rates while still meeting SLA targets. Pause-less GC makes this possible by reducing and smoothing STW pauses, which removes the need for throughput throttling.
Conclusion
For applications where response time is mission-critical, unpredictable latency is the enemy. A single GC pause can cascade into thousands of stalled requests, breaking SLAs and damaging user trust. Traditional GC strategies like Gencon are excellent for throughput, but they trade predictability for performance—a compromise that doesn’t work when every millisecond matters.
Pause-less GC addresses this gap by prioritizing consistency under load. Instead of rare but disruptive spikes, it delivers smoother, shorter pauses that keep systems responsive even during peak traffic. This isn’t just a tuning option—it’s a safeguard for businesses where downtime equals lost revenue or regulatory risk.
If your business depends on consistent response times, Pause-less GC is worth considering.