Extract the best Java performance on IBM Power11 server with IBM Semeru Runtimes

By Julian Wang posted 3 days ago

  

IBM Power servers continue to be the trusted foundation for hybrid IT infrastructure, delivering exceptional performance both on-premises and in the cloud. IBM Power processors are among the most consistent and best-performing in the industry across a wide range of workloads, including Java workloads. Customers and partners alike highly value the cornerstone characteristics of Power servers: robustness, reliability, scalability, and security. With the general availability of IBM Power11 servers in July 2025, this long tradition is expected not only to continue but to improve further.

[Figure: Characteristics and properties of Power11 servers]

The IBM Semeru Runtimes are no-charge, production-ready, open-source binaries built with the OpenJDK class libraries and the Eclipse OpenJ9 JVM, which deliver the power and performance to run your Java applications when you need it most. Since Power10 became generally available in 2021, Java performance on Power has improved greatly through collaboration across the IBM hardware, software, and Semeru Runtimes teams. To extract the best performance from Java applications running on Power11 servers, use one of the following JDK releases or a later update. They were designed, tested, and tuned on Power11 and are available to download now:
  • IBM Semeru Runtime 21.0.7
  • IBM Semeru Runtime 17.0.15
  • IBM Semeru Runtime 11.0.27
  • IBM Semeru Runtime 8 jdk8u452
  • IBM SDK, Java Technology Edition, 8 SR8 FP45

If for any reason you are not able to upgrade the Semeru Runtimes JDK installed in your environment, you can avoid possible performance issues from an untuned JDK by running your Power11 system in Power10-compatibility mode.
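
As a rough way to confirm which case applies to a given installation, the following minimal sketch compares the running JDK against the release levels listed above. The class and method names are hypothetical, Java 8 is left out because its "1.8.0_452" style version string needs separate parsing, and feature releases that are not on the list are simply reported as not matching.

```java
import java.util.Map;

/** Sketch: checks whether the running JDK is at or above a listed Power11-tuned level. */
final class SemeruLevelCheck {

    // Minimum levels copied from the release list above, keyed by feature release.
    // Java 8 is omitted: its version string format is not handled here.
    private static final Map<Integer, Runtime.Version> MINIMUM_LEVELS = Map.of(
            21, Runtime.Version.parse("21.0.7"),
            17, Runtime.Version.parse("17.0.15"),
            11, Runtime.Version.parse("11.0.27"));

    /** Returns true when the version meets the listed minimum for its feature release. */
    static boolean isPower11Tuned(Runtime.Version running) {
        Runtime.Version minimum = MINIMUM_LEVELS.get(running.feature());
        // Unlisted feature releases return false because the list does not cover them.
        return minimum != null && running.compareToIgnoreOptional(minimum) >= 0;
    }

    public static void main(String[] args) {
        Runtime.Version running = Runtime.version();
        // On Semeru builds, java.vm.name is expected to mention OpenJ9.
        System.out.println("VM: " + System.getProperty("java.vm.name"));
        System.out.println("Version: " + running);
        System.out.println("At or above a listed Power11-tuned level: " + isPower11Tuned(running));
    }
}
```

If the last line prints false on a Power11 system, that suggests either upgrading the JDK or using the Power10-compatibility mode described above.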

Once the Semeru Runtimes upgrade is in place, there is no need to modify application code to reap the performance benefits of Power11 servers, which puts customers on the quickest possible go-to-market track for products and Java applications targeting Power11. This also accelerates the time-to-value of clients' hardware investments while minimizing development and testing effort. Based on extensive data gathered from various Java applications, users moving from IBM Semeru Runtime 11.0.13 on Power10 to IBM Semeru Runtime 21.0.7 on Power11 can generally expect a performance improvement of up to 15% under identical resource settings (such as CPU count and physical memory size). The Power10 and Power11 machines used in the tests below have identical configurations: 8 cores (64 vCPUs) with 128GB of physical memory. Where a test uses only a subset of the machine resources, the same subset is used on both machines for a fair and valid comparison.

Performance of Java server workloads [2,3]

The Java server benchmark was designed to evaluate the performance of servers running enterprise Java applications. It simulates a multi-threaded, compute-intensive workload, mimicking an online retailer with point-of-sale transactions, inventory management, and data mining operations. It assesses performance under various conditions, including different response-time requirements, and is used to analyze system bottlenecks at hardware, OS, JVM, and application layers. There are two key performance metrics measured: Peak throughput represents the maximum throughput achievable by the system, while throughput under SLA (Service Level Agreement for response-time) represents the throughput under a specific response-time limitation with a certain level of resource contention.

With Power11's better memory latency and Semeru Runtimes' improvements resulting in shorter garbage-collection duration in particular, we are seeing peak throughput improved by up to 21% from Power10 to Power11, while throughput under SLA achieved up to a whopping 55% uplift at the same time.
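
The two percentages can be reproduced from the transactions-per-second figures given in notes 2 and 3 of the disclaimer. The snippet below is a small illustrative sketch of that arithmetic. The class and method names are hypothetical, and it assumes the uplift is the simple ratio of the Power E1180 figure to the Power E1080 figure.

```java
/** Sketch: percentage uplift for a higher-is-better metric such as throughput. */
final class ThroughputUplift {

    /** Percentage gain: (after / before - 1) * 100. */
    static double upliftPercent(double before, double after) {
        return (after / before - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Transactions per second from notes 2 and 3 (Power E1080, then Power E1180).
        double peak = upliftPercent(52_958, 64_297);
        double underSla = upliftPercent(13_397, 20_816);
        System.out.printf("Peak throughput uplift: %.1f%%%n", peak);
        System.out.printf("Throughput under SLA uplift: %.1f%%%n", underSla);
    }
}
```

With those inputs the results come out at roughly 21.4% and 55.4%, consistent with the rounded figures quoted above.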

[Figure: Java server workload peak throughput]
[Figure: Java server workload throughput under SLA]

Performance of Application Server Middleware [4,5]

Application servers are crucial for building and deploying modern web applications, providing the necessary infrastructure and services to handle complex business logic and user interactions efficiently and securely, either on-premises or in the cloud. IBM WebSphere Liberty is a lightweight, cloud-native Java application server designed for rapid development and deployment of modern Java enterprise applications, as well as microservices and cloud-native applications. Liberty is known for its ultra-fast startup time, low memory footprint, and modular architecture based on features. Liberty is also optimized for use in containerized environments like Kubernetes.

DayTrader7 is an application built around the paradigm of an online stock trading system. The application allows users to log in, view their portfolio, look up stock quotes, and buy or sell stock shares. With the aid of a web-based load driver such as Apache JMeter, the real-world workload provided by DayTrader7 can be used to measure and compare the performance of Java Enterprise Edition (Java EE) application servers offered by a variety of vendors. DayTrader7's design spans Java EE 7, including the WebSockets specification. Other Java EE features exercised include JSPs, Servlets, EJBs, JPA, JDBC, JSF, CDI, Bean Validation, JSON, JMS, MDBs, and transactions (synchronous and asynchronous/2-phase commit). Throughput is measured by running DayTrader7 on IBM WebSphere Liberty with an IBM Db2 database as the backend.

[Figure: DayTrader7 throughput]

Acme Air is an open-source benchmark application for Java microservices. It simulates a fictitious airline called Acme Air which handles flight bookings. The application was built with some key business requirements in mind: the ability to scale to billions of web API calls per day, the need to develop and deploy the application in public clouds (as opposed to dedicated pre-allocated infrastructure), and the need to support multiple channels for user interaction (with mobile enablement first and browser/Web 2.0 second). Throughput is measured by running Acme Air on IBM WebSphere Liberty with a MongoDB database as the backend.

[Figure: Acme Air throughput]

Because of the complexity of these three-tier environments, full-stack scalability and performance improvements are typically difficult to achieve. The double-digit improvements for both applications on Power11 reaffirm that IBM WebSphere Liberty with Semeru Runtimes on Power11 provides an excellent infrastructure foundation for business applications.

Performance optimization for OpenShift and container environments [6,7,8,9]

With Java application styles spanning from traditional monoliths to microservices to serverless applications (which can be provisioned in containers on demand), ultra-fast application start-up is a key factor in achieving high elasticity, responsiveness, and cost efficiency. Built upon the Linux CRIU technology, IBM WebSphere Liberty InstantOn on Linux on Power aims to reduce start-up time for Java applications by providing a seamless checkpoint/restore solution for developers. By taking a snapshot of your running JVM process and including it in your containerized application image, you can spin up new containers by quickly restoring from the checkpoint upon deployment.

On Linux® on Power11, enabling the InstantOn feature in IBM WebSphere Liberty 25.0.0.5 with IBM Semeru Runtime 21.0.7 provides, on average, up to 50 times faster start-up than running without it, and up to 24 times faster first-response time. Beyond these improvements, InstantOn retains full Java language support, high throughput, and a low memory footprint, and it works with existing developer tooling. The following charts compare start-up and first-response times using default Java options versus InstantOn technology, for a variety of IBM WebSphere Liberty applications measured with IBM Semeru Runtime 21.0.7 and IBM WebSphere Liberty 25.0.0.5 on Power11:

[Figure: PingPerf start-up]
[Figure: Acme Air start-up]
[Figure: PingPerf first-response]
[Figure: Acme Air first-response]
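
For timings, where lower is better, the speedup factor is the baseline time divided by the improved time. The sketch below applies that to the start-up and first-response timings in notes 6 to 9 of the disclaimer and averages the two applications. The class and method names are hypothetical, and the use of a plain arithmetic mean across the two applications is an assumption, since the post does not say how the averages were formed.

```java
/** Sketch: speedup factors for lower-is-better timings, averaged across applications. */
final class InstantOnSpeedup {

    /** Speedup factor: baseline seconds divided by improved seconds. */
    static double speedup(double baselineSeconds, double improvedSeconds) {
        return baselineSeconds / improvedSeconds;
    }

    /** Arithmetic mean of the given values (assumed averaging method). */
    static double mean(double... values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    public static void main(String[] args) {
        // Start-up seconds from notes 6 (PingPerf) and 7 (Acme Air).
        double startUp = mean(speedup(3.2, 0.058), speedup(6.5, 0.134));
        // First-response seconds from notes 8 (PingPerf) and 9 (Acme Air).
        double firstResponse = mean(speedup(3.35, 0.144), speedup(6.65, 0.265));
        System.out.printf("Average start-up speedup: %.1fx%n", startUp);
        System.out.printf("Average first-response speedup: %.1fx%n", firstResponse);
    }
}
```

Under these assumptions the averages land near 52x for start-up and 24x for first response.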

If you are interested in more technical details on how Semeru InstantOn uses the Linux CRIU technology, check out this article. For more information on using InstantOn with Open Liberty, see the Open Liberty InstantOn documentation.

Performance of DaCapo benchmark suite [10]

The DaCapo benchmark suite is a collection of open-source, real-world Java applications designed for performance analysis and benchmarking. It incorporates the latest Java features and uses popular Java frameworks as well. It's a valuable tool for researchers and developers working with Java, particularly in areas like garbage collection, memory management, and compiler optimization. The suite includes various applications with non-trivial memory loads, providing a realistic environment for evaluating Java Virtual Machine (JVM) and system performance.

The DaCapo benchmark suite spans a wide range of real Java SE application scenarios, including cassandra, biojava, h2o, eclipse, sunflow, spring, and tomcat. These cover a lot of ground in popular Java programming frameworks, from databases, genomics, AI, and image processing to microservice middleware and web applications. The 14.4% geomean uplift from generation to generation is a significant demonstration of the close collaboration between the Semeru Runtimes and Power11 hardware teams. This result speaks volumes about the versatility of Semeru Runtimes, its ability to leverage Power11 strengths, and its own sustained performance improvements over the years.

[Figure: DaCapo performance]
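
A geometric mean is a common way to summarize a suite of sub-benchmarks with very different run times, because no single long-running benchmark dominates the result. The following is a minimal sketch of the calculation. The class name and the per-benchmark times are hypothetical and are not measured DaCapo data.

```java
/** Sketch: geometric mean of per-benchmark completion times. */
final class GeometricMean {

    /** Geometric mean computed in log space, which avoids overflow for long inputs. */
    static double geomean(double... values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("at least one value is required");
        }
        double logSum = 0.0;
        for (double v : values) {
            if (v <= 0.0) {
                throw new IllegalArgumentException("values must be positive");
            }
            logSum += Math.log(v);
        }
        return Math.exp(logSum / values.length);
    }

    public static void main(String[] args) {
        // Hypothetical completion times in seconds for three sub-benchmarks.
        double before = geomean(2.0, 8.0, 4.0);
        double after = geomean(1.0, 8.0, 4.0);
        System.out.printf("Geomean before: %.3f s%n", before);
        System.out.printf("Geomean after: %.3f s%n", after);
        System.out.printf("Geomean speedup: %.3fx%n", before / after);
    }
}
```

Because the geometric mean of ratios equals the ratio of geometric means, the same speedup results whether the per-benchmark ratios are averaged or the two suite-level geomeans are divided.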

Performance of Big Data workloads [11]

Enterprise data are the best fuel for accurate and differentiated AI that is relevant to your industry and your clients to drive competitive advantage. However, 90% of enterprise data are unstructured data which have largely remained inaccessible and underutilized for Gen AI. IBM watsonx.data is the only hybrid, open-data lakehouse for enterprise AI and analytics. So now, you can access, prepare, and deliver your enterprise unstructured data to achieve 40% more accurate AI than conventional RAG with IBM watsonx.data. We picked one application sample as the benchmark for performance comparison: TPC-H on Apache Spark with Amazon S3 data.

[Figure: Big data performance]

The nearly 2x faster result (run time reduced from 9,738 to 5,058 seconds) showcases both the software enhancements in Semeru Runtimes and the suitability of Power11 for big data workloads. Refocusing our performance work on watsonx.data workloads, together with close collaboration between hardware and software teams across IBM, has paid off.

Semeru Runtimes: your first choice for Java performance on Power11 

The performance results presented above demonstrate the vibrancy of Java on the Power platform, driven by the investment in technical innovation in Semeru Runtimes over the past few years. They also show that Semeru Runtimes suits a wide spectrum of Java applications on Power11 servers, helping customers reach the return on their hardware and software investments quickly. Strong performance across so many popular Java frameworks should also help clients meet their go-to-market goals faster.


How to acquire IBM Semeru Runtimes

For both Linux and AIX on Power, IBM Semeru Runtime Open Edition and Certified Edition are available from this website.

Disclaimer(*)

  1. All performance data contained in this blog were obtained in the specific operating environment and under the conditions described above and are presented as an illustration. Performance obtained in other operating environments may vary and clients/readers should conduct their own testing.
  2. Based on 8-core configuration with 128GB physical memory at 100% utilization under typical operating conditions where Power E1080 peak throughput is 52,958 transactions per second, Power E1180 peak throughput is 64,297 transactions per second. Performance results collected using Java server benchmark. For 8-core with 128GB physical memory configuration, two groups were run on the SUT (System Under Test) for throughput measurement. Run configurations may change and results may vary.
  3. Based on 8-core configuration with 128GB physical memory at 100% utilization under typical operating conditions where Power E1080 throughput under SLA is 13,397 transactions per second, Power E1180 throughput under SLA is 20,816 transactions per second. Performance results collected using Java server benchmark. For 8-core with 128GB physical memory configuration, two groups were run on the SUT (System Under Test) for throughput measurement. Run configurations may change and results may vary.
  4. Application server JVM was bound to run on 8 logical processors with 1GB Java heap at 100% utilization under typical operating conditions where Power E1080 throughput is 5,507 pages/s, Power E1180 throughput is 6,362 pages/s. DayTrader application: https://github.com/WASdev/sample.daytrader7. Results may vary.
  5. Acme Air application JVM was bound to run on 8 vCPUs with 2GB Java heap at 100% utilization under typical operating conditions where Power E1080 throughput is 20,395 tps, Power E1180 throughput is 23,362 tps. Acme Air application container JVM used options: -Xms2048m -Xmx2048m -Xshareclasses:none; JMeter was used to drive both workloads. Acme Air: https://github.com/acmeair/acmeair. Results may vary.
  6. Based on 8-core configuration with 128GB physical memory under typical operating conditions where Power E1080 started up the PingPerf application server container in 3.2 seconds, Power E1180 started up the PingPerf application server container with InstantOn technology in 0.058 seconds. PingPerf: https://sourceforge.net/projects/pingperf. Timing results may vary.
  7. Based on 8-core configuration with 128GB physical memory under typical operating conditions where Power E1080 started up the Acme Air application server container in 6.5 seconds, Power E1180 started up the Acme Air application server container with InstantOn technology in 0.134 seconds. Acme Air: https://github.com/acmeair/acmeair. Timing results may vary.
  8. Based on 8-core configuration with 128GB physical memory under typical operating conditions where the PingPerf application container on Power E1080 had the first transactional response in 3.35 seconds, the PingPerf application container on Power E1180 with InstantOn technology had the first transactional response in 0.144 seconds. Timing results may vary.
  9. Based on 8-core configuration with 128GB physical memory under typical operating conditions where the Acme Air application container on Power E1080 had the first transactional response in 6.65 seconds, the Acme Air application container on Power E1180 with InstantOn technology had the first transactional response in 0.265 seconds. Timing results may vary.
  10. Based on 8-core configuration with 128GB physical memory under typical operating conditions where Power E1080 completed each sub-benchmark in 7.422 seconds geomean, Power E1180 completed each sub-benchmark in 6.516 seconds geomean. DaCapo benchmark suite: https://www.dacapobench.org. Results may vary.
  11. Based on 4-core configuration with 64GB physical memory at 100% utilization under typical operating conditions where Power E1080 ran to completion in 9,738 seconds, Power E1180 ran to completion in 5,058 seconds. Spark: https://spark.apache.org. Timing results may vary.
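
Readers who repeat these measurements may want to confirm that processor binding and heap options such as those in notes 4 and 5 actually took effect. The following is a small sketch that reports what the JVM itself sees. The class name is hypothetical and the calls are standard Java SE APIs rather than anything specific to Semeru Runtimes.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Sketch: reports the processor count, maximum heap, and JVM options seen by the JVM. */
final class JvmSettingsReport {

    /** Maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Options passed to the JVM itself, for example -Xms2048m -Xmx2048m.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Logical processors visible to the JVM: "
                + Runtime.getRuntime().availableProcessors());
        System.out.println("Maximum heap (MiB): " + maxHeapMiB());
        System.out.println("JVM arguments: " + jvmArgs);
    }
}
```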
