AI on IBM Z & IBM LinuxONE


Leverage AI on IBM Z & LinuxONE to enable real-time AI decisions at scale, accelerating your time-to-value, while ensuring trust and compliance


Unlocking business growth and operational efficiency with AI on IBM LinuxONE

By Elpida Tzortzatos posted Tue May 06, 2025 12:06 AM


Artificial intelligence (AI) has been a hot topic in technology for the past decade, but generative AI has thrust AI into global headlines and launched a surge of AI innovation and adoption.

In my conversations with clients this past year, I’ve heard excitement about the business potential of AI – faster and more accurate insights that lead to more business value, and operational efficiencies from generative and agentic AI. At the same time, I hear validation of the challenges enterprises face in scaling AI – the strain on infrastructure and cost of supporting more than 100x growth in model parameters [1], a potential 10x increase in datacenter energy consumption driven by AI workloads [2], meeting low-latency requirements for high-volume workloads, and security concerns around data privacy for AI model use.

With these increased AI workloads, having the right infrastructure is at the core of creating and delivering value with AI. IBM® LinuxONE 5 adds hardware capabilities in the IBM Telum II™ processor and the IBM Spyre™ Accelerator (expected to be available in 4Q 2025 via PCIe card), along with a supporting software stack, designed to enable AI deployment in a scalable, power-efficient, and secured way.

Telum II™ and multiple model AI techniques 

In IBM LinuxONE 5, multiple hardware enhancements widen the set of AI models supported for inferencing acceleration. IBM LinuxONE 5 is designed to process up to 5 million inference operations per second with less than 1 millisecond response time using a credit card fraud detection deep learning model [3]. Additionally, running AI workloads is more power efficient on LinuxONE – we saw power-consumption savings of up to 83% when replacing a compared x86 solution comprised of two-year-old servers running AI-infused OLTP workloads with an IBM LinuxONE 5 [4].
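
To put the cited throughput figure in perspective, the arithmetic below works out what the claimed rate implies at the batch size used in the cited benchmark. This is illustrative capacity math derived from the numbers quoted above, not a measurement:

```python
# Illustrative capacity math only, using the figures cited above
# (5M inference ops/sec, batch size 160 per the benchmark disclaimer).

OPS_PER_SEC = 5_000_000   # claimed inference operations per second
BATCH_SIZE = 160          # batch size used in the cited benchmark

batches_per_sec = OPS_PER_SEC / BATCH_SIZE
print(f"Batches per second: {batches_per_sec:,.0f}")   # 31,250

# Daily volume if that rate were sustained around the clock:
ops_per_day = OPS_PER_SEC * 86_400
print(f"Inferences per day: {ops_per_day:,}")          # 432,000,000,000
```

At that rate, a single system would clear roughly 432 billion scoring operations per day, which is why sub-millisecond, in-transaction scoring matters for high-volume workloads.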

  • The Telum II processor features the second generation of the on-chip AI accelerator, allowing AI inferencing to happen as close to transactions as possible for optimized response-time performance.
  • Telum II adds a new data processing unit (DPU) engineered to accelerate complex input/output (I/O) protocols for networking and storage on LinuxONE 5, while reducing the power needed for I/O management by over 90% compared to LinuxONE 4 [5].
  • The accelerator adds advanced compute primitives for encoder large language models (LLMs), newly added INT8 quantization support, and enhanced matrix operations for efficiency.
  • Intelligent routing capabilities enable applications to access all the AI accelerators across the processor drawer, even those not on the same physical chip.
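
For readers unfamiliar with INT8 quantization, the sketch below shows the general technique in plain Python: real-valued weights are mapped to 8-bit integers with a scale and zero-point, trading a small accuracy loss for much cheaper arithmetic. Telum II’s internal scheme is not documented here; this is the textbook affine method:

```python
# Minimal sketch of asymmetric (affine) INT8 quantization -- the general
# technique, not Telum II's internal implementation. Floats are mapped to
# 0..255 with a scale and zero-point, then dequantized to check the error.

def quantize(values, num_bits=8):
    qmax = 2 ** num_bits - 1          # 255 for unsigned 8-bit
    lo, hi = min(values), max(values)
    scale = (hi - lo) / qmax or 1.0   # guard against constant input
    zero_point = round(-lo / scale)
    q = [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.5, -0.2, 0.0, 0.7, 2.3]
q, scale, zp = quantize(weights)
restored = dequantize(q, scale, zp)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, round(max_err, 4))
```

The round-trip error is bounded by about half the scale, which is why 8-bit inference often matches full-precision accuracy closely while moving a quarter of the data of FP32.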

With support for multiple AI models, clients can combine the power of predictive AI with encoder LLMs to further enhance their models for business use cases. IBM LinuxONE 5 is designed to help prevent financial fraud with advanced multiple-model AI techniques, addressing a challenge that cost banks $448 billion globally in 2024 [6]. Additionally, IBM LinuxONE 5 is designed to help prevent insurance fraud with the same multiple-model approach, effectively identifying urgent home insurance claims for swift resolution and improved customer experience – a challenge that costs the insurance industry $83 billion globally [6].
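
A multiple-model technique of this kind can be as simple as blending the two models’ scores. The sketch below is a hypothetical illustration only – the function name, weights, and inputs are invented for this post, not an IBM reference design:

```python
# Hypothetical sketch of a multiple-model scoring step: a predictive
# (tabular) model's fraud probability is blended with a score from an
# encoder LLM run over the transaction's free-text fields. All names,
# weights, and thresholds here are illustrative assumptions.
import math

def combined_fraud_score(predictive_p, text_p, w_pred=0.7, w_text=0.3):
    """Blend two probabilities in logit space so neither model saturates."""
    def logit(p):
        p = min(max(p, 1e-6), 1 - 1e-6)   # clamp away from 0 and 1
        return math.log(p / (1 - p))
    z = w_pred * logit(predictive_p) + w_text * logit(text_p)
    return 1 / (1 + math.exp(-z))         # back to a probability

# The tabular model finds the transaction mildly suspicious, while the
# text model finds the narrative strongly fraud-like:
score = combined_fraud_score(predictive_p=0.60, text_p=0.95)
print(round(score, 3))
```

Blending in logit space rather than averaging raw probabilities keeps a single confident model from dominating, which is one common way a text signal can sharpen a borderline predictive score.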

Spyre™ Accelerator and generative AI

To support our clients with generative AI capabilities, we have continued to invest in IBM Research innovation that we are bringing to market. One such innovation is the Spyre Accelerator, which, upon its availability in 4Q 2025 via PCIe card, aims to transform the user experience on the IBM LinuxONE 5 platform by running generative AI encoder and decoder models that enable use cases such as document summarization and image analysis. Clients can pursue AI business goals, user and operational efficiencies, and productivity gains by leveraging generative AI chatbots such as IBM watsonx Assistant for Z and LinuxONE.

AI open standards and tooling optimized for IBM LinuxONE 5

Having the right hardware for AI inferencing acceleration is foundational, but an optimized software stack is also needed to leverage the hardware capabilities. We have provided AI open standards and tooling optimized for IBM LinuxONE 5, through popular frameworks in the AI Toolkit for LinuxONE and through support for Red Hat OpenShift AI.

  • The AI Toolkit for LinuxONE includes popular open-source AI frameworks and tooling that are optimized to take full advantage of Telum II and are paired with IBM Elite Support, including:
    • PyTorch
    • TensorFlow
    • TensorFlow Serving
    • NVIDIA Triton Inference Server
    • SnapML
    • IBM Z Deep Learning Compiler
  • Red Hat OpenShift AI on IBM LinuxONE combines Red Hat’s expertise with the powerful capabilities of IBM LinuxONE, offering organizations a robust foundation for building, training, deploying, and monitoring AI models across hybrid cloud settings. While the current Tech Preview provides essential functionality, such as model serving, it also points to future enhancements that could expand its capabilities significantly. Organizations can look forward to:
    • Simplified AI adoption, providing freedom of choice and access to the latest innovations in AI/ML technologies.
    • Improved operational consistency by streamlining the process of moving models from experiments to production.
    • Greater flexibility in deploying models across various environments, including on-premises, cloud, and edge locations.

With advanced security, scalability, and reliability, Red Hat OpenShift AI is designed to unlock new possibilities for AI innovation for businesses aiming to stay ahead in their respective fields.
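
As a concrete example of how a model served through this stack can be called, the sketch below builds an inference request for NVIDIA Triton Inference Server’s KServe v2 HTTP API (which Triton exposes regardless of host platform). The model name, tensor names, and feature values are placeholder assumptions; the request is only constructed and inspected here, not sent:

```python
# Sketch of a KServe v2 inference request for NVIDIA Triton Inference
# Server. "fraud_detector", the tensor names, and the feature vector are
# placeholders invented for illustration.
import json

MODEL = "fraud_detector"          # placeholder model name
URL = f"http://localhost:8000/v2/models/{MODEL}/infer"

payload = {
    "inputs": [
        {
            "name": "input__0",   # placeholder input tensor name
            "shape": [1, 7],
            "datatype": "FP32",
            "data": [120.5, 1.0, 0.0, 3.2, 0.0, 1.0, 42.0],
        }
    ],
    "outputs": [{"name": "output__0"}],
}

body = json.dumps(payload)
print(URL)
print(body[:60] + "...")
# In a live setup:  requests.post(URL, data=body).json()["outputs"]
```

Because Triton speaks this standard protocol, client code like the above stays the same whether the model behind it runs on LinuxONE or elsewhere.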

We found that data scientists and AI practitioners wanted to use the tools they were familiar with on the platform of their choice, hence our investment in supporting and optimizing these popular open-source frameworks for AI agility on the LinuxONE platform.


Early access to Red Hat OpenShift AI on IBM LinuxONE is now available as Tech Preview.

IBM LinuxONE Ecosystem for AI

The AI capabilities described above are also available to our IBM LinuxONE Ecosystem partners, where Independent Software Vendors (ISVs) partner with IBM to provide enterprise solutions for today’s AI challenges. These partners can integrate their solutions with IBM LinuxONE and an optimized software stack, taking advantage of the platform’s native AI acceleration to drive business value for clients.

Take your next steps with AI on IBM LinuxONE

  1. To get started, we would love to host you for a no-charge AI on LinuxONE Discovery Workshop to help you evaluate potential use cases and define a project plan. If you would like to schedule a customizable workshop on any of the AI use cases or products, email us at aionz@us.ibm.com.
  2. To learn more about how you can unlock new business value by harnessing the power of Linux and AI, join us for the virtual event “Unlock the Potential of Linux and AI with IBM LinuxONE” on Tuesday, May 13, 2025 at 10am ET. Reserve your spot here.
  3. For more information about IBM LinuxONE’s AI capabilities, visit our AI on IBM LinuxONE webpage to explore how the platform enables you to deploy AI inferencing at scale. 

Disclaimers

  1. >100x (500x) parameter growth for LLMs: “The 7 Biggest Artificial Intelligence (AI) Trends In 2022” (forbes.com)
  2. 10x increase in energy consumption: “The AI Boom Could Use a Shocking Amount of Electricity” (Scientific American)
  3. DISCLAIMER: Performance result is extrapolated from IBM® internal tests running on IBM Systems Hardware of machine type 9175. The benchmark was executed with 1 thread performing local inference operations using an LSTM-based synthetic Credit Card Fraud Detection (CCFD) model (https://github.com/IBM/ai-on-z-fraud-detection) to exploit the IBM Integrated Accelerator for AI. A batch size of 160 was used. IBM Systems Hardware configuration: 1 LPAR running Red Hat® Enterprise Linux® 9.4 with 6 cores (SMT), 128 GB memory. Results may vary.
  4. DISCLAIMER: Based on IBM® internal performance tests running on IBM Systems Hardware of machine type 9175 compared to the same tests running on a commercially available enterprise server with 2x 28 Intel® Xeon® Gold 5420+ CPU @ 2.20 GHz. The MegaCard benchmark (https://github.com/IBM/megacard-standalone) is a containerized IBM WebSphere Liberty v24 online transaction processing (OLTP) application deployed on Red Hat® OpenShift® Container Platform (RHOCP) 4.17 on Red Hat Enterprise Linux® (RHEL) 9.4 with KVM. EDB Postgres for Kubernetes v1.25 is used as the database. The model extrapolated the test results to a typical, complete customer IT solution that includes production and non-production IT environments isolated from each other. On the IBM z17 side, the complete solution requires one IBM z17 Type 9175 MAX 136; on the x86 side, the complete IT solution requires 72 of the compared servers. Results may vary.
  5. CLAIM: The Telum II DPU reduces the power needed for I/O management for a large IBM LinuxONE Emperor 5 system by over 90% compared to a similarly configured IBM LinuxONE Emperor 4. 
    DISCLAIMER: Comparison based on IBM lab measurements for the difference in power required for supporting I/O for FICON and OSA in an expected large IBM Machine Type 9175 configuration based on an actual historical large IBM Machine Type 3931 configuration. IBM Machine Type 9175 is Max 208 with 23 TB memory, 56 active processors, 3 IBM Virtual Flash Memory, 14 ICA-SR 2.0, 7 PCIe+ I/O drawers with 69 FICON Express32 – 4P LX, 12 OSA-Express7S 1.2 GbE SX, 18 Network Express LR 10G, and 4 Crypto Express 8S (2 HSMs). The IBM Machine Type 3931 is configured to provide the same hardware capability. Results may vary.
  6. Banking industry fraud numbers are from the Celent paper “Mitigating fraud in the AI age” which was commissioned by IBM. 