AI on IBM Z & IBM LinuxONE

Leverage AI on IBM Z & LinuxONE to enable real-time AI decisions at scale, accelerating your time-to-value while ensuring trust and compliance.

Accelerating Enterprise AI with AI Toolkit for IBM Z and LinuxONE

By Abraham Varghese

Open source is the foundation of modern AI, driving rapid advancement through community collaboration, transparency, and flexibility. IBM has embraced this ecosystem by integrating leading open-source AI frameworks into the AI Toolkit optimized for IBM Z and LinuxONE. With the AI Toolkit, developers and data scientists can work with familiar tools enhanced by IBM Z's Integrated Accelerator for AI, the on-chip AI inferencing accelerator on Telum II, to efficiently build, scale, and deploy AI models. This approach combines the openness and agility of community-driven development with the enterprise-grade performance, security, and reliability that IBM Z is known for.

A key advancement is support for INT8 quantization, which significantly improves inference efficiency. Compared to non-quantized FP16 models, INT8 delivers up to 2× faster performance and up to 2× lower memory usage, reducing latency and resource demands. Combined with optimized open-source frameworks, the AI Toolkit increases efficiency through quantization and support for multiple model strategies, enabling data scientists to adopt hardware-optimized inference with minimal model changes.
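
To make this concrete, here is a minimal sketch of dynamic INT8 quantization using stock PyTorch APIs. The toy model and layer choice are illustrative only; the IBM Z Accelerated for PyTorch container may document its own recommended quantization path.

```python
# Minimal sketch: dynamic INT8 quantization of a small PyTorch model.
# The model and layer choices here are illustrative, not IBM-specific.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(64, 128),
    nn.ReLU(),
    nn.Linear(128, 2),
)
model.eval()

# quantize_dynamic swaps the listed module types for INT8 equivalents:
# weights are stored as int8 and activations are quantized on the fly.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 64)
with torch.no_grad():
    print(quantized(x))
```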

The AI Toolkit enables you to:

  • Train and build models on any platform
  • Integrate AI inferencing services into existing application runtimes (a minimal REST sketch follows this list)
  • Quickly deploy models on IBM LinuxONE, Linux on IBM Z, and zCX (z/OS Container Extensions)
  • Maintain stringent SLA requirements
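
As an illustration of the integration point above, the sketch below calls a model exposed over TensorFlow Serving's standard REST predict endpoint from plain Python. The host, port, and model name (fraud-scoring) are hypothetical placeholders, not a real deployment.

```python
# Minimal sketch: calling a served model from an existing application
# runtime over TensorFlow Serving's REST API. Endpoint and model name
# are placeholders.
import json
import urllib.request

def score(features):
    """Send one record to the predict endpoint and return its prediction."""
    payload = json.dumps({"instances": [features]}).encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:8501/v1/models/fraud-scoring:predict",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["predictions"][0]

# e.g. score([0.1, 0.4, 0.9]) returns the model's output for one record
```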

The AI Toolkit for IBM Z and IBM LinuxONE is a curated, enterprise-grade offering that provides IBM Elite Support for a select set of strategic open-source AI frameworks, including non-warranted machine learning, deep learning, and high-performance AI-serving libraries that are specifically optimized for IBM Z and LinuxONE.

Today, we’re pleased to announce a significant milestone in that journey. The AI Toolkit for IBM Z and LinuxONE has been enhanced with support for the new Telum II processor, now available on IBM z17 and LinuxONE 5. These updates unlock new levels of performance, efficiency, and real-time inferencing—all while preserving the industry-leading security, availability, and scalability clients expect from the platform.

Why This Matters: Telum II on IBM z17 and LinuxONE 5

The recent launch of IBM z17 and LinuxONE 5 introduced the Telum II processor, which delivers the most advanced on-chip AI acceleration in IBM Z history.

This advancement allows organizations to:

  • Run multiple-model AI inferencing—combining predictive AI with encoder-based LLMs to drive higher accuracy
  • Execute AI inferencing in-transaction, eliminating latency and enabling decisions in real time
  • Scale AI securely, with built-in resiliency and compliance support

What’s New in the AI Toolkit: Telum II-Optimized Components

All components of the AI Toolkit are designed to leverage IBM Z’s Integrated Accelerator for AI, the on-chip AI inferencing accelerator. With this release, we have specifically updated the following to take advantage of Telum II:

  • IBM Z Accelerated for PyTorch v1.2
  • IBM Z Accelerated for TensorFlow v1.4
  • IBM Z Accelerated Serving for TensorFlow v1.4
  • IBM Z Accelerated for Snap ML v1.4
  • IBM Z Accelerated for NVIDIA Triton Inference Server v1.4
  • IBM Z Deep Learning Compiler v5.0

These updates enable significantly improved AI inference throughput, support multiple AI model strategies, and offer a consistent experience across the AI lifecycle—from model training to real-time inferencing.

To better understand how these technologies work, and how to apply them across enterprise AI use cases, explore the product pages for each of the updated AI Toolkit components:

  • IBM Z Accelerated for PyTorch v1.2: Explore how PyTorch is optimized for Telum II to enable high-performance training and inferencing for deep learning and LLMs, with native support for encoder-based models and transformers. Ideal for AI teams working on NLP, computer vision, and hybrid AI strategies.
  • IBM Z Accelerated for TensorFlow v1.4: Learn how TensorFlow has been tuned for IBM Z to deliver efficient large-scale training and deployment using Telum II, supporting high-throughput AI inferencing across real-time enterprise applications.
  • IBM Z Accelerated Serving for TensorFlow v1.4: See how the serving layer has evolved to support high-speed, scalable TensorFlow inferencing in production. Perfect for customers who require responsive AI integrated into transactional systems.
  • IBM Z Accelerated for Snap ML v1.4: Read about Snap ML's enhancements for Telum II, offering efficient training and real-time scoring for traditional ML models such as decision trees and logistic regression. An essential tool for fraud detection, risk scoring, and compliance automation (see the Snap ML sketch after this list).
  • IBM Z Accelerated for NVIDIA Triton Inference Server v1.4: Discover how Triton Inference Server delivers flexible, scalable deployment of multiple AI models, supporting concurrent execution, dynamic batching, and ensemble models, now fully optimized for Telum II. Ideal for production environments requiring low-latency inferencing at scale across heterogeneous models (see the client sketch after this list).
  • IBM Z Deep Learning Compiler v5.0: Understand how zDLC allows developers to compile deep learning models (e.g., ONNX-based) into optimized binaries for execution on IBM Z and LinuxONE with Telum II. Designed to reduce dependencies, simplify deployment, and maximize performance across AI pipelines using Python, C++, or Java APIs (see the ONNX export sketch after this list).
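
For the Snap ML entry, here is a minimal sketch of its scikit-learn-style Python API, assuming the snapml package is installed. The synthetic data and the LogisticRegression choice are illustrative stand-ins for real transaction features in a fraud-scoring scenario.

```python
# Minimal sketch: sklearn-style training and real-time scoring with Snap ML.
# Data and model choice are illustrative, not a tuned fraud model.
import numpy as np
from snapml import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16)).astype(np.float32)
y = (X[:, 0] + X[:, 1] > 0).astype(np.float32)  # synthetic labels

clf = LogisticRegression()
clf.fit(X, y)                       # training, accelerated by Snap ML
scores = clf.predict_proba(X[:5])   # scoring on new records
print(scores)
```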
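
For the Triton entry, a minimal client-side sketch using the standard tritonclient HTTP API. The server URL, model name (risk_model), and tensor names (input, output) are placeholders that must match your deployed model's configuration.

```python
# Minimal sketch: invoking a model hosted on NVIDIA Triton Inference Server
# over HTTP. Names below are placeholders for a real deployment.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

data = np.random.rand(1, 16).astype(np.float32)
inp = httpclient.InferInput("input", data.shape, "FP32")
inp.set_data_from_numpy(data)

result = client.infer(model_name="risk_model", inputs=[inp])
print(result.as_numpy("output"))
```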
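
For the zDLC entry, a minimal sketch of the first step in that pipeline: exporting a PyTorch model to ONNX. The toy model is illustrative; compiling the resulting model.onnx into a native shared library is then done with the zDLC container outside Python.

```python
# Minimal sketch: exporting a PyTorch model to ONNX so it can later be
# compiled by the IBM Z Deep Learning Compiler. Model is a toy example.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

dummy = torch.randn(1, 16)  # example input that fixes the graph shape
torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["output"],
)
# The resulting model.onnx can then be compiled into an optimized binary
# with zDLC and invoked through its Python, C++, or Java APIs.
```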

Designed for the Full AI Lifecycle

The AI Toolkit aligns with IBM’s full-stack AI lifecycle for Z and LinuxONE:

1. Define & Prepare: Leverage IBM Synthetic Data Sets or client data

2. Build & Train: Use PyTorch, TensorFlow, or Snap ML frameworks

3. Format & Optimize: Convert models using ONNX, PMML, or JSON

4. Deploy & Serve: Use Triton Inference Server or TensorFlow Serving

5. Monitor & Govern: Built-in tooling for observability and compliance
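
As a small illustration of step 5, the sketch below probes a Triton-served model for liveness and readiness using the standard tritonclient HTTP API; the URL and model name are placeholders, and real monitoring would feed such checks into your observability tooling.

```python
# Minimal sketch: a readiness probe for a served model, as one building
# block of the Monitor & Govern step. URL and model name are placeholders.
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

if client.is_server_live() and client.is_model_ready("risk_model"):
    meta = client.get_model_metadata("risk_model")
    print("serving:", meta["name"], "versions:", meta.get("versions"))
else:
    print("model not ready")
```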

Whether you're starting with a pretrained model or building from scratch, the AI Toolkit enables secure, high-performance AI within existing enterprise workflows.

Get Started Today

Explore, build, and deploy with IBM Z and LinuxONE:

• AI on Z 101 – Framework guides and best practices

• Solution Templates – Guided templates with real-world use cases to jumpstart adoption

• Discovery Workshop – Contact AIonZ@us.ibm.com for a free-of-charge workshop where our team helps define a use case and an MVP proof of concept
