Open source is the foundation of modern AI, driving rapid advancement through community collaboration, transparency, and flexibility. IBM has embraced this ecosystem by integrating leading open-source AI frameworks through the AI Toolkit for IBM Z and LinuxONE. With the AI Toolkit, developers and data scientists can work with familiar tools enhanced by IBM Z's Integrated Accelerator for AI, the on-chip AI inferencing accelerator on Telum II, to efficiently build, scale, and deploy AI models. This approach combines the openness and agility of community-driven development with the enterprise-grade performance, security, and reliability that IBM Z is known for.
A key advancement is support for INT8 quantization, which significantly improves inference efficiency. Compared to non-quantized FP16 models, INT8 delivers up to 2× faster performance and up to 2× lower memory usage, reducing latency and resource demands. Combined with optimized open-source frameworks, the AI Toolkit increases efficiency through quantization and support for multiple model strategies, enabling data scientists to adopt hardware-optimized inference with minimal model changes.
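To make the technique concrete, here is a minimal sketch of post-training dynamic INT8 quantization in standard PyTorch. The tiny model is illustrative only; the AI Toolkit's Telum II INT8 path is enabled through its accelerated framework builds, not this exact call.

```python
"""Minimal sketch: post-training dynamic INT8 quantization in PyTorch.
Illustrative model; not Toolkit- or Telum II-specific code."""
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 2)).eval()

# Convert Linear-layer weights to INT8; activations are quantized on the
# fly at inference time, cutting memory use and speeding up matmuls.
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
print(qmodel(x))  # same interface as the FP32 model
```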
The AI Toolkit enables you to:
- Train/Build models on any platform
- Integrate AI inferencing services in existing application runtimes
- Quickly deploy models on IBM LinuxONE, Linux on IBM Z & zCX (z/OS Container Extensions)
- Maintain stringent SLA requirements
The AI Toolkit for IBM Z and IBM LinuxONE is a curated, enterprise-grade offering that provides IBM Elite Support for strategic open-source AI frameworks, including non-warranted machine learning, deep learning, and high-performance AI-serving libraries that are specifically optimized for IBM Z and LinuxONE.
Today, we’re pleased to announce a significant milestone in that journey. The AI Toolkit for IBM Z and LinuxONE has been enhanced with support for the new Telum II processor, now available on IBM z17 and LinuxONE 5. These updates unlock new levels of performance, efficiency, and real-time inferencing—all while preserving the industry-leading security, availability, and scalability clients expect from the platform.
Why This Matters: Telum II on IBM z17 and LinuxONE 5
The recent launch of IBM z17 and LinuxONE 5 introduced the Telum II processor, which delivers the most advanced on-chip AI acceleration in IBM Z history.
This advancement allows organizations to:
- Run multiple-model AI inferencing, combining predictive AI with encoder-based LLMs to drive higher accuracy (see the sketch after this list)
- Execute AI inferencing in-transaction, minimizing latency and enabling decisions in real time
- Scale AI securely, with built-in resiliency and compliance support
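As a rough illustration of the multiple-model pattern, the sketch below pairs a stand-in tabular risk score with a text embedding from an encoder model via the Hugging Face transformers library. The model choice, features, and weights are all hypothetical, and this is not Toolkit-specific code.

```python
"""Minimal multi-model sketch: a stand-in tabular risk score plus a text
embedding from an encoder model. Everything here is illustrative."""
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")  # illustrative encoder
enc = AutoModel.from_pretrained("bert-base-uncased").eval()

def embed(text: str) -> np.ndarray:
    """Mean-pool the encoder's last hidden state into a single vector."""
    batch = tok(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = enc(**batch).last_hidden_state   # shape (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0).numpy()  # shape (768,)

def tabular_score(amount: float, velocity: float) -> float:
    """Stand-in for a traditional predictive model (e.g., trained with Snap ML)."""
    return float(1.0 / (1.0 + np.exp(-(0.002 * amount + 0.5 * velocity - 3.0))))

# Both signals are computed in one pass; a real ensemble would feed the
# embedding into a second classifier alongside the tabular score.
vec = embed("wire transfer to a new overseas beneficiary")
risk = tabular_score(amount=9800.0, velocity=4.0)
print(f"tabular risk={risk:.3f}, embedding dims={vec.shape[0]}")
```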
What’s New in the AI Toolkit: Telum II-Optimized Components
All components of the AI Toolkit are designed to leverage IBM Z’s Integrated Accelerator for AI, the on-chip AI inferencing accelerator. With this release, we have specifically updated the following to take advantage of Telum II:
- IBM Z Accelerated for PyTorch v1.2
- IBM Z Accelerated for TensorFlow v1.4
- IBM Z Accelerated Serving for TensorFlow v1.4
- IBM Z Accelerated for Snap ML v1.4
- IBM Z Accelerated for NVIDIA Triton Inference Server v1.4
- IBM Z Deep Learning Compiler v5.0
These updates enable significantly improved AI inference throughput, support multiple AI model strategies, and offer a consistent experience across the AI lifecycle—from model training to real-time inferencing.
To better understand how these technologies work, and how to apply them across enterprise AI use cases, explore the product pages for each of the updated AI Toolkit components:
- IBM Z Accelerated for PyTorch v1.2: Explore how PyTorch is optimized for Telum II to enable high-performance training and inferencing for deep learning and LLMs, with native support for encoder-based models and transformers. Ideal for AI teams working on NLP, computer vision, and hybrid AI strategies.
- IBM Z Accelerated for TensorFlow v1.4: Learn how TensorFlow has been tuned for IBM Z to deliver efficient large-scale training and deployment using Telum II—supporting high-throughput AI inferencing across real-time enterprise applications.
- IBM Z Accelerated Serving for TensorFlow v1.4: See how the serving layer has evolved to support high-speed, scalable TensorFlow inferencing in production. Perfect for customers who require responsive AI integrated into transactional systems.
- IBM Z Accelerated for Snap ML v1.4: Read about Snap ML’s enhancements for Telum II—offering efficient training and real-time scoring for traditional ML models like decision trees and logistic regression. An essential tool for fraud detection, risk scoring, and compliance automation.
- IBM Z Accelerated for NVIDIA Triton Inference Server v1.4: Discover how Triton Inference Server delivers flexible, scalable deployment of multiple AI models (concurrent execution, dynamic batching, and ensemble models), now fully optimized for Telum II. Ideal for production environments requiring low-latency inferencing at scale across heterogeneous models; a client sketch follows this list.
- IBM Z Deep Learning Compiler v5.0: Understand how zDLC allows developers to compile deep learning models (e.g., ONNX-based) into optimized binaries for execution on IBM Z and LinuxONE with Telum II. Designed to reduce dependencies, simplify deployment, and maximize performance across AI pipelines using Python, C++, or Java APIs; a runtime sketch also follows this list.
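For a sense of what calling a served model looks like, here is a minimal sketch using the open-source tritonclient package against a running Triton endpoint. The server URL, model name, and tensor names and shapes are hypothetical.

```python
"""Minimal sketch: query a model served by Triton Inference Server over
HTTP. The endpoint, model name, and tensor names/shapes are hypothetical."""
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Hypothetical model with one FP32 input "INPUT__0" of shape (1, 128).
data = np.random.rand(1, 128).astype(np.float32)
inp = httpclient.InferInput("INPUT__0", data.shape, "FP32")
inp.set_data_from_numpy(data)

result = client.infer(model_name="fraud_detector", inputs=[inp])
print(result.as_numpy("OUTPUT__0"))  # hypothetical output tensor name
```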
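And for zDLC, a minimal sketch of running an already-compiled model. zDLC builds on the open-source onnx-mlir project, whose PyRuntime API loads the compiled shared library; the file paths and input shape here are assumptions, so consult the zDLC documentation for the exact compile and runtime steps.

```python
"""Minimal sketch: run a model compiled by zDLC/onnx-mlir into model.so.
Paths and the input shape are hypothetical."""
import numpy as np
from PyRuntime import OMExecutionSession  # ships with onnx-mlir/zDLC

session = OMExecutionSession("model.so")        # the compiled ONNX model
x = np.random.rand(1, 128).astype(np.float32)   # hypothetical input shape
outputs = session.run([x])                      # list of output arrays
print(outputs[0].shape)
```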
Designed for the Full AI Lifecycle
The AI Toolkit aligns with IBM’s full-stack AI lifecycle for Z and LinuxONE:
1. Define & Prepare: Leverage IBM Synthetic Data Sets or client data
2. Build & Train: Use PyTorch, TensorFlow, or Snap ML frameworks
3. Format & Optimize: Convert models using ONNX, PMML, or JSON (see the export sketch after this list)
4. Deploy & Serve: Use Triton Inference Server or TensorFlow Serving
5. Monitor & Govern: Built-in tooling for observability and compliance
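As an example of step 3, the sketch below exports a small PyTorch model to ONNX so it can then be compiled by zDLC or served by Triton. The model, file name, and tensor names are illustrative.

```python
"""Minimal sketch of step 3 (Format & Optimize): export a trained PyTorch
model to ONNX. Model, file name, and tensor names are illustrative."""
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2)).eval()
dummy = torch.randn(1, 128)  # example input that traces the graph's shapes

torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["score"],
    dynamic_axes={"input": {0: "batch"}},  # keep batch size flexible
)
```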
Whether you're starting with a pretrained model or building from scratch, the AI Toolkit enables secure, high-performance AI within existing enterprise workflows.
Get Started Today
Explore, build, and deploy with IBM Z and LinuxONE:
- AI on Z 101: Framework guides and best practices
- Solution Templates: Guided templates with real-world use cases to jumpstart adoption
- Discovery Workshop: Contact AIonZ@us.ibm.com for a free-of-charge workshop where our team helps you define a use case and scope an MVP proof of concept