4Q 2022 Update for IBM Z Deep Learning Compiler
Today we are excited to announce the new image for the IBM Z Deep Learning Compiler (IBM zDLC). The image is available on the
IBM Z and LinuxONE Container Image Registry as zdlc:3.2.0. An updated tutorial is available at
https://github.com/IBM/zDLC.
Changes in this release:
- Updated image name and CLI name to zdlc
- ONNX Operators Updates
- Support for new models
- Updated runtime instrumentation options
- Model runtime performance improvements
- Security updates
- Misc bug fixes
As always, if you have questions on getting started with AI on IBM Z, reach out to us at
aionz@us.ibm.com!
Updated image name and CLI name to zdlc
Starting with the 4Q release, we've switched to using "zdlc" as both the image name in the IBM Z and LinuxONE Container Image Registry and the CLI name within the image. As a standalone brand, IBM zDLC has increased flexibility to pull in fixes and changes ahead of formal ONNX-MLIR releases. The IBM Z Deep Learning Compiler will continue to be based on the latest ONNX-MLIR versions available, which for this release is ONNX-MLIR 0.3.2. Additionally, with this change the "onnx-mlir" image will no longer be updated in the IBM Z and LinuxONE Container Image Registry; all IBM zDLC updates going forward will be made to the "zdlc" image.
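As a quick illustration of the rename, the helper below assembles the docker invocation for the new image. The registry path `icr.io/ibmz/zdlc` and the argument pass-through are assumptions for illustration only; consult the IBM/zDLC tutorial for the exact commands.

```python
# Sketch: build the docker argv that runs the zdlc CLI inside the image.
# The registry path "icr.io/ibmz/zdlc" is an assumption based on the image
# name above, not an official reference -- check the tutorial for details.

def zdlc_command(cli_args, tag="3.2.0"):
    """Return the docker argv for running the zdlc CLI with cli_args."""
    image = f"icr.io/ibmz/zdlc:{tag}"
    return ["docker", "run", "--rm", image, *cli_args]

# For example, checking the CLI's version banner:
print(zdlc_command(["--version"]))
```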
Along with the image and CLI name change, the `--version` output has been updated to reflect the changes.
IBM Z Deep Learning Compiler 3.2.0-c7f0762
onnx-mlir version 0.3.2, onnx version 1.12.0
...
There's a lot of information available here at a glance. In `3.2.0-c7f0762`, the `3.2.0` is the version of IBM zDLC itself and `c7f0762` identifies the exact ONNX-MLIR commit that IBM zDLC was built from. The `onnx-mlir version 0.3.2` reflects the version number reported by ONNX-MLIR, and `onnx version 1.12.0` is the version of ONNX used by ONNX-MLIR.
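The combined version field can be split programmatically. A minimal sketch, assuming the `<zDLC version>-<commit>` format inferred from the example above:

```python
# Split a field like "3.2.0-c7f0762" into the zDLC version and the
# ONNX-MLIR commit it was built from.
def parse_zdlc_version(banner_field):
    version, _, commit = banner_field.partition("-")
    return version, commit

print(parse_zdlc_version("3.2.0-c7f0762"))  # ('3.2.0', 'c7f0762')
```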
ONNX Operators Updates
This release adds support for the ArgMin and If ONNX operators on CPU. The BatchNormalization and Shape operators have been updated to support their OpSet 15 versions, and ScatterElements has been updated to OpSet 16.
LeakyRelu has been updated to utilize the IBM z16's Integrated Accelerator for AI in specific scenarios. The NNPA implementation is used when the operations immediately before and after it also support NNPA. This prevents unnecessary conversions between the data formats used by CPU and NNPA operators.
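The placement rule can be sketched as a check on the neighboring operations. Everything below (the operator names and the capability set) is hypothetical and only illustrates the "both neighbors must support NNPA" heuristic described above, not the compiler's actual internals.

```python
# Hypothetical sketch of the placement heuristic: run LeakyRelu on the
# accelerator only when its producer and consumer also run there, so no
# extra CPU<->NNPA data-format conversions are introduced.
NNPA_CAPABLE = {"Conv", "MatMul", "Relu", "LeakyRelu"}  # illustrative set

def use_nnpa_for_leakyrelu(prev_op, next_op):
    """Place LeakyRelu on NNPA only if both neighbors support NNPA."""
    return prev_op in NNPA_CAPABLE and next_op in NNPA_CAPABLE

print(use_nnpa_for_leakyrelu("Conv", "MatMul"))  # neighbors on NNPA
print(use_nnpa_for_leakyrelu("Conv", "ArgMin"))  # ArgMin runs on CPU
```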
Support for new models
The latest IBM Z Deep Learning Compiler now supports the following models from the ONNX Model Zoo:
Updated runtime instrumentation options
At compile time, `--instrument-stage` and `--instrument-op` can now be set to print instrumentation information at model runtime. These options allow more granular control over when instrumentation is printed and which dialects or operations print information. See the ONNX-MLIR 0.3.2 instrumentation documentation for more details.
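A small sketch of assembling these compile-time flags. The flag names come from the release notes above; the example values ("Onnx", "onnx.Conv") are placeholders, not confirmed values, so check the ONNX-MLIR instrumentation documentation for what each flag accepts.

```python
# Build the extra compile-time flags that enable runtime instrumentation.
# The stage and op strings here are illustrative placeholders; the set of
# accepted values is defined by ONNX-MLIR, not by this sketch.
def instrumentation_flags(stage, op):
    """Return the instrumentation flags to append to a zdlc compile."""
    return [f"--instrument-stage={stage}", f"--instrument-op={op}"]

print(instrumentation_flags("Onnx", "onnx.Conv"))
```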
Model runtime performance improvements
This update includes improvements that enable more operations to stay on the IBM z16 Integrated Accelerator for AI and avoid costly data format conversions. These include (but are not limited to) better compile-time model dimension analysis for models with dynamic batch sizes. Some model-specific examples include:
- Resnet50-v1.5 -> 54x inference improvement
- Yolov3-12 -> 1.5x inference improvement
- Bertsquad-12 -> 6.5x inference improvement