IBM Z Deep Learning Compiler 4.1.0

Today we are excited to announce the 4.1.0 update for the IBM Z Deep Learning Compiler (IBM zDLC). The icr.io/ibmz/zdlc:4.1.0 image is available now from the IBM Z and LinuxONE Container Image Registry. An updated tutorial is also available at https://github.com/IBM/zDLC. Finally, for those interested in enterprise-level support for mission-critical workloads, IBM zDLC will be included in the upcoming AI Toolkit for IBM Z and IBM LinuxONE.

Changes in this release: 

  • New compile time options
    • Enable loading compiled models with gigabytes of constants
    • Support applications loading multiple compiled models
    • Simplify printing of runtime instrumentation
  • Updated ONNX-MLIR, ONNX and ONNX Operator support
  • Bug fixes and performance improvements

New Compile Time Options

--store-constants-to-file

Saves model constants to a separate <model>.constants.bin file instead of embedding them in the <model>.so file. This enables client applications to load model .so files for models that have gigabytes of constants. See --help for details and additional controls for this feature.
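
The client-side loading code does not change with this option; the application still opens only the shared library, with the constants file deployed alongside it. The following is a minimal C sketch using hypothetical file names model.so and model.constants.bin; see --help for the exact rules the runtime uses to locate the constants file.

#include <dlfcn.h>
#include <stdio.h>

int main(void) {
  /* Hypothetical artifacts: model.so compiled with --store-constants-to-file,
     with model.constants.bin kept in the same directory. Only the .so is
     opened by the application; the multi-gigabyte constants stay in the
     separate file instead of being embedded in the shared library. */
  void *model = dlopen("./model.so", RTLD_NOW);
  if (!model) {
    fprintf(stderr, "dlopen failed: %s\n", dlerror());
    return 1;
  }
  printf("model.so loaded\n");
  dlclose(model);
  return 0;
}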

--tag=<mytag>

Specifies a tag to append to the compiled model's entry point functions and other internal LLVM symbols.

Previously, a client application could only load a single model because all compiled models used the same "run_main_graph" entry point, which meant only the last loaded model was callable. Now, when a tag is specified at compile time, functions, internal constants and other such symbols have "_<mytag>" appended so they can be uniquely referenced in client applications, for example "run_main_graph_encoder" vs "run_main_graph_decoder".

Note that, for backward compatibility, each compiled model will also continue to have an untagged "run_main_graph" entry point.
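
As a sketch of how a client application might call two tagged models in one process, the following C fragment resolves the entry points of models compiled with --tag=encoder and --tag=decoder. The file names encoder.so and decoder.so are hypothetical, and the entry point type (an OMTensorList-in, OMTensorList-out function, as in the ONNX-MLIR C runtime) is treated as opaque here to keep the example self-contained.

#include <dlfcn.h>
#include <stdio.h>

/* Treated as an opaque type here; a real application would include the
   ONNX-MLIR runtime header and build inputs with its OMTensor helpers. */
typedef struct OMTensorList OMTensorList;
typedef OMTensorList *(*entry_point_t)(OMTensorList *);

int main(void) {
  /* Both models can be loaded into the same process because their tagged
     entry points no longer collide on "run_main_graph". */
  void *encoder = dlopen("./encoder.so", RTLD_NOW);
  void *decoder = dlopen("./decoder.so", RTLD_NOW);
  if (!encoder || !decoder) {
    fprintf(stderr, "dlopen failed: %s\n", dlerror());
    return 1;
  }

  entry_point_t run_encoder = (entry_point_t)dlsym(encoder, "run_main_graph_encoder");
  entry_point_t run_decoder = (entry_point_t)dlsym(decoder, "run_main_graph_decoder");
  if (!run_encoder || !run_decoder) {
    fprintf(stderr, "dlsym failed: %s\n", dlerror());
    return 1;
  }

  printf("Both tagged entry points resolved\n");
  dlclose(encoder);
  dlclose(decoder);
  return 0;
}

Because each compiled model also keeps its untagged "run_main_graph", resolving the untagged name still works when only one model is loaded.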

--profile-ir=<value>

When this is set at compile time, runtime information about each operation is printed when the model is called. The value can be "Onnx" to print ONNX operators or "ZHigh" to print operators when compiling for the z16 Integrated Accelerator for AI. The following is an example of model runtime instrumentation.

#  0) before onnx.Constant Time elapsed: 1691688479.493696 accumulated: 1691688479.493696
#  1) after  onnx.Constant Time elapsed: 0.000005 accumulated: 1691688479.493701
#  2) before onnx.Constant Time elapsed: 0.000004 accumulated: 1691688479.493705
#  3) after  onnx.Constant Time elapsed: 0.000004 accumulated: 1691688479.493709

Updated to the latest ONNX, ONNX-MLIR and zDNN

The IBM zDLC is based on cutting-edge open source technology, and these updates enable many of the features in this release. In zDLC 4.1.0 we've updated to include the latest released versions of ONNX, ONNX-MLIR and zDNN.

ONNX Operator Updates

  • CPU
    • New
      • Trilu
      • Unique
    • Enhanced
      • Unidirectional GRU supports sequence_lens input
    • Updated for OpSet 19
      • Cast
      • Constant
      • Equal
  • Integrated Accelerator for AI Ops
    • New
      • ConvTranspose
    • Updated for OpSet 18
      • ReduceMean
If you have questions on getting started with AI on IBM Z, reach out to us at aionz@us.ibm.com.