AI on IBM Z & IBM LinuxONE

Leverage AI on IBM Z & LinuxONE to enable real-time AI decisions at scale, accelerating your time-to-value, while ensuring trust and compliance

IBM Z Deep Learning Compiler 4.3.0

By SUNNY ANAND posted Wed November 27, 2024 04:52 PM

  

Today we are excited to announce the 4.3.0 update for the IBM Z Deep Learning Compiler (IBM zDLC). The icr.io/ibmz/zdlc:4.3.0 image is available now from the IBM Z and LinuxONE Container Image Registry. An updated tutorial is also available at https://github.com/IBM/zDLC. For those interested in enterprise-level support for mission-critical workloads, IBM zDLC is included in the AI Toolkit for IBM Z and IBM LinuxONE.

Changes in this release: 

  • Op support updates
    • Updated for ONNX OpSet 21
      • GroupNormalization
      • If
      • Cast
  • [NNPA] Integrated Accelerator for AI ops
    • Softplus
  • Updated ONNX ops
    • Where
    • Unique
    • Reshape
    • ScatterND
    • Reciprocal

New/Updated compiler options:

  • --enable-timing: Produce a detailed compile-time report. Default: false.
  • --disable-krnl-op-fusion: Disable op fusion in the onnx-to-krnl pass. Default: false.
  • --dimParams: Set the dim_params for dynamic model inputs. Default: the onnx.dim_params of the model input.
  • --store-constants-to-file: Store constants in a binary file instead of embedding them in model.so.
    • --constants-to-file-single-threshold: Controls the size threshold for a single constant.
    • --constants-to-file-total-threshold: Controls the size threshold for the constants in total.
  • --zhigh-recompose-to-stick-unstick: Pass that recomposes ops back to zhigh stick and unstick.
  • --opt-report=NNPAUnsupportedOps: Generate a report on why operations are not run on NNPA.
  • --enable-bound-check: Enable runtime bound checks for memrefs. Default: false.
  • --nnpa-saturation: Saturate f32 values before stickifying them. Default: false. This option also turns on --enable-compiler-stick-unstick.
  • --enable-compiler-stick-unstick: Enable the compiler to generate some stick/unstick code. Default: false.
  • --instrument-signature: Specify which high-level operations should print their input type(s) and shape(s): ALL for all available operations, NONE for no instrumentation (default).
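
As a rough illustration of how the options above might be passed to the compiler, the following Python sketch assembles a container command line without running it. The image name and the two report options come from this announcement. The `docker run` layout, the `/workdir` mount point, and the `--EmitLib` flag are assumptions based on the zDLC tutorial and should be checked against it. The helper `build_zdlc_command` and the model path are hypothetical.

```python
import shlex
from pathlib import Path

# Image name as given in this announcement.
ZDLC_IMAGE = "icr.io/ibmz/zdlc:4.3.0"


def build_zdlc_command(model_path, extra_flags=(), runtime="docker"):
    """Return an argv list that compiles `model_path` inside the zDLC container.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    model = Path(model_path).resolve()
    return [
        runtime, "run", "--rm",
        # Assumption: the tutorial mounts the model directory at /workdir.
        "-v", f"{model.parent}:/workdir:z",
        ZDLC_IMAGE,
        # Assumption: base flag from the tutorial that emits a shared library.
        "--EmitLib",
        *extra_flags,
        model.name,
    ]


if __name__ == "__main__":
    cmd = build_zdlc_command(
        "models/mnist.onnx",  # hypothetical model path
        extra_flags=[
            "--enable-timing",                  # detailed compile-time report
            "--opt-report=NNPAUnsupportedOps",  # why ops did not run on NNPA
        ],
    )
    # Print a copy-pasteable shell command for inspection.
    print(shlex.join(cmd))
```

Building the command as a list keeps each option a separate argument, so it can be handed to `subprocess.run` unchanged once the flags have been verified against the tutorial. Target and accelerator flags from the tutorial may also be needed for a real compile.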

Performance Improvements:

  • Substitute zdnn calls for stick/unstick late, after most ZLow optimizations are performed
  • Saturation for compiler-generated stickify
  • Recompose QLinearMatMul and remove Quantize-Dequantize pairs
  • Write constant values to a single file without buffering, to remove spikes in memory consumption
  • Constant propagation for unary and binary ops
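
For readers unfamiliar with the last item, constant propagation means evaluating operations at compile time when all of their inputs are already known constants. The toy Python sketch below shows the general idea on a tiny expression tree. It is not the zDLC implementation; the node classes, the op tables, and the `fold` function are all hypothetical names chosen for illustration.

```python
import math
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Unary:
    op: str
    arg: "Expr"


@dataclass(frozen=True)
class Binary:
    op: str
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Const, Var, Unary, Binary]

# Illustrative op tables; a real compiler covers many more operations.
UNARY = {"neg": lambda x: -x, "sqrt": math.sqrt}
BINARY = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


def fold(e):
    """Replace unary/binary nodes whose inputs are all constants by a Const."""
    if isinstance(e, Unary):
        arg = fold(e.arg)
        if isinstance(arg, Const):
            return Const(UNARY[e.op](arg.value))
        return Unary(e.op, arg)
    if isinstance(e, Binary):
        lhs, rhs = fold(e.lhs), fold(e.rhs)
        if isinstance(lhs, Const) and isinstance(rhs, Const):
            return Const(BINARY[e.op](lhs.value, rhs.value))
        return Binary(e.op, lhs, rhs)
    return e  # Const and Var nodes are left unchanged


if __name__ == "__main__":
    expr = Binary("add", Var("x"), Binary("mul", Const(2.0), Unary("neg", Const(3.0))))
    print(fold(expr))
```

In this sketch the `mul` subtree depends only on constants, so it collapses to a single constant, while the outer `add` stays because `x` is only known at run time.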

Utilities and Tools Added:

  • Make plots using -p in the make-report.py script in utils
  • Add a tool to generate a time chart from an instrumentation file using the "--InstrumentReportTime" option
  • Show the compilation phases when using zDLC

If you have questions on getting started with AI on IBM Z, reach out to us at aionz@us.ibm.com
