z16 Overview

  

Overview

The latest generation of IBM zSystems, the z16, brings a number of technological improvements. Built around the Telum central processor chip, which IBM’s Christian Jacobi and Elpida Tzortzatos have presented in an overview, the z16 incorporates multiple advances intended to increase performance and significantly reduce transaction response time.

The machine type for the z16 is 3931, and it comes in two models: A01 is the base model and LA1 is the LinuxONE model. Each model is offered in multiple processor and memory capacities, which are represented by feature codes. For example, on the A01 the Max39 feature code allows up to 39 CPs. Four further feature codes (Max82, Max125, Max168, and Max200) raise the maximum CP count to 82, 125, 168, and 200 respectively. Maximum memory starts at 10 TB for the Max39 and rises by 10 TB with each feature code up to 40 TB for the Max168, which is also the limit for the Max200. Each feature code also has a maximum number of zIIPs, IFLs, ICFs, and SAPs. CPs are offered at full capacity and at sub-capacity settings. All specialty engines run at full capacity, and the zIIP to CP ratio is 2:1, as long as enough cores are available. There are concurrent MES upgrade paths from the z14 M01-M04 and the z15 T01 to the z16 Max39, Max82, and Max125. The really big iron, the Max168 and Max200, comes only from the factory.
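As a rough illustration of how the feature codes map to CP capacity, the sketch below encodes the maximum CP counts quoted above and picks the smallest A01 feature code that can hold a requested number of CPs. The names `smallest_feature_code` and `reachable_by_mes` are hypothetical and exist only for this example. Real sizing also depends on zIIPs, IFLs, ICFs, SAPs, and memory, none of which are modeled here.

```python
# Maximum CP counts per z16 A01 feature code, as quoted in the text above.
MAX_CPS = {
    "Max39": 39,
    "Max82": 82,
    "Max125": 125,
    "Max168": 168,
    "Max200": 200,
}

# Feature codes the text lists as MES upgrade targets; the text describes
# Max168 and Max200 as coming from the factory.
MES_UPGRADE_TARGETS = {"Max39", "Max82", "Max125"}


def smallest_feature_code(required_cps):
    """Return the smallest feature code whose CP limit covers required_cps.

    Hypothetical helper for illustration; returns None when even Max200
    is too small.
    """
    if required_cps < 1:
        raise ValueError("required_cps must be at least 1")
    # Walk the feature codes from the smallest CP limit to the largest.
    for name, limit in sorted(MAX_CPS.items(), key=lambda item: item[1]):
        if required_cps <= limit:
            return name
    return None


def reachable_by_mes(feature_code):
    """True if the text lists this feature code as an MES upgrade target."""
    return feature_code in MES_UPGRADE_TARGETS
```

Under these assumptions, a requirement of 100 CPs maps to Max125, which the text lists as an MES upgrade target, while 150 CPs maps to Max168, which has to be ordered from the factory.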

Outwardly, the z16 hardware configuration is very close to that of the z15: the number of Frames, Power Options, Crypto Domains, Crypto Co-processors, and I/O cards remains the same. The maximum number of LPARs is unchanged at 85, but the maximum memory per LPAR doubles to 32 TB. There are some incremental increases, such as the maximum number of CPs (from 190 to 200), and there are new I/O features such as Coupling Express2, RoCE Express3, Crypto Express8S, OSA Express7S, and FICON Express32S. Many more I/O features can also be carried forward. Processor capacity per drawer is 25% higher than on the z15. All in all, the z16 appears at first glance to be a solid but typical upgrade from the z15, until you look under its hood and see the standout improvements in cache, memory, AI, and Crypto.

Cache

The Telum processor runs at 5.2 GHz. Besides shrinking the fabrication process from 14 nm to 7 nm, with the usual benefits of transistor miniaturization (lower power consumption per transistor, higher speed from shorter distances between transistors), Telum introduces a novel chip packaging and caching architecture. It is designed to reduce cache latency by giving each core a larger L2 cache than on the z15 and by sharing those L2 caches as virtual L3 and L4 cache.

The organization of the Telum is as follows. The basic processor engine is called the core. There are 8 cores per Telum chip, 2 chips in a DCM (dual chip module), and 4 DCMs in a CPC drawer. Finally, there are 1-4 drawers per machine. Each core has a private on-core L1 cache: 128 KB for instructions and 128 KB for data. Each core also has a private on-core 32 MB L2 cache, of which 16 MB can be used as virtual L2 cache by other cores on the chip, depending on the current activity. The L2 cache of an inactive core becomes shared virtual L3 cache for the active cores of the chip, and the L2 cache of an inactive core on another chip can become virtual L4 cache.
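The hierarchy above lends itself to a quick arithmetic check. The following sketch multiplies out the per-chip, per-DCM, and per-drawer figures from the text to get chip, core, and aggregate L2 totals for a given number of drawers. The function name `topology` and the dictionary keys are illustrative assumptions. The sketch counts physical cores and raw L2 capacity only, and says nothing about how much of that L2 is actually pooled as virtual L3 or L4 at any moment.

```python
# Figures taken from the text above; all names are illustrative only.
CORES_PER_CHIP = 8
CHIPS_PER_DCM = 2
DCMS_PER_DRAWER = 4
MAX_DRAWERS = 4
L2_MB_PER_CORE = 32


def topology(drawers):
    """Return chip, core, and aggregate L2 totals for a machine.

    Hypothetical helper: pure arithmetic on the numbers quoted in the text.
    """
    if not 1 <= drawers <= MAX_DRAWERS:
        raise ValueError("a machine has between 1 and 4 drawers")
    chips_per_drawer = CHIPS_PER_DCM * DCMS_PER_DRAWER
    chips = chips_per_drawer * drawers
    cores = chips * CORES_PER_CHIP
    return {
        "chips": chips,
        "cores": cores,
        # Total private L2 on one chip (8 cores x 32 MB).
        "l2_mb_per_chip": CORES_PER_CHIP * L2_MB_PER_CORE,
        # Total private L2 across all chips in one drawer.
        "l2_mb_per_drawer": chips_per_drawer * CORES_PER_CHIP * L2_MB_PER_CORE,
    }
```

For a fully populated four-drawer machine this gives 32 chips and 256 physical cores, with 256 MB of L2 per chip and 2048 MB of L2 per drawer. The physical core count is higher than the 200-CP maximum quoted earlier, so not every core is available as a CP.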

Memory

Telum builds on the already solid availability of IBM Z and IBM LinuxONE with a redesigned memory interface that can tolerate complete channel or DIMM failures and transparently recover data without impact on response time.

AI

Every Telum chip (8 per drawer, up to 32 per machine) has an integrated on-chip AI accelerator that is shared by the chip’s 8 cores. Each AI accelerator has 1024 simple compute engines and 256 complex function engines, delivering more than 6 TFLOPS of compute capacity per chip for a maximum of 200 TFLOPS per system. The AI accelerator features a new Neural Network Processing Assist instruction that operates directly on tensor data and performs Matrix Multiplication, Convolution, Pooling, and Activation Functions. The resulting consistently low AI latency can significantly enhance trade and other transactions by allowing fraud detection to run in real time during the transaction. The accelerator is intended to speed up machine learning and deep learning in tools such as TensorFlow, zDNN and ONNX/DLC, open source data science packages, and IBM’s AI offerings such as Db2 Analytics Accelerator, Open Data Analytics for z/OS, and Watson Machine Learning for z/OS, among many others.
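The per-chip and per-system figures can be reconciled with simple arithmetic. The sketch below computes a lower bound on system AI throughput from the "more than 6 TFLOPS" per-accelerator figure and the chip counts given in the text. The function name `min_system_tflops` is an assumption made for this example, and the result is a bound derived from the quoted numbers, not a measured or official rating.

```python
# Figures taken from the text above; all names are illustrative only.
CHIPS_PER_DRAWER = 8
MAX_DRAWERS = 4
# The text says "more than 6 TFLOPS", so 6 is treated as a lower bound.
MIN_TFLOPS_PER_ACCELERATOR = 6
QUOTED_SYSTEM_MAX_TFLOPS = 200


def min_system_tflops(drawers):
    """Lower bound on aggregate AI accelerator TFLOPS for a machine."""
    if not 1 <= drawers <= MAX_DRAWERS:
        raise ValueError("a machine has between 1 and 4 drawers")
    # One AI accelerator per Telum chip.
    accelerators = CHIPS_PER_DRAWER * drawers
    return accelerators * MIN_TFLOPS_PER_ACCELERATOR


# Per-accelerator rate implied by the quoted 200 TFLOPS system maximum.
implied_per_accelerator = QUOTED_SYSTEM_MAX_TFLOPS / (CHIPS_PER_DRAWER * MAX_DRAWERS)
```

With four drawers the lower bound is 192 TFLOPS, and the quoted 200 TFLOPS system maximum implies 6.25 TFLOPS per accelerator, which is consistent with "more than 6".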

Crypto

Telum encrypts main memory for increased security in trusted execution environments, making it an excellent choice for handling sensitive data in hybrid cloud architectures. The z16 is positioned for quantum-safe computing via the new Crypto Express8S co-processor, built on the 4770 Hardware Security Module (HSM). It provides a quantum-safe Root of Trust, along with APIs for modernizing existing applications and building new ones that use quantum-safe cryptographic algorithms. The Crypto Express8S has 3 configurable modes: Accelerator, CCA, and EP11. EP11 mode enhancements include quantum-safe algorithms in hybrid cryptography for secure channel negotiation between the Crypto Express8S and the CPACF and TKE.

CPACF (Central Processor Assist for Cryptographic Function, enablement feature code 3863) is available at no charge. CPACF provides hardware-accelerated encryption on every core, with faster encryption and decryption than previous servers. The z16 also introduces a new hardware-managed counter to track crypto usage, including algorithms, bit lengths, and key security. When enabled by the new IEASYSxx parameter CRYPCTRS, the information is written to a new triplet section in the SMF Type 30 record.

Conclusion

The z16 is a worthy addition to the IBM zSystems family with many large technology improvements and new features. It enables large speed increases in the types of transaction workloads that are only getting more important by the day (AI, Crypto). Capgemini looks forward to exploiting the exciting new technologies of the z16 in future enhancements to IBM Connect:Direct for z/OS.