Co-authored with Hidematsu Sueki
At IBM, we’re dedicated to offering state-of-the-art technology to organizations. That’s why we’re excited to expand our GX3 family with another GPU accelerated by the NVIDIA Hopper architecture.
IBM is thrilled to announce that the NVIDIA H100 Tensor Core GPU is generally available for IBM Cloud Kubernetes Service (IKS) and Red Hat OpenShift on IBM Cloud (ROKS) clusters running on IBM Cloud VPC.
As artificial intelligence (AI) and machine learning (ML) models continue to grow and meet business requirements worldwide, so, too, do the requirements for training and delivering those models. The NVIDIA H100 GPU is equipped to meet that demand, accelerating the most demanding AI workloads with high performance and efficiency. It inherits many design principles from the NVIDIA A100 Tensor Core GPU, with a focus on improved architectural efficiency and scaling. Designed for massive scale, the H100 enables organizations to train and deploy the largest and most complex AI models. NVIDIA H100 GPUs are up to 6x faster chip-to-chip compared to A100. We’ve found that workloads moved from A100 GPUs to H100 GPUs see speed improvements of up to 30x in AI inferencing and up to 9x in AI training.
Available GX3D (NVIDIA H100 GPU) flavors
The following H100 GPU flavor is available for IBM Cloud VPC clusters that run on any version of Red Hat OpenShift for both RHEL and RHCOS operating systems.
- gx3d.160x1792.8h100: 8 GPUs, 160 cores, 1.8 TB memory, 100 GB primary storage, 8 x 7.7 TB additional storage, 32 Gbps network speed
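The flavor name itself appears to encode most of these specs. As a minimal sketch, assuming the pattern `<family>.<cores>x<memory in GB>.<GPU count><GPU model>` (inferred only from the single flavor listed above, not from an official naming specification), a hypothetical helper `parse_flavor` could split the name like this:

```python
import re

# Assumed naming pattern, inferred from the one flavor listed above:
#   <family>.<cores>x<memory GB>.<GPU count><GPU model>
FLAVOR_RE = re.compile(
    r"^(?P<family>[a-z0-9]+)\."
    r"(?P<cores>\d+)x(?P<memory_gb>\d+)\."
    r"(?P<gpus>\d+)(?P<model>[a-z]+\d+)$"
)


def parse_flavor(name):
    """Split a flavor name into its parts (hypothetical helper)."""
    match = FLAVOR_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized flavor name: {name!r}")
    return {
        "family": match.group("family"),
        "cores": int(match.group("cores")),
        "memory_gb": int(match.group("memory_gb")),
        "gpus": int(match.group("gpus")),
        "model": match.group("model"),
    }


if __name__ == "__main__":
    print(parse_flavor("gx3d.160x1792.8h100"))
```

Under that assumption, the 1792 in the name corresponds to the roughly 1.8 TB of memory quoted in the list.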
Getting started with GX3D (NVIDIA H100 GPUs) on IBM Cloud Kubernetes Service
Enjoy a plug-and-play experience with IBM Cloud Kubernetes Service when provisioning a cluster. GPU drivers are automatically installed, and you can get started immediately by provisioning a new cluster at 1.31 or later with GX3D worker nodes. No additional configuration is required to set up the GPU. If you already have a 1.31+ cluster, simply add a worker pool that uses the GX3D nodes to your existing cluster. For more information, see Deploying an app on a GPU machine for IBM Cloud Kubernetes Service.
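Once GX3D worker nodes are in the cluster, a workload typically asks for GPUs through the `nvidia.com/gpu` resource limit. The following is a minimal sketch, not an official IBM example: it builds a Pod manifest as JSON that runs `nvidia-smi` once to confirm the GPU is visible. The pod name, the helper name `gpu_pod_manifest`, and the container image tag are illustrative assumptions; substitute any CUDA-enabled image you have access to.

```python
import json


def gpu_pod_manifest(name, image, gpus=1):
    """Return a Pod manifest that requests `gpus` NVIDIA GPUs (hypothetical helper)."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Run once and exit; this is only a smoke test.
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "command": ["nvidia-smi"],
                    # GPUs are requested as a resource limit on the container.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # The image tag below is an assumption; replace it with your own image.
    manifest = gpu_pod_manifest(
        "gpu-smoke-test", "nvidia/cuda:12.4.1-base-ubuntu22.04", gpus=1
    )
    print(json.dumps(manifest, indent=2))
```

If the script is saved as, say, `gpu_pod.py`, its output can be piped to the cluster with `python gpu_pod.py | kubectl apply -f -`, since `kubectl` accepts JSON manifests as well as YAML.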
Getting started with GX3D (NVIDIA H100 GPU) on Red Hat OpenShift on IBM Cloud
With Red Hat OpenShift on IBM Cloud, provision a new cluster at 4.15 or later with GX3D worker nodes. If you already have a 4.15+ cluster, simply add a worker pool that uses the GX3D nodes to your existing cluster. Installing the NVIDIA GPU Operator on the cluster then automates the management of all the necessary NVIDIA software components. For more information, see Deploying an app on a GPU machine for Red Hat OpenShift on IBM Cloud.