Enterprise Linux on Power delivers the foundation for your open source hybrid cloud infrastructure with industry-leading cloud-native deployment options.
The IBM® Power® AC922 server that IBM announced in December 2017 offers a faster way to deploy data-intensive, deep learning workloads, and accelerated databases that are optimized for artificial intelligence (AI) and high-performance computing (HPC). The Power AC922 server is designed to use the capabilities of the NVIDIA Volta graphics processing unit (GPU) accelerators by using NVLink 2.0.
This document describes how you can isolate NVIDIA Volta GPUs on a POWER9 processor-based server by using nvidia-docker, which is an NVIDIA project.
It covers the preparatory steps to get the system ready for nvidia-docker, outlines how to isolate NVIDIA Volta GPUs inside Docker containers, describes how to run NVIDIA CUDA 9.1 applications inside containers, and illustrates how to run popular deep learning frameworks on a Volta GPU inside Docker containers.
Although this document showcases Red Hat Enterprise Linux (RHEL) 7.4 for POWER9 processor-based systems, the same workflows can be used on Ubuntu 16.04 LTS.
This section provides a high-level flow for setting up the system to run nvidia-docker:
Installing the operating system (OS)
Prepare the system by applying the required firmware level.
Install RHEL 7.4 for Power, little endian, on the IBM Power AC922 server with POWER9 DD2.1 processors and Volta GPUs.
Setting up the CUDA 9.1 toolkit (optional)
Download and install CUDA Toolkit 9.1.
Perform Volta GPU device discovery.
Setting up NVIDIA and Docker, and the nvidia-docker project
Set up the NVIDIA driver.
Install the Docker engine.
Set up nvidia-docker.
Exploring CUDA images with nvidia-docker
Pull the CUDA 9.1 Docker images from NVIDIA GitHub.
Build deviceQuery from inside the container.
Running popular deep learning frameworks, such as TensorFlow, Caffe, and Torch
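As a minimal sketch of the flow above, the commands below verify the Volta GPUs on the host, start Docker, and then use nvidia-docker to pull a CUDA 9.1 image and confirm GPU isolation inside a container. These commands assume an AC922 with the NVIDIA driver already installed; the `nvidia/cuda-ppc64le` image name and tag are assumptions and should be checked against the images currently published by NVIDIA.

```shell
# Verify that the host sees the Volta GPUs (output depends on the
# AC922 configuration; requires the NVIDIA driver to be loaded).
lspci | grep -i nvidia
nvidia-smi

# Start the Docker engine (install it first with yum on RHEL,
# or apt-get on Ubuntu 16.04 LTS).
sudo systemctl start docker

# Pull a CUDA 9.1 development image for ppc64le
# (image name and tag are assumptions).
nvidia-docker pull nvidia/cuda-ppc64le:9.1-devel

# Confirm GPU isolation: nvidia-docker mounts the driver and GPU
# devices into the container, so nvidia-smi works inside it.
nvidia-docker run --rm nvidia/cuda-ppc64le:9.1-devel nvidia-smi

# Restrict the container to a single Volta GPU with the NV_GPU
# environment variable understood by nvidia-docker 1.0.
NV_GPU=0 nvidia-docker run --rm nvidia/cuda-ppc64le:9.1-devel nvidia-smi
```

The same `nvidia-docker run` pattern is used later in the document to build `deviceQuery` inside a container and to launch deep learning framework images.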