MLOps with Kubeflow on IBM Power

By Sebastian Lehrig posted Wed August 10, 2022 11:33 AM

  

Combining machine learning with best practices from software engineering and IT operations (MLOps) is one of the hottest and fastest-growing topics in the machine learning field. Kubeflow is an open source toolkit for MLOps and the de facto standard for MLOps on containerized environments like Red Hat OpenShift and vanilla Kubernetes.

We have now brought Kubeflow to IBM Power – and you can try it out on your own by following this blog!

Kubeflow from a bird’s-eye view

As a toolkit, Kubeflow comes with multiple tools for data scientists, as shown in Figure 1. These tools help during the whole life cycle of a typical machine learning project, which involves building models, tuning models, deploying models to production, and managing in-production models. In each of these distinct steps, data scientists are supported with appropriate tools such as notebook-based development environments, Hyperparameter Optimization (HPO), model serving in production environments, and more.

Figure 1. Kubeflow provides several tools for data scientists

The most important and prominent tool in Kubeflow is pipelines. Pipelines interconnect all those tools into a common, integrated workflow. From a technical perspective, pipelines formally specify the machine learning workflow (from data ingestion to model deployment) as an executable artifact. This formalization makes the whole process versionable, reproducible, auditable, and automatable; in other words, production-ready at scale.
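
To make the idea of a workflow captured as an artifact more tangible, here is a minimal sketch in plain Python. It deliberately does not use the Kubeflow Pipelines SDK: the specification format, the step names, and the fingerprint helper are all hypothetical and only illustrate why a workflow written down as data can be versioned and compared.

```python
import hashlib
import json

# Hypothetical pipeline specification (not the Kubeflow format):
# each step maps to the list of steps it depends on.
PIPELINE_SPEC = {
    "name": "example-pipeline",
    "steps": {
        "load_data": [],
        "prepare_data": ["load_data"],
        "train_model": ["prepare_data"],
        "evaluate_model": ["train_model"],
        "deploy_model": ["evaluate_model"],
    },
}


def fingerprint(spec: dict) -> str:
    """Return a stable SHA-256 hash of a pipeline specification.

    Keys are sorted before hashing, so two specifications with the
    same content always produce the same fingerprint.
    """
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Storing such a fingerprint next to each run would be one simple way to tell afterwards which version of the workflow produced which model; any change to the steps or their dependencies yields a different hash.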

A simple example pipeline

Let’s have a look at the example pipeline illustrated in Figure 2. The pipeline formalizes the following steps: getting and preparing data, training a model, evaluating the model, exporting it, pushing it to a production environment, and deploying it there. Each individual step (a so-called component) was formalized by a data scientist by either:

  • Leveraging Jupyter Notebooks (note the orange notebook icon on the left side of each of the first five components); this is where data scientists really feel at home, or
  • Reusing pre-existing Kubeflow components (note the Kubeflow icon in the last component and in the panel at the left); this is where we save a lot of effort by simply plugging in what we developed before in other pipelines (or what others have provided; for IBM Power, we provide a whole catalog!).

Figure 2. A typical pipeline in Kubeflow

The only requirement for data scientists is to familiarize themselves with these pipelines, so that they can specify them using the Python SDKs or the graphical editor shown in Figure 2. And that’s really it. Data scientists don’t have to bother with Kubernetes to bring their artifacts into the production (Kubernetes) environment. And, honestly, they shouldn’t. Mastering Kubernetes is neither their goal nor their profession (some might even dislike it).

The beauty of Kubeflow is that everything data scientists do – like running pipelines and serving models – is still fully backed by Kubernetes.

How Kubeflow makes Kubernetes-magic happen

Once a pipeline is specified, data scientists can run it (for example, by clicking the Play button at the top of the pipeline editor). Now, this is where the magic happens.

Kubeflow takes over full control and automatically packages each of the pipeline’s components into self-contained container images (pssst, the data scientists don’t have to know!). Afterwards, Kubeflow lets the Kubernetes cluster spawn and orchestrate pods from those images according to the pipeline specification. In particular, Kubernetes ensures that data is passed from one pod to the next once the former has finished its task. Eventually, the whole pipeline is executed on the cluster. Data scientists can follow the state of execution graphically in Kubeflow’s UI.
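
The ordering and data-passing behavior described above can be sketched in a few lines of plain Python. This is only an analogy, not how Kubeflow or Kubernetes work internally: ordinary function calls stand in for pods, a dictionary stands in for the pipeline specification, and all names (COMPONENTS, DEPENDENCIES, run_pipeline) are made up for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def load_data(upstream):
    # First step: no upstream inputs, returns some toy data.
    return [1.0, 2.0, 3.0, 4.0]


def prepare_data(upstream):
    # Scale the raw values by their maximum.
    raw = upstream["load_data"]
    return [x / max(raw) for x in raw]


def train_model(upstream):
    # Toy "model": the mean of the prepared data.
    data = upstream["prepare_data"]
    return sum(data) / len(data)


# Hypothetical stand-ins for components and the pipeline specification.
COMPONENTS = {
    "load_data": load_data,
    "prepare_data": prepare_data,
    "train_model": train_model,
}
DEPENDENCIES = {
    "load_data": [],
    "prepare_data": ["load_data"],
    "train_model": ["prepare_data"],
}


def run_pipeline(components, dependencies):
    """Run each component after its dependencies, passing their outputs on."""
    outputs = {}
    for name in TopologicalSorter(dependencies).static_order():
        # Hand a step only the outputs of the steps it depends on.
        upstream = {dep: outputs[dep] for dep in dependencies.get(name, ())}
        outputs[name] = components[name](upstream)
    return outputs
```

Calling run_pipeline(COMPONENTS, DEPENDENCIES) executes the three steps in dependency order and returns every step's output. In a real cluster, each step would run in its own pod and the outputs would travel between pods rather than through an in-memory dictionary.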

The big advantage is that data scientists can fully leverage Kubernetes without having to know about it. At the same time, IT operations can profile, meter, and manage Kubeflow-created resources uniformly, like any other workload they observe in their Kubernetes cluster.

Running Kubeflow on Power

IBM Power servers shine with their efficiency for AI and container workloads, their security, and their reliability. These benefits make Power a perfect deployment target for Kubeflow in production. Moreover, we created a rich ecosystem of reusable Kubeflow artifacts for Power with reusable components, adaptable Jupyter Notebook environments for development, security-hardened container images, and end-to-end examples – just check out my public GitHub repository.

We also made the installation on IBM Power super easy. An installation guide is provided in my repository, but the gist is the following: assuming you have either an OpenShift or a vanilla Kubernetes cluster installed on Power, simply execute these lines from a server node (bastion on OpenShift; master on vanilla Kubernetes) and follow the instructions to get Kubeflow installed:

wget https://raw.githubusercontent.com/lehrig/kubeflow-ppc64le-manifests/main/install_kubeflow.sh

source install_kubeflow.sh

Listing 1. Installing Kubeflow on IBM Power

Moving on with Kubeflow on Power

Now that you have installed Kubeflow, start using it. First, I recommend starting up and connecting to a notebook server (as shown in Figure 3) to get access to an in-cluster development environment.

Figure 3. Connecting to a notebook server for developing pipelines in Jupyter Lab

From here, load one of the examples from my examples repository, execute it, experiment, and have fun!

You have now gained a high-level understanding of Kubeflow on IBM Power. Stay tuned for more blogs on this topic; we’ve got lots of cool new features and success stories to share.
