Data and AI on Power


How to Set Up a Rust Project to Leverage MMA Optimizations on IBM Power10 Systems

By Daniel Schenker posted Thu March 07, 2024 03:41 PM


This blog details the steps required to set up a Rust project with MMA optimizations on IBM Power10 systems.


This blog assumes that conda is already installed. If it is not, refer to the blog post by Sebastian Lehrig on getting conda set up on Power.

Environment Setup

Create a new conda environment.

conda create --name your-env-name-here python=3.11

This creates a new environment and installs Python 3.11 along with its required dependencies.

Activate the newly created environment.

conda activate your-env-name-here

Once the environment is active, install the required packages.

conda install rust -c rocketce

conda install gcc gfortran -c conda-forge

When the conda install command is given the -c argument, conda attempts to install the packages from the specified channel. Packages installed from the rocketce channel are built with MMA optimizations.

Project Setup

Create a new Rust project.

cargo new your-project-name-here

This creates a new directory with the provided project name. Navigate to the new project directory. It contains a Cargo.toml file as well as a main.rs file inside the src directory.

  • main.rs is the main script that will be run and contains a default “Hello World!” program
  • Cargo.toml is the file in which project dependencies and libraries are set

Open Cargo.toml and add the following lines under the [dependencies] section.

blas = "0.22.0"

openblas-src = "0.10.9"

These lines add BLAS functionality to the Rust project. Both are external crates: blas and openblas-src. The Cargo.toml file should look as follows.

[package]
name = "rust"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
blas = "0.22.0"
openblas-src = "0.10.9"

Save the file and then open src/main.rs. It contains boilerplate code that simply prints “Hello World!” to the console. Replace the existing code with the following.

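// Pull in openblas-src so the OpenBLAS library is linked into the binary.
extern crate openblas_src;
use blas::*;

fn main() {
    // A is m x k, B is k x n, C is m x n; all are stored column-major.
    let (m, n, k) = (2, 4, 3);
    let a = vec![
        1.0, 4.0,
        2.0, 5.0,
        3.0, 6.0,
    ];
    let b = vec![
        1.0, 5.0,  9.0,
        2.0, 6.0, 10.0,
        3.0, 7.0, 11.0,
        4.0, 8.0, 12.0,
    ];
    let mut c = vec![
        2.0, 7.0,
        6.0, 2.0,
        0.0, 7.0,
        4.0, 2.0,
    ];

    // Computes C = 1.0 * A * B + 1.0 * C in place.
    unsafe {
        dgemm(b'N', b'N', m, n, k, 1.0, &a, m, &b, k, 1.0, &mut c, m);
    }

    assert!(
        c == vec![
            40.0,  90.0,
            50.0, 100.0,
            50.0, 120.0,
            60.0, 130.0,
        ]
    );
}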

This script contains sample code that carries out a simple matrix multiplication using the dgemm function provided by BLAS. This script was created using the sample given on the BLAS Crate page.
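To make the column-major layout and the expected values easier to follow, here is a minimal pure-Rust sketch of the same computation, C = alpha * A * B + beta * C, with no BLAS dependency. The function name gemm_ref is hypothetical and is not part of the blas crate. It is meant only as a reference for checking results and will not use MMA.

```rust
/// Reference GEMM for column-major matrices without transposition:
/// C = alpha * A * B + beta * C, where A is m x k, B is k x n, C is m x n.
/// `gemm_ref` is a hypothetical helper, not a function from the blas crate.
fn gemm_ref(
    m: usize,
    n: usize,
    k: usize,
    alpha: f64,
    a: &[f64],
    b: &[f64],
    beta: f64,
    c: &mut [f64],
) {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    assert_eq!(c.len(), m * n);
    for j in 0..n {
        for i in 0..m {
            let mut acc = 0.0;
            for p in 0..k {
                // Column-major: element (row, col) lives at index row + col * rows.
                acc += a[i + p * m] * b[p + j * k];
            }
            c[i + j * m] = alpha * acc + beta * c[i + j * m];
        }
    }
}

fn main() {
    let (m, n, k) = (2, 4, 3);
    let a = vec![1.0, 4.0, 2.0, 5.0, 3.0, 6.0];
    let b = vec![
        1.0, 5.0, 9.0, 2.0, 6.0, 10.0, 3.0, 7.0, 11.0, 4.0, 8.0, 12.0,
    ];
    let mut c = vec![2.0, 7.0, 6.0, 2.0, 0.0, 7.0, 4.0, 2.0];
    gemm_ref(m, n, k, 1.0, &a, &b, 1.0, &mut c);
    // Prints [40.0, 90.0, 50.0, 100.0, 50.0, 120.0, 60.0, 130.0]
    println!("{:?}", c);
}
```

Each pair of values in the printed vector is one column of the 2 x 4 result, which matches the values asserted in the dgemm sample.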

The final step before the project can be built and run is to make sure that Cargo, the Rust build tool, knows which directories to search when linking the required libraries.

Create a new directory inside the base project directory called .cargo.

Create a new file inside this directory called config.toml.

Open config.toml and paste the following lines.

[build]
rustflags = "-L /home/<username>/anaconda3/envs/<envname>/lib"

This setting gives the Rust build tool the directory in which the conda environment stores its installed libraries. The exact path may differ from system to system; the path shown is the default installation path for Anaconda on IBM Power10 systems. Be sure to replace <username> and <envname> with the appropriate names.
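As a possible alternative to hard-coding the path, the same search directory could be supplied from a Cargo build script. The sketch below assumes a build.rs file placed next to Cargo.toml and relies on the CONDA_PREFIX environment variable that conda sets when an environment is activated. The helper name link_search_directive is hypothetical, and this approach has not been verified on a Power10 system.

```rust
// build.rs (assumed to sit next to Cargo.toml)
use std::env;

/// Formats the Cargo directive that adds `<prefix>/lib` to the library search path.
/// `link_search_directive` is a hypothetical helper used only in this sketch.
fn link_search_directive(conda_prefix: &str) -> String {
    format!(
        "cargo:rustc-link-search=native={}/lib",
        conda_prefix.trim_end_matches('/')
    )
}

fn main() {
    // Re-run this script if a different conda environment is activated.
    println!("cargo:rerun-if-env-changed=CONDA_PREFIX");

    match env::var("CONDA_PREFIX") {
        // Equivalent in spirit to passing `-L <prefix>/lib` through rustflags.
        Ok(prefix) => println!("{}", link_search_directive(&prefix)),
        Err(_) => println!(
            "cargo:warning=CONDA_PREFIX is not set; activate the conda environment first"
        ),
    }
}
```

With this approach the build picks up whichever environment is active, at the cost of requiring conda activate before every cargo build or cargo run.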

With this completed, return to the base project directory.


The project can be run with the cargo run command. The first run also builds the project, so expect it to take slightly longer than later runs.

To check MMA usage run the project with the following command.

perf stat -e r1000E cargo run

This command outputs the number of MMA events that occur during execution. Sample output is as follows.

[Image: sample output displaying MMA utilization]

As seen in the above output, there were 11 MMA events that occurred during the execution of this project. This number will change based on the complexity and number of operations being carried out.


This blog detailed the steps required to set up a Rust project with MMA optimizations on IBM Power10 systems. A basic matrix multiplication script was created and MMA utilization was confirmed. The script is a starting point for optimized matrix operations on IBM Power10 systems and can be extended for more specific use cases.