How to Run a ResNet Model on IBM Power10 Using PyTorch

By Daniel Schenker posted Tue January 30, 2024 11:50 AM

  

This blog details the steps required to run inference with PyTorch on IBM Power10 systems using a ResNet model. A ResNet (residual network) is a deep neural network architecture designed to combat the vanishing gradient problem, allowing very deep networks to be trained effectively. ResNet models are used for various computer vision tasks, and this blog demonstrates their image classification capabilities.
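
As a rough sketch of why this architecture helps with vanishing gradients (the notation is an assumption for illustration and is not taken from the PyTorch documentation): a residual block adds its input x to the output of its learned layers F(x). The derivative of the block then contains an identity term, so a gradient can still flow backward through the block even when the gradient through F is small.

```latex
y = \mathcal{F}(x) + x
\qquad\Longrightarrow\qquad
\frac{\partial y}{\partial x} = \frac{\partial \mathcal{F}}{\partial x} + I
```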

Prerequisites

This blog assumes the user already has conda installed. If needed, use the following blog post by Sebastian Lehrig to get conda set up on Power.

Environment Setup

Create a new conda environment.

conda create --name your-env-name-here python=3.11

This creates a new environment and installs Python 3.11 and its required dependencies.

Activate the newly created environment.

conda activate your-env-name-here

Once the environment is active, install OpenBLAS, PyTorch, and their dependencies.

conda install libopenblas -c rocketce

conda install pytorch-cpu -c rocketce

When the conda install command is used with the -c argument, conda attempts to install the packages from the specified channel. Packages installed from the rocketce channel include optimizations for Power10's Matrix-Multiply Assist (MMA) feature.
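
The scripts later in this blog import torch, torchvision, and PIL, so it may be worth confirming that all three can be found in the active environment before continuing. The following is a minimal sketch using only the Python standard library; the helper name find_missing and the REQUIRED_MODULES list are hypothetical and not part of any package.

```python
import importlib.util

# Modules imported by the scripts in this blog.
# "PIL" is the import name of the Pillow package.
REQUIRED_MODULES = ["torch", "torchvision", "PIL"]


def find_missing(modules):
    """Return the module names that cannot be found in the active environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

If any module is reported missing, install it into the same conda environment before running the scripts.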

Project Setup

Navigate to a desired project directory and download the ImageNet labels.

wget https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt

Create a new python script inside the project directory.

touch resnet.py

Open the python script with any text editor or IDE (vi, vim, nano, vscode, etc…) and paste the following code.

import torch

model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', pretrained=True)
# or any of these variants
# model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet34', pretrained=True)
# model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet50', pretrained=True)
# model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet101', pretrained=True)
# model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet152', pretrained=True)
model.eval()

# Download an example image from the PyTorch Hub GitHub repository
import urllib.request
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
# urllib.URLopener only exists in Python 2; Python 3 uses urllib.request
urllib.request.urlretrieve(url, filename)

# Sample execution (requires torchvision)
from PIL import Image
from torchvision import transforms
input_image = Image.open(filename)
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0) # create a mini-batch as expected by the model

# Move the input and model to GPU for speed if available
if torch.cuda.is_available():
    input_batch = input_batch.to('cuda')
    model.to('cuda')

with torch.no_grad():
    output = model(input_batch)
# Tensor of shape 1000, with confidence scores over ImageNet's 1000 classes
print(output[0])
# The output has unnormalized scores. To get probabilities, you can run a softmax on it.
probabilities = torch.nn.functional.softmax(output[0], dim=0)
print(probabilities)

# Read the categories
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Show top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())

This script was put together using the PyTorch ResNet tutorial page and acts as a basic implementation of the pre-trained residual network. The script grabs an image of a dog from the PyTorch GitHub, and attempts to classify it.
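
To make the preprocessing in the script more concrete, the sketch below reproduces the arithmetic in plain Python for a hypothetical 640x480 image and a single pixel. The helper names resized_size and normalize_pixel are made up for illustration, and the truncation in the resize step is an assumption about how torchvision rounds; the mean and standard deviation values are the ones used in the script.

```python
# Per-channel ImageNet statistics passed to transforms.Normalize in the script.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def resized_size(width, height, size=256):
    """Approximate transforms.Resize(size): scale the shorter side to `size`."""
    short, long = (width, height) if width <= height else (height, width)
    new_long = int(size * long / short)  # assumption: the result is truncated
    return (size, new_long) if width <= height else (new_long, size)


def normalize_pixel(rgb):
    """Mimic ToTensor followed by Normalize for one 8-bit RGB pixel."""
    scaled = [value / 255.0 for value in rgb]  # ToTensor: [0, 255] -> [0.0, 1.0]
    return [(s - m) / d for s, m, d in zip(scaled, MEAN, STD)]


if __name__ == "__main__":
    # Hypothetical 640x480 input; CenterCrop(224) then keeps the central 224x224 region.
    print(resized_size(640, 480))
    print(normalize_pixel((255, 255, 255)))
```

The aspect ratio is preserved by the resize step, and the crop is what produces the fixed 224x224 input used by the script.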

Execution

Once the script is complete, run the model and view the results.

python3 resnet.py

The expected output is of the form “Class Label” “Confidence/Probability”

Sample output from resnet.py (the final five lines, printed after the raw score and probability tensors):

Samoyed 0.8846219182014465

Arctic fox 0.0458051897585392

white wolf 0.04427671432495117

Pomeranian 0.005621395539492369

Great Pyrenees 0.004652050323784351
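
The step from raw scores to the ranked probabilities shown above can be sketched in plain Python. This is an illustration of what the softmax and top-k calls compute, not the PyTorch implementation; the three labels and scores below are made-up values, whereas the real output tensor has 1000 entries.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities, labels, k=5):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical scores for three classes, for illustration only.
    labels = ["Samoyed", "Arctic fox", "white wolf"]
    scores = [4.0, 1.0, 0.5]
    for label, prob in top_k(softmax(scores), labels, k=2):
        print(label, prob)
```

Because softmax preserves the ordering of the raw scores, the class with the highest score is always the class with the highest probability.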

Improvements

With a basic residual network up and running, the next step is to expand the script’s capabilities to classify images other than a single stock dog image. The existing script was edited to create the following modified script.

import torch
import argparse
import os
import sys
import urllib.request
from PIL import Image
from torchvision import transforms
from torchvision import models

# Classify an image using a pretrained resnet model
def classifyImage(image, layers, debug):
    # Check input parameters
    if image is not None:
        # Ensure that the provided image exists
        if checkImagePath(image):
            # Convert to RGB so grayscale or RGBA images match the 3-channel normalization
            inputImage = Image.open(image).convert('RGB')
        else:
            print('Image not found. Check the provided path.')
            sys.exit(1)
    else:
        # Use stock dog image if custom image was not given
        url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
        # urllib.URLopener only exists in Python 2; Python 3 uses urllib.request
        urllib.request.urlretrieve(url, filename)
        inputImage = Image.open(filename)

    if layers is None:
        # Use default 18 layer model if layers was not passed
        layers = 18

    # Load the desired model and enable evaluation mode
    model = getModel(layers)
    model.eval()

    # Image preprocessing settings
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

    # Preprocess image
    inputTensor = preprocess(inputImage)

    # Create a mini-batch as expected by the model
    inputBatch = inputTensor.unsqueeze(0)

    # Move the input and model to GPU if available
    if torch.cuda.is_available():
        inputBatch = inputBatch.to('cuda')
        model.to('cuda')

    # Run the model without gradient calculation
    with torch.no_grad():
        output = model(inputBatch)
    
    # Run softmax on output tensor to get probabilities
    probabilities = torch.nn.functional.softmax(output[0], dim=0)
    
    # Print intermediate tensor values
    if debug:
        # Output tensor of shape 1000 with confidence scores over ImageNet's 1000 class labels
        print('Raw Output Tensor:')
        print(output[0])
        # Tensor of shape 1000 with probabilities over ImageNet's 1000 class labels
        print('Probability Tensor:')
        print(probabilities)

    # Read ImageNet's class labels
    with open("imagenet_classes.txt", "r") as f:
        classLabels = [s.strip() for s in f.readlines()]

    # Print top categories per image
    top5Prob, top5CatId = torch.topk(probabilities, 5)
    print('Top 5 results:')
    for i in range(top5Prob.size(0)):
        print(f'{classLabels[top5CatId[i]]} : {top5Prob[i].item()}')

# Ensure that the user provided image path exists
def checkImagePath(imagePath):
    return os.path.exists(imagePath)

# Get a pretrained resnet model with the desired layers
def getModel(layers):
    match layers:
        case 18:
            return models.resnet18(weights='ResNet18_Weights.DEFAULT')
        case 34:
            return models.resnet34(weights='ResNet34_Weights.DEFAULT')
        case 50:
            return models.resnet50(weights='ResNet50_Weights.DEFAULT')
        case 101:
            return models.resnet101(weights='ResNet101_Weights.DEFAULT')
        case 152:
            return models.resnet152(weights='ResNet152_Weights.DEFAULT')
        case _:
            print('Invalid number of layers, defaulting to resnet18')
            return models.resnet18(weights='ResNet18_Weights.DEFAULT')

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-i', '--image', help='Absolute path to image', required=False)
    parser.add_argument('-l', '--layers', help='Number of layers in model', required=False, choices=[18, 34, 50, 101, 152], type=int)
    parser.add_argument('-d', '--debug', help='Enable debug mode', required=False, action='store_true')
    args = parser.parse_args()

    classifyImage(args.image, args.layers, args.debug) 

This script works just like the original script, but has the option to pass in script parameters to specify an image to classify, the number of layers to use in the model, and debug mode. The optional script parameters work as follows.

- Use -i/--image followed by an absolute path to an image to classify that image instead of the stock dog image.

- Use -l/--layers to specify the number of layers in the pre-trained model. The choices are [18, 34, 50, 101, 152], and a default of 18 is used if this parameter is not supplied.

- Use -d/--debug to enable debug mode, which prints intermediate results such as the raw and probability tensors.

Each of these parameters is optional. An example invocation is shown below.

python3 resnet.py -i /abs/path/to/image -l 18 -d
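
To see how these arguments are parsed without loading a model, the parser can be exercised on its own. The sketch below recreates the three arguments from the script in a hypothetical build_parser helper (the help strings are omitted for brevity) and passes argument lists directly to parse_args instead of reading the real command line.

```python
import argparse


def build_parser():
    """Recreate the three optional arguments defined in resnet.py."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-i', '--image', required=False)
    parser.add_argument('-l', '--layers', required=False,
                        choices=[18, 34, 50, 101, 152], type=int)
    parser.add_argument('-d', '--debug', required=False, action='store_true')
    return parser


if __name__ == "__main__":
    parser = build_parser()
    # No arguments: image and layers are None, debug is False.
    print(parser.parse_args([]))
    # Same flags as the example invocation, with a 50-layer model selected.
    print(parser.parse_args(['-i', '/abs/path/to/image', '-l', '50', '-d']))
```

Because choices is combined with type=int, a value outside the listed layer counts is rejected by argparse before classifyImage is ever called.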

Conclusion

This blog detailed the steps required to run inference with PyTorch on IBM Power10 systems using a ResNet model. The basic implementation was then extended with command-line options for better usability. The next steps are to further improve the script and tailor it to more specific use cases and needs.
