How to Run Batch Inferencing with PyTorch on IBM Power10 Using a ResNet Model

By Daniel Schenker posted Thu February 22, 2024 10:13 AM

  

This blog builds on the previous PyTorch ResNet blog. It details the steps required to run batch inferencing with PyTorch on IBM Power10 systems using a ResNet model.

Prerequisites

This blog assumes that conda is already installed. If it is not, follow the blog post by Sebastian Lehrig to set up conda on Power.

Environment Setup

Create a new conda environment.

conda create --name your-env-name-here python=3.11

This creates a new environment and installs Python 3.11 along with its required dependencies.

Activate the newly created environment.

conda activate your-env-name-here

Once the environment is active, install OpenBLAS, PyTorch, and their dependencies.

conda install libopenblas -c rocketce

conda install pytorch-cpu -c rocketce

When the conda install command is given the -c argument, conda attempts to install the packages from the specified channel. Packages installed from the rocketce channel are built with optimizations for MMA (Matrix-Multiply Assist), the matrix math acceleration feature of Power10.
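
Before moving on, it can be useful to confirm that the environment looks as expected. The following is a minimal sketch, not part of the original setup: the helper name describe_environment is hypothetical, and it only uses the Python standard library to report the CPU architecture (typically ppc64le on Linux on Power) and whether the packages the script imports later can be found.

```python
import importlib.util
import platform


def describe_environment(packages=("torch", "torchvision")):
    """Report the CPU architecture and whether each package can be imported."""
    report = {
        "machine": platform.machine(),
        "python": platform.python_version(),
    }
    for name in packages:
        # find_spec returns None when a top-level package is not installed
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If either package is reported as False, the classification script shown later will fail at import time, so it is worth resolving that first.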

Project Setup

Navigate to a desired project directory and download the ImageNet labels.

wget https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt
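
The script later indexes this file by the model's predicted class number, so a truncated or empty download would only show up as an error during classification. As a hedged sketch, the hypothetical helper load_labels below reads one label per line and checks the count. The default of 1000 is an assumption based on the 1000 classes of the ImageNet-1k dataset that the pretrained ResNet weights predict.

```python
from pathlib import Path


def load_labels(path="imagenet_classes.txt", expected=1000):
    """Read one class label per line and check how many were found."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Ignore blank lines and surrounding whitespace
    labels = [line.strip() for line in lines if line.strip()]
    if len(labels) != expected:
        raise ValueError(f"expected {expected} labels, found {len(labels)}")
    return labels
```

Calling load_labels() from the project directory either returns the list of labels or raises a ValueError that points at a bad download.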

Create a new Python script inside the project directory.

touch resnet_batch.py

Open the Python script with any text editor or IDE (vi, vim, nano, vscode, etc.) and paste the following code.

import torch
import argparse
import os
from torchvision import transforms
from torchvision import models
from torchvision import datasets

# Classify a batch of images using a pretrained resnet model
def classifyImage(image_path, batch_size):
    # Ensure that the provided image path exists
    if not os.path.exists(image_path):
        raise SystemExit('Image path not found. Check the provided path.')

    # Set default batch size if not provided
    if batch_size is None:
        batch_size = 1

    # Load the desired model and enable evaluation mode
    model = models.resnet18(weights='ResNet18_Weights.DEFAULT')
    model.eval()

    # Image preprocessing settings (deterministic resize and center crop for inference;
    # random crops and flips are training-time augmentations)
    data_transforms = {
        'predict': transforms.Compose([
            transforms.Resize(256),
            transforms.CenterCrop(224),
            transforms.ToTensor(),
            transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
        ])
    }

    # Organize data for batch processing
    dataset = {'predict' : datasets.ImageFolder(image_path, data_transforms['predict'])}
    dataloader = {'predict': torch.utils.data.DataLoader(dataset['predict'], batch_size=int(batch_size), shuffle=False, num_workers=5)}

    # Read ImageNet's class labels
    with open("imagenet_classes.txt", "r") as f:
        classLabels = [s.strip() for s in f.readlines()]

    # Run model and extract label
    for inputs, labels in dataloader['predict']:
        with torch.no_grad():
            output = model(inputs)
        # Classification output (iterate over the rows actually returned,
        # so a smaller final batch is handled as well)
        for i in range(output.shape[0]):
            index = output[i].argmax().item()
            print(classLabels[index])

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-i', '--image', help='Path to image directory', required=True)
    parser.add_argument('-b', '--batch_size', help='Batch size: Number of images to process at a time', required=False)
    args = parser.parse_args()
    classifyImage(args.image, args.batch_size)

The resnet_batch.py script uses command line arguments to specify the location of the images to classify and the desired batch size. The script parameters work as follows.

  • -i/--image is a required parameter and should be followed by the path to a directory containing the images to classify.
    • Note: The torchvision ImageFolder dataset, which the script uses to load the images, requires a specific directory layout. The format is /image_root_dir/image_sub_dir/(image files here)
    • Example image directory structure /image_root/images/*.jpg
  • -b/--batch_size is an optional parameter that can be followed by an integer representing the batch size. The batch size is the number of images that will be processed by the model at once.
    • For example, if the image folder contains 50 image files and a batch size of 10 is provided, the model will classify all 50 images in 5 passes of 10 images each.
    • Note: The total number of images being classified should be a multiple of the batch size. In other words, the batch size should divide the image count evenly.
    • Note: If this parameter is not provided a default batch size of 1 will be used.
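
The batch arithmetic described above can be sketched in a few lines. This is an illustration only, and plan_batches is a hypothetical helper that is not part of the script. It assumes the PyTorch DataLoader default of keeping a smaller final batch when the image count is not a multiple of the batch size.

```python
import math


def plan_batches(num_images, batch_size=1):
    """Return the number of images handled in each pass over the dataset."""
    if num_images < 0 or batch_size < 1:
        raise ValueError("num_images must be >= 0 and batch_size must be >= 1")
    passes = math.ceil(num_images / batch_size)
    # Every pass is full except possibly the last one
    return [min(batch_size, num_images - i * batch_size) for i in range(passes)]


print(plan_batches(50, 10))  # five passes of 10 images each
print(plan_batches(52, 10))  # five full passes plus a final pass of 2 images
```

A final entry smaller than the batch size marks exactly the case the note about multiples is meant to avoid.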

Execution

Once the script is complete, run the model and view the results.

python3 resnet_batch.py -i ./image_root/ -b 5

The script will output the classification for each image based on the ImageNet labels.
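
If the run reports that no images were found, the directory layout is a likely cause. The sketch below is one way to check it before running, under stated assumptions: count_images_per_subdir is a hypothetical helper, and IMAGE_SUFFIXES lists only a few common extensions rather than everything torchvision accepts.

```python
from pathlib import Path

# Assumption: a small subset of common image extensions; extend as needed
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_subdir(root):
    """Map each immediate subdirectory of root to its number of image files."""
    counts = {}
    for sub in sorted(Path(root).iterdir()):
        if sub.is_dir():
            counts[sub.name] = sum(
                1
                for f in sub.rglob("*")
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts
```

For the example layout /image_root/images/*.jpg, calling count_images_per_subdir("./image_root") should return a single entry for images with the number of files in it. Image files placed directly in the root directory are not counted, which mirrors the requirement that images live in a subdirectory.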

Conclusion

This blog detailed the steps required to run batch inferencing with PyTorch on IBM Power10 systems using a ResNet model. It improves on the previous PyTorch ResNet blog by processing images in batches, which increases the overall efficiency of the script.
