This blog details the steps required to run inference with PyTorch on IBM Power10 systems using a ResNet model. ResNet (residual network) is a deep neural network architecture designed to combat the vanishing gradient problem, allowing very deep networks to be trained effectively. ResNet models are used for a variety of computer vision tasks, and this blog demonstrates their image classification capabilities.
Prerequisites
This blog assumes that conda is already installed. If it is not, use the following blog post by Sebastian Lehrig to set up conda on Power.
Environment Setup
Create a new conda environment.
conda create --name your-env-name-here python=3.11
This creates a new environment and installs Python 3.11 and its required dependencies.
Activate the newly created environment.
conda activate your-env-name-here
Once the environment is active, install OpenBLAS, PyTorch, and their dependencies.
conda install libopenblas -c rocketce
conda install pytorch-cpu -c rocketce
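When conda install is run with the -c argument, conda attempts to install the packages from the specified channel. Packages installed from the rocketce channel include MMA (Matrix-Multiply Assist) optimizations for Power10. The scripts in this blog also import torchvision and PIL (Pillow), so both need to be available in the environment.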
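Before moving on, it can help to confirm that the environment looks the way these steps intend. The following is a minimal sketch and not part of the original setup; the helper name `describe_environment` is hypothetical. It uses only the standard library to report the machine architecture (Linux on Power typically reports `ppc64le`) and whether the `torch`, `torchvision`, and `PIL` modules can be found, without importing them.

```python
import importlib.util
import platform
import sys


def describe_environment(modules=("torch", "torchvision", "PIL")):
    """Return basic facts about the interpreter and which modules can be found."""
    return {
        "python": ".".join(str(n) for n in sys.version_info[:3]),
        "machine": platform.machine(),
        # find_spec returns None when a top-level module is not installed
        "modules": {name: importlib.util.find_spec(name) is not None for name in modules},
    }


if __name__ == "__main__":
    info = describe_environment()
    print(f"Python {info['python']} on {info['machine']}")
    for name, found in info["modules"].items():
        print(f"{name}: {'found' if found else 'missing'}")
```

Any module reported as missing needs to be installed into the active conda environment before the scripts below will run.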
Project Setup
Navigate to a desired project directory and download the ImageNet labels.
wget https://raw.githubusercontent.com/pytorch/hub/master/imagenet_classes.txt
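The labels file is plain text with one class name per line, and the scripts below use the line position as the class ID. As a small sketch of how the file can be read and checked on its own (the helper name `load_labels` is hypothetical and not part of the original scripts):

```python
from pathlib import Path


def load_labels(path="imagenet_classes.txt"):
    """Read one class label per line; the list index serves as the class ID."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines]
```

Calling `len(load_labels())` in the project directory is a quick way to confirm the download succeeded; the model used below scores ImageNet's 1000 classes, so the file is expected to contain 1000 labels.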
Create a new Python script inside the project directory.
touch resnet.py
Open the Python script in any text editor or IDE (vi, vim, nano, VS Code, etc.) and paste the following code.
import torch
model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', pretrained=True)
# or any of these variants
# model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet34', pretrained=True)
# model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet50', pretrained=True)
# model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet101', pretrained=True)
# model = torch.hub.load('pytorch/vision:v0.10.0', 'resnet152', pretrained=True)
model.eval()
# Download an example image from the pytorch website
import urllib.request
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)
# Sample execution (requires torchvision)
from PIL import Image
from torchvision import transforms
input_image = Image.open(filename)
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0) # create a mini-batch as expected by the model
# Move the input and model to GPU for speed if available
if torch.cuda.is_available():
    input_batch = input_batch.to('cuda')
    model.to('cuda')
with torch.no_grad():
    output = model(input_batch)
# Tensor of shape 1000, with confidence scores over ImageNet's 1000 classes
print(output[0])
# The output has unnormalized scores. To get probabilities, you can run a softmax on it.
probabilities = torch.nn.functional.softmax(output[0], dim=0)
print(probabilities)
# Read the categories
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Show top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())
This script is adapted from the PyTorch ResNet tutorial page and is a basic use of the pre-trained residual network. It downloads an image of a dog from the PyTorch GitHub repository and attempts to classify it.
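The preprocessing step in the script is worth a closer look, because the model expects inputs scaled the same way as its training data. As a rough pure-Python sketch of the arithmetic for a single pixel (the helper name `normalize_pixel` is hypothetical; the mean and standard deviation values are the ones passed to `transforms.Normalize` in the script): `ToTensor` scales 8-bit channel values to the range 0 to 1, and `Normalize` then subtracts the per-channel mean and divides by the per-channel standard deviation.

```python
MEAN = (0.485, 0.456, 0.406)  # per-channel means used in the script
STD = (0.229, 0.224, 0.225)   # per-channel standard deviations used in the script


def normalize_pixel(rgb):
    """Map an 8-bit (R, G, B) pixel to the values the model receives."""
    # Scale 0..255 to 0..1, as ToTensor does for 8-bit images
    scaled = [channel / 255.0 for channel in rgb]
    # Subtract the mean and divide by the standard deviation, as Normalize does
    return [(x - m) / s for x, m, s in zip(scaled, MEAN, STD)]


print(normalize_pixel((255, 255, 255)))  # a pure white pixel
print(normalize_pixel((0, 0, 0)))        # a pure black pixel
```

The real transforms apply this to every pixel of the resized and center-cropped 224 x 224 image at once; the sketch only illustrates the per-channel formula.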
Execution
Once the script is complete, run the model and view the results.
python3 resnet.py
The expected output is of the form “Class Label” “Confidence/Probability”. The script first prints the raw output and probability tensors, followed by the top five results.
Sample output from resnet.py:
Samoyed 0.8846219182014465
Arctic fox 0.0458051897585392
white wolf 0.04427671432495117
Pomeranian 0.005621395539492369
Great Pyrenees 0.004652050323784351
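The five values in the sample output add up to roughly 0.985; the remainder is spread across the other 995 classes, because softmax produces probabilities that sum to 1 over all 1000. As a minimal pure-Python sketch of what the softmax and top-k steps compute (the function names and the three scores below are made up for illustration; the script itself uses `torch.nn.functional.softmax` and `torch.topk`):

```python
import math


def softmax(scores):
    """Convert raw scores (logits) into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities, labels, k=5):
    """Return the k most probable (label, probability) pairs, highest first."""
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up scores for three classes, for illustration only
labels = ["Samoyed", "Arctic fox", "white wolf"]
scores = [4.0, 1.0, 0.5]
for label, prob in top_k(softmax(scores), labels, k=2):
    print(label, prob)
```

Softmax preserves the ordering of the raw scores, so the top class is the same before and after; the conversion only makes the values easier to read as confidences.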
Improvements
With a basic residual network up and running, the next step is to expand the script’s capabilities to classify images other than a single stock dog image. The existing script was edited to create the following modified script.
import torch
import argparse
import os
import sys
import urllib.request
from PIL import Image
from torchvision import transforms
from torchvision import models
# Classify an image using a pretrained resnet model
def classifyImage(image, layers, debug):
    # Check input parameters
    if image is not None:
        # Ensure that the provided image exists
        if checkImagePath(image):
            # Convert to RGB so grayscale or RGBA images match the 3-channel normalization
            inputImage = Image.open(image).convert('RGB')
        else:
            print('Image not found. Check the provided path.')
            sys.exit(1)
    else:
        # Use stock dog image if custom image was not given
        url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
        urllib.request.urlretrieve(url, filename)
        inputImage = Image.open(filename)
    if layers is None:
        # Use default 18 layer model if layers was not passed
        layers = 18
    # Load the desired model and enable evaluation mode
    model = getModel(layers)
    model.eval()
    # Image preprocessing settings
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    # Preprocess image
    inputTensor = preprocess(inputImage)
    # Create a mini-batch as expected by the model
    inputBatch = inputTensor.unsqueeze(0)
    # Move the input and model to GPU if available
    if torch.cuda.is_available():
        inputBatch = inputBatch.to('cuda')
        model.to('cuda')
    # Run the model without gradient calculation
    with torch.no_grad():
        output = model(inputBatch)
    # Run softmax on output tensor to get probabilities
    probabilities = torch.nn.functional.softmax(output[0], dim=0)
    # Print intermediate tensor values
    if debug:
        # Output tensor of shape 1000 with confidence scores over ImageNet's 1000 class labels
        print('Raw Output Tensor:')
        print(output[0])
        # Tensor of shape 1000 with probabilities over ImageNet's 1000 class labels
        print('Probability Tensor:')
        print(probabilities)
    # Read ImageNet's class labels
    with open("imagenet_classes.txt", "r") as f:
        classLabels = [s.strip() for s in f.readlines()]
    # Print top categories per image
    top5Prob, top5CatId = torch.topk(probabilities, 5)
    print('Top 5 results:')
    for i in range(top5Prob.size(0)):
        print(f'{classLabels[top5CatId[i]]} : {top5Prob[i].item()}')
# Ensure that the user provided image path exists
def checkImagePath(imagePath):
    return os.path.exists(imagePath)
# Get a pretrained resnet model with the desired layers
def getModel(layers):
    match layers:
        case 18:
            return models.resnet18(weights='ResNet18_Weights.DEFAULT')
        case 34:
            return models.resnet34(weights='ResNet34_Weights.DEFAULT')
        case 50:
            return models.resnet50(weights='ResNet50_Weights.DEFAULT')
        case 101:
            return models.resnet101(weights='ResNet101_Weights.DEFAULT')
        case 152:
            return models.resnet152(weights='ResNet152_Weights.DEFAULT')
        # Wildcard pattern: matches any other value
        case _:
            print('Invalid number of layers, defaulting to resnet18')
            return models.resnet18(weights='ResNet18_Weights.DEFAULT')
if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('-i', '--image', help='Absolute path to image', required=False)
    parser.add_argument('-l', '--layers', help='Number of layers in model', required=False, choices=[18, 34, 50, 101, 152], type=int)
    parser.add_argument('-d', '--debug', help='Enable debug mode', required=False, action='store_true')
    args = parser.parse_args()
    classifyImage(args.image, args.layers, args.debug)
This script works just like the original script, but accepts optional parameters that specify an image to classify, the number of layers in the model, and debug mode. The optional parameters work as follows.
- Use -i/--image followed by an absolute path to an image to classify that image instead of the stock dog image.
- Use -l/--layers to specify the number of layers in the pre-trained model. The choices are 18, 34, 50, 101, and 152; 18 is used if this parameter is not supplied.
- Use -d/--debug to enable debug mode, which prints intermediate results such as the raw output and probability tensors.
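To see how these flags behave without loading a model, the parser can be exercised on its own. The following is a small sketch that recreates the three arguments from the script (the wrapper function `build_parser` is hypothetical) and parses an explicit argument list instead of the real command line.

```python
import argparse


def build_parser():
    """Recreate the three optional flags described above."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-i", "--image", help="Absolute path to image")
    parser.add_argument("-l", "--layers", type=int, choices=[18, 34, 50, 101, 152],
                        help="Number of layers in model")
    parser.add_argument("-d", "--debug", action="store_true", help="Enable debug mode")
    return parser


# Parse an explicit list instead of sys.argv so the result is easy to inspect
args = build_parser().parse_args(["-i", "/abs/path/to/image", "-l", "50"])
print(args.image, args.layers, args.debug)

# With no flags at all, image and layers are None and debug is False
defaults = build_parser().parse_args([])
print(defaults.image, defaults.layers, defaults.debug)
```

Because `choices` is set on -l/--layers, argparse rejects an unsupported value such as 20 and exits with a usage error before any classification code runs.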
Each of these parameters is optional. An example invocation is as follows.
python3 resnet.py -i /abs/path/to/image -l 18 -d
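The getModel function in the script maps the layer count to a torchvision constructor with a match statement, which requires Python 3.10 or newer. As a sketch of one possible alternative, and not the approach the script takes, the constructor name and weights string can be built from the layer count, since both follow the same naming pattern in the script. The names `resnet_names`, `get_model`, and `SUPPORTED_LAYERS` are hypothetical, and `get_model` assumes torchvision is installed.

```python
SUPPORTED_LAYERS = (18, 34, 50, 101, 152)


def resnet_names(layers):
    """Return the torchvision constructor name and weights string for a depth."""
    if layers not in SUPPORTED_LAYERS:
        raise ValueError(f"Unsupported number of layers: {layers}")
    return f"resnet{layers}", f"ResNet{layers}_Weights.DEFAULT"


def get_model(layers=18):
    """Build a pretrained ResNet; torchvision is imported only when this is called."""
    from torchvision import models  # assumes torchvision is installed

    constructor_name, weights = resnet_names(layers)
    # Look up e.g. models.resnet50 by name and call it with the matching weights
    return getattr(models, constructor_name)(weights=weights)


print(resnet_names(50))
```

Unlike the script, this sketch raises an error for an unsupported depth instead of falling back to resnet18; which behavior is preferable depends on how the function is called.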
Conclusion
This blog detailed the steps required to run inference with PyTorch on IBM Power10 systems using a ResNet model. The basic implementation was then extended for better usability. The next steps are to further improve the script and tailor it to more specific use cases and needs.