Segmenting with pre-trained Mask R-CNN

In this exercise, you will use the pre-trained Mask R-CNN model to perform instance segmentation on the following image of two cats.

[Image: two cats]

The model you will use has been pre-trained on the COCO dataset, which contains images of common objects, including animals. Because cat is one of the COCO categories, the model should recognize cats out of the box, without any fine-tuning.

Your task is to load the model and the two cats image, prepare the image, and pass it to the model to obtain the predictions. The following have been imported for you: Image from PIL, torch, transforms from torchvision, and maskrcnn_resnet50_fpn from torchvision.models.detection.

This exercise is part of the course

Deep Learning for Images with PyTorch


Exercise instructions

Load the pre-trained Mask R-CNN using maskrcnn_resnet50_fpn() and assign it to model.
Transform the two cats image to a tensor and unsqueeze it to add a batch dimension.
Perform inference by passing the image tensor to the model and assign the output to prediction.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Load a pre-trained Mask R-CNN model
model = ____(____)
model.eval()

# Load an image and convert to a tensor
image = Image.open("two_cats.jpg")
transform = transforms.Compose([transforms.ToTensor()])
image_tensor = transform(image).____

# Perform inference
with torch.no_grad():
    prediction = ____
    print(prediction)
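
In evaluation mode, torchvision detection models return a list with one dictionary per input image, holding boxes, labels, scores, and masks. The sketch below shows one possible way to read such an output. It uses a hand-made prediction as a stand-in for the real model output, and the 0.5 thresholds are illustrative choices, not values prescribed by the exercise.

```python
import torch

# Hypothetical stand-in for the model output on one 48x64 image
# with three detections (label 17 is "cat" in torchvision's COCO list)
prediction = [{
    "boxes": torch.tensor([[2.0, 3.0, 30.0, 40.0],
                           [20.0, 5.0, 60.0, 45.0],
                           [0.0, 0.0, 10.0, 10.0]]),
    "labels": torch.tensor([17, 17, 3]),
    "scores": torch.tensor([0.98, 0.91, 0.12]),
    "masks": torch.rand(3, 1, 48, 64),  # soft masks with values in [0, 1)
}]

result = prediction[0]                  # one dictionary per input image
keep = result["scores"] > 0.5           # illustrative confidence threshold

boxes = result["boxes"][keep]
labels = result["labels"][keep]
binary_masks = result["masks"][keep] > 0.5  # boolean mask per kept instance

print(labels.tolist(), boxes.shape, binary_masks.shape)
```

Each kept instance has its own mask of shape [1, height, width], which is what distinguishes instance segmentation from a single per-image class map.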
