Object detection
In this exercise, you will use the same Flickr dataset as before, which contains 30,000 images with associated captions. This time, you will find the bounding boxes of the objects that the model detects.

The sample image (image) and the pipeline function (pipeline) have already been loaded.
This exercise is part of the course
Multi-Modal Models with Hugging Face
Exercise instructions
- Load the object-detection pipeline with the facebook/detr-resnet-50 pretrained model.
- Find the label of the detected object.
- Find the associated confidence score of the detected object.
- Find the bounding box coordinates of the detected object.
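The instructions above refer to three fields of each detection: label, score, and box. As a rough sketch of what such a result might look like, the snippet below uses hand-written, hypothetical predictions (the labels, scores, and coordinates are made up, not model output) in the list-of-dictionaries shape the exercise code iterates over. The helper keep_confident and its min_score default are illustrative assumptions, not part of the course material.

```python
# Hypothetical detections, written by hand to mimic the shape of the
# object-detection output; these values are NOT real model predictions.
sample_outputs = [
    {"score": 0.998, "label": "dog",
     "box": {"xmin": 40, "ymin": 60, "xmax": 310, "ymax": 420}},
    {"score": 0.412, "label": "frisbee",
     "box": {"xmin": 330, "ymin": 90, "xmax": 400, "ymax": 150}},
]


def keep_confident(outputs, min_score=0.9):
    """Return only the detections whose score is at least min_score.

    Illustrative helper; the threshold value is an assumption.
    """
    return [obj for obj in outputs if obj["score"] >= min_score]


confident = keep_confident(sample_outputs)
for obj in confident:
    # Each detection exposes its class name, confidence, and corner coordinates
    print(obj["label"], round(obj["score"], 2), obj["box"])
```

Filtering on the score like this is one simple way to ignore low-confidence boxes before plotting them.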
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Load the object-detection pipeline
pipe = pipeline("____", "____", revision="no_timm")
# Run detection once on the sample image
outputs = pipe(image)

for n, obj in enumerate(outputs):
    # Find the detected label
    label = ____
    # Find the confidence score of the prediction
    confidence = ____
    # Obtain the bounding box coordinates
    box = ____
    # Draw the bounding box as an unfilled rectangle on the image axes
    plot_args = {"linewidth": 1, "edgecolor": colors[n], "facecolor": 'none'}
    rect = patches.Rectangle((box['xmin'], box['ymin']), box['xmax']-box['xmin'], box['ymax']-box['ymin'], **plot_args)
    ax.add_patch(rect)
    print(f"Detected {label} with confidence {confidence:.2f} at ({box['xmin']}, {box['ymin']}) to ({box['xmax']}, {box['ymax']})")

plt.show()
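The plotting step above turns corner coordinates (xmin, ymin, xmax, ymax) into the anchor point, width, and height that a rectangle patch expects. A minimal sketch of that arithmetic in isolation follows; the helper name box_to_rect and the sample box are hypothetical and used only for illustration.

```python
def box_to_rect(box):
    """Convert a corner-format box to (x, y, width, height).

    box is a dict with the keys xmin, ymin, xmax, and ymax.
    """
    width = box["xmax"] - box["xmin"]
    height = box["ymax"] - box["ymin"]
    return box["xmin"], box["ymin"], width, height


# Hypothetical box, not produced by a model
sample_box = {"xmin": 40, "ymin": 60, "xmax": 310, "ymax": 420}
x, y, width, height = box_to_rect(sample_box)
print(x, y, width, height)  # 40 60 270 360
```

The resulting values map directly onto the arguments of patches.Rectangle((x, y), width, height), which is how the exercise code draws each detection.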