How to return image with bounding box using AWS Rekognition?

When I upload an image to an S3 bucket and call AWS Rekognition detect_labels, I get a dictionary of detected labels like the one below:

{'Labels': [{'Name': 'Plant', 'Confidence': 99.70314025878906, 'Instances': [], 'Parents': []}, {'Name': 'Flower', 'Confidence': 98.37027740478516, 'Instances': [], 'Parents': [{'Name': 'Plant'}]}

But I need to return the image with a bounding box drawn where each object is identified. How can this be achieved?

CodePudding user response：

Use the rekognition.detect_labels method to perform object detection on the image. Each entry in the Labels list has an Instances list, and each instance carries a BoundingBox whose coordinates are expressed as ratios of the image width and height. The code below can be a good start.

import boto3
import io
from PIL import Image, ImageDraw, ImageFont

file_name = 'plant.jpg'
# Get Rekognition client
rek_client = boto3.client('rekognition')
with open(file_name, 'rb') as im:
    # Read image bytes
    im_bytes = im.read()
    # Send the image bytes to Rekognition
    response = rek_client.detect_labels(Image={'Bytes': im_bytes})
    # Open the image with Pillow so we can draw on it
    image = Image.open(io.BytesIO(im_bytes))
    # Load a font for the label text (assumes arial.ttf is available on the system)
    font = ImageFont.truetype('arial.ttf', size=80)
    draw = ImageDraw.Draw(image)
    # Image size in pixels, used to scale the relative box coordinates
    w, h = image.size
    # Go through all detected labels
    for label in response['Labels']:
        name = label['Name']
        # Draw a box for each instance, if any
        for instance in label['Instances']:
            bbox = instance['BoundingBox']
            x0 = int(bbox['Left'] * w) 
            y0 = int(bbox['Top'] * h)
            x1 = x0 + int(bbox['Width'] * w)
            y1 = y0 + int(bbox['Height'] * h)
            draw.rectangle([x0, y0, x1, y1], outline=(255, 0, 0), width=10)
            draw.text((x0, y1), name, font=font, fill=(255, 0, 0))

    image.save('labels.jpg')
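
The scaling step in the loop above can be pulled out into a small helper, which makes it easy to check without calling AWS. This is only a sketch: bbox_to_pixels is a hypothetical name (not part of boto3 or Pillow), the sample values are made up, and it uses round() instead of int(), which is an assumption about how fractional pixels should be handled.

```python
def bbox_to_pixels(bbox, width, height):
    """Convert a relative BoundingBox dict to pixel corner coordinates.

    bbox holds 'Left', 'Top', 'Width' and 'Height' as ratios of the
    image size. Returns (x0, y0, x1, y1) in pixels.
    """
    x0 = round(bbox['Left'] * width)
    y0 = round(bbox['Top'] * height)
    x1 = x0 + round(bbox['Width'] * width)
    y1 = y0 + round(bbox['Height'] * height)
    return x0, y0, x1, y1


# Made-up box on a 400x200 image, for illustration only
box = {'Left': 0.25, 'Top': 0.5, 'Width': 0.5, 'Height': 0.25}
print(bbox_to_pixels(box, 400, 200))
```

The returned tuple can be passed to draw.rectangle in place of [x0, y0, x1, y1].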

CodePudding user response：

From DetectLabels API documentation:

DetectLabels returns bounding boxes for instances of common object labels in an array of Instance objects. An Instance object contains a BoundingBox object, for the location of the label on the image. It also includes the confidence by which the bounding box was detected.

This is elaborated further in the Detecting labels documentation:

Amazon Rekognition Image and Amazon Rekognition Video can return the bounding box for common object labels such as cars, furniture, apparel or pets. Bounding box information isn't returned for less common object labels. You can use bounding boxes to find the exact locations of objects in an image, count instances of detected objects, or to measure an object's size using bounding box dimensions.

In short, bounding box information isn't returned for all labels. @Allan Chua's code will draw bounding boxes in the image only if the label is a 'common object' which has bounding box information. In the sample API response you provided, none of the labels have bounding box information.
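
To see in advance whether a response has anything to draw, you can split the labels by whether their Instances list is empty. A minimal sketch, assuming the response has the shape shown in the question: split_labels_by_instances is a hypothetical helper name, and the sample dict is hand-written to mirror the two labels from the question, not a real API result.

```python
def split_labels_by_instances(response):
    """Return (names with bounding boxes, names without) for a detect_labels response."""
    with_boxes, without_boxes = [], []
    for label in response.get('Labels', []):
        # An empty or missing Instances list means there is no box to draw
        if label.get('Instances'):
            with_boxes.append(label['Name'])
        else:
            without_boxes.append(label['Name'])
    return with_boxes, without_boxes


# Hand-written sample modeled on the labels in the question
sample = {
    'Labels': [
        {'Name': 'Plant', 'Confidence': 99.7, 'Instances': [], 'Parents': []},
        {'Name': 'Flower', 'Confidence': 98.4, 'Instances': [],
         'Parents': [{'Name': 'Plant'}]},
    ]
}

with_boxes, without_boxes = split_labels_by_instances(sample)
print('Can draw boxes for:', with_boxes)
print('No box information for:', without_boxes)
```

For this sample every label lands in the second list, which is why the drawing code leaves the image unchanged.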