How to Count Objects Using AWS SageMaker Object Detection

Takahiro Iwasa
7 min read
Ground Truth · Object Detection · SageMaker

This example guides you through:

  • Labeling images using Ground Truth
  • Training and deploying your models
  • Performing inference
ℹ️ Note

Images in this post are illustrative and not associated with specific customer projects.

In this post, a SageMaker inference endpoint is invoked directly for testing purposes only.

Labeling with Ground Truth

Creating Labeling Workforce

To start labeling, you first need to set up a labeling workforce. In this post, a private workforce is created. Team members can authenticate using Amazon Cognito or OpenID Connect (OIDC).
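If you prefer to script this step, a private work team backed by an existing Cognito user pool can be created with boto3 roughly as follows. This is a minimal sketch; the user pool, app client, and user group are placeholders that must already exist and contain your workers.

import boto3

sagemaker_client = boto3.client('sagemaker')

# Placeholders: the Cognito user pool, app client, and user group must already
# exist and contain the workers you want to invite.
sagemaker_client.create_workteam(
    WorkteamName='<YOUR_WORKTEAM_NAME>',
    Description='Private workforce for the object counting labeling job',
    MemberDefinitions=[{
        'CognitoMemberDefinition': {
            'UserPool': '<COGNITO_USER_POOL_ID>',
            'UserGroup': '<COGNITO_USER_GROUP_NAME>',
            'ClientId': '<COGNITO_APP_CLIENT_ID>',
        },
    }],
)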

Once the workforce is created, an invitation email is sent to the workers. This email includes the URL for the labeling portal.

You can also retrieve the labeling portal URL by navigating to Private workforce summary > Labeling portal sign-in URL in the SageMaker management console.
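The portal URL can also be looked up programmatically. The snippet below is a minimal sketch with a placeholder work team name.

import boto3

sagemaker_client = boto3.client('sagemaker')

# The labeling portal host is exposed as the private work team's subdomain
workteam = sagemaker_client.describe_workteam(WorkteamName='<YOUR_WORKTEAM_NAME>')
print(f"https://{workteam['Workteam']['SubDomain']}")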

Workers must follow the instructions in the invitation email to sign up and access the labeling portal.

An example of the invitation email is as follows:

Hi,
You are invited by jane.doe@xyz.com from <COMPANY> to work on a labeling project.
Click on the link below to log into your labeling project.
"https://<LABELING_PORTAL_URL>"
You will need the following username and temporary password provided below to login for the first time.
User name: <USER_NAME>
Temporary password: <PASSWORD>
Once you log in with your temporary password, you will be required to create a new password for your account.
After creating a new password, you can log into your private team to access your labeling project.
If you have any questions, please contact us at jane.doe@xyz.com.

After accessing the URL, workers must enter the username and temporary password from the invitation email.

They will then be prompted to change their temporary password to a new one.

Upon successful login, workers are redirected to the top page of the labeling portal. Any assigned labeling jobs will appear on this page.

Creating Labeling Job

Navigate back to the SageMaker management console and start creating a new labeling job. Fill in the necessary fields as shown in the image below. Make sure to click Complete data setup to finalize the process.

💡 Tip

A labeling job cannot be deleted after it is created, so use a unique job name, such as one generated with the following command: uuidgen | tr "[:upper:]" "[:lower:]".

For complex labeling tasks, consider specifying a longer value for the Task timeout parameter.

Starting Labeling

After signing in to the labeling portal, the labeling job you just created should appear. Click the Start working button to begin.

ℹ️ Note

It may take some time for the labeling job to appear in the list.

Follow the job instructions to label the dataset. Below is an example of a labeled dataset.

Once all workers have completed their tasks, stop the labeling job.
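Stopping the job can also be done with boto3; the following is a minimal sketch with a placeholder job name.

import boto3

sagemaker_client = boto3.client('sagemaker')

# Stop the labeling job once every task has been submitted
sagemaker_client.stop_labeling_job(LabelingJobName='<YOUR_LABELING_JOB_NAME>')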

Checking Labeling Output

After the labeling job is stopped, the final output will be available in the specified S3 bucket. For object detection tasks, the file manifests/output/output.manifest is important. Refer to the official documentation for more details.

annotation-tool/
annotations/
    consolidated-annotation/
    worker-response/
manifests/
    intermediate/
    output/
        output.manifest
temp/

Ground Truth produces labeling results in Augmented Manifest format. For additional details, check the official documentation.
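For reference, each line of output.manifest is a standalone JSON object. The entry below is illustrative only (and pretty-printed here for readability); the label attribute keys (my-labeling-job and my-labeling-job-metadata), paths, class names, and values depend on your own labeling job.

{
    "source-ref": "s3://<BUCKET>/images/image-001.jpg",
    "my-labeling-job": {
        "image_size": [{"width": 1280, "height": 720, "depth": 3}],
        "annotations": [
            {"class_id": 0, "left": 45, "top": 120, "width": 160, "height": 140}
        ]
    },
    "my-labeling-job-metadata": {
        "objects": [{"confidence": 0.95}],
        "class-map": {"0": "object"},
        "type": "groundtruth/object-detection",
        "human-annotated": "yes",
        "creation-date": "2024-01-01T00:00:00.000000",
        "job-name": "labeling-job/my-labeling-job"
    }
}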

Training with SageMaker

Once the labeling process is complete, proceed to train your model using the SageMaker console. Configure the training job as follows:

  • Job settings
      • Job name: Use a unique value (e.g., uuidgen | tr "[:upper:]" "[:lower:]").
      • Algorithm source: SageMaker built-in algorithm
      • Choose an algorithm
          • Algorithm: Vision - Object Detection (MXNet)
          • Input mode: Pipe
  • Resource configuration
      • Instance type: Use a GPU instance such as ml.p2.xlarge, because the SageMaker object detection algorithm supports only GPU instances for training.
  • Hyperparameters
      • num_classes: Set to the number of object classes (e.g., 1 in this post).
      • num_training_samples: Set to the number of lines in the training manifest file.
  • Input data configuration
      • Training channel
          • Channel name: train
          • Input mode: Pipe
          • Content type: application/x-recordio
          • Record wrapper: RecordIO
          • Data source: S3 (Augmented Manifest File)
          • Attribute names: Include source-ref and the label attribute name produced by the labeling job.
          • S3 location: Specify the S3 URI of the training data manifest file.
      • Validation channel
          • Channel name: validation (configured the same way as the training channel, pointing to the validation manifest)
  • Output data configuration
      • S3 location: Specify the S3 URI for storing model artifacts.

By using the Augmented Manifest format, you can use Pipe input mode and the RecordIO record wrapper without creating separate RecordIO files. Refer to the official documentation for more details. An equivalent configuration using the SageMaker Python SDK is sketched below.
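The following is a minimal sketch of the same training job using the SageMaker Python SDK. The execution role ARN, S3 URIs, label attribute name, and hyperparameter values are placeholders that must match your own labeling output.

import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()

# Placeholders: execution role, label attribute name, and S3 URIs must be replaced
role_arn = '<YOUR_SAGEMAKER_EXECUTION_ROLE_ARN>'
label_attribute = '<YOUR_LABEL_ATTRIBUTE_NAME>'
train_manifest = 's3://<BUCKET>/<JOB_NAME>/manifests/output/output.manifest'
validation_manifest = 's3://<BUCKET>/<JOB_NAME>/manifests/output/validation.manifest'

# Built-in object detection container image for the current region
image_uri = sagemaker.image_uris.retrieve('object-detection', session.boto_region_name)

estimator = Estimator(
    image_uri=image_uri,
    role=role_arn,
    instance_count=1,
    instance_type='ml.p2.xlarge',
    input_mode='Pipe',
    sagemaker_session=session,
)

# num_training_samples must equal the number of lines in the training manifest
estimator.set_hyperparameters(num_classes=1, num_training_samples=100)


def manifest_channel(s3_uri: str) -> TrainingInput:
    # Augmented manifest channel using Pipe input mode with RecordIO wrapping
    return TrainingInput(
        s3_uri,
        s3_data_type='AugmentedManifestFile',
        attribute_names=['source-ref', label_attribute],
        content_type='application/x-recordio',
        record_wrapping='RecordIO',
        input_mode='Pipe',
    )


estimator.fit({
    'train': manifest_channel(train_manifest),
    'validation': manifest_channel(validation_manifest),
})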

Inference

Creating Model from Training Job

To create a model from the completed training job, click Create model in the SageMaker console.
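If you prefer boto3, a model can be created from the finished training job roughly as follows; the training job name, model name, and role ARN are placeholders. For the built-in object detection algorithm, the training image can also serve inference, so it is reused here.

import boto3

sagemaker_client = boto3.client('sagemaker')

# Placeholders: training job, model name, and execution role
training_job_name = '<YOUR_TRAINING_JOB_NAME>'
model_name = '<YOUR_MODEL_NAME>'
role_arn = '<YOUR_SAGEMAKER_EXECUTION_ROLE_ARN>'

# Reuse the container image and model artifacts from the finished training job
training_job = sagemaker_client.describe_training_job(TrainingJobName=training_job_name)

sagemaker_client.create_model(
    ModelName=model_name,
    ExecutionRoleArn=role_arn,
    PrimaryContainer={
        'Image': training_job['AlgorithmSpecification']['TrainingImage'],
        'ModelDataUrl': training_job['ModelArtifacts']['S3ModelArtifacts'],
    },
)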

Deploying the Model

After creating the model, deploy it by clicking Create endpoint. For cost-efficient and infrequent usage, consider using a serverless endpoint.
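A serverless endpoint can likewise be created with boto3 along these lines; the names, memory size, and concurrency are illustrative values.

import boto3

sagemaker_client = boto3.client('sagemaker')

# Placeholders and illustrative capacity values
model_name = '<YOUR_MODEL_NAME>'
endpoint_config_name = '<YOUR_ENDPOINT_CONFIG_NAME>'
endpoint_name = '<YOUR_ENDPOINT_NAME>'

# Serverless endpoint configuration (no instances to manage; billed per request)
sagemaker_client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[{
        'VariantName': 'AllTraffic',
        'ModelName': model_name,
        'ServerlessConfig': {
            'MemorySizeInMB': 4096,
            'MaxConcurrency': 5,
        },
    }],
)

sagemaker_client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name,
)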

Making Requests

Locate the SageMaker inference endpoint on the endpoint detail page. This endpoint can be accessed using tools like curl, Postman, or custom applications.

🔥 Caution

Directly using SageMaker inference endpoints for production workloads is not recommended. This example demonstrates endpoint usage for testing purposes only.

Example: Postman Configuration

Use the following parameters for authentication with AWS Signature V4:

  • AccessKey
  • SecretKey
  • Session Token: Provide this when using temporary credentials, which are recommended over long-lived access keys.
  • AWS Region: The region of your SageMaker endpoint.
  • Service Name: sagemaker

Set the Accept: application/json header in the request.

Since the trained model expects binary image input, set the Content-Type header to application/x-image and send the raw image bytes as the request body.

Example: Using AWS SDK (boto3)

You can also perform inference programmatically using the boto3 invoke_endpoint API. Below is an example script:

import json

import boto3

# Initialize SageMaker runtime client
runtime = boto3.client('sagemaker-runtime')

# Define endpoint and input details
endpoint_name = '<YOUR_ENDPOINT_NAME>'
content_type = 'application/x-image'
payload = None

# Read the image file in binary mode
with open('/path/to/image.jpg', 'rb') as f:
    payload = f.read()

# Invoke the endpoint
response = runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType=content_type,
    Body=payload
)

# Parse and display the response
body = response['Body'].read()
predictions = json.loads(body.decode())
print(json.dumps(predictions, indent=2))

# Save the response to a file
with open('./response.json', 'w') as f:
    json.dump(predictions, f, indent=2)

Checking Response

The response is returned in JSON format. Each prediction is an array of the form [class label ID, confidence score, xmin, ymin, xmax, ymax], containing:

  • Class Label IDs
  • Confidence Scores
  • Bounding Box Coordinates
ℹ️ Note

The bounding box coordinates are normalized values between 0 and 1, relative to the image dimensions; multiply them by the image width and height to obtain pixel coordinates. For more details, refer to the official documentation.

{
    "prediction": [
        [
            0.0,
            0.9953756332397461,
            0.3821756839752197,
            0.007661208510398865,
            0.525381863117218,
            0.19436971843242645
        ],
        [
            0.0,
            0.9928023219108582,
            0.3435703217983246,
            0.23781903088092804,
            0.5533013343811035,
            0.6385164260864258
        ],
        [
            0.0,
            0.9911478757858276,
            0.15510153770446777,
            ...
            0.9990172982215881
        ]
    ]
}

Visualizing Response

To interpret the results of the inference visually, you can use Jupyter Notebook along with matplotlib.

The following Python script demonstrates how to overlay bounding boxes and annotations on the input image.

import json

import matplotlib.patches as patches
import matplotlib.pyplot as plt
from PIL import Image

# Configure plot
plt.figure()
axes = plt.axes()

# Read an image
im = Image.open('/path/to/image.jpg')

# Display the image
plt.imshow(im)

# Read SageMaker inference predictions
with open('response.json') as f:
    predictions = json.loads(f.read())['prediction']

# Set initial count
count = 0

# Create rectangles
for prediction in predictions:
    score = prediction[1]
    if score < 0.2:
        continue

    # Count up
    count += 1

    x = prediction[2] * im.width
    y = prediction[3] * im.height
    width = prediction[4] * im.width - x
    height = prediction[5] * im.height - y

    rect = patches.Rectangle((x, y), width, height, linewidth=1, edgecolor='r', facecolor='none')
    axes.annotate(count, (x + width / 2, y + height / 2), color='yellow', weight='bold', fontsize=18, ha='center', va='center')
    axes.add_patch(rect)

# Display the rectangles
plt.show()

The script reads the prediction response in JSON format, extracts the bounding box coordinates, and draws rectangles around the detected objects. Each bounding box is annotated with its corresponding object count.

Takahiro Iwasa

Software Developer
Involved in the requirements definition, design, and development of cloud-native applications using AWS. Japan AWS Top Engineers 2020-2023.