How to Count Objects Using AWS SageMaker Object Detection

This example guides you through:
- Labeling images using Ground Truth
- Training and deploying your models
- Performing inference
Images in this post are illustrative and not associated with specific customer projects.
In this note, a SageMaker inference endpoint is used for testing purposes.
Labeling with Ground Truth
Creating Labeling Workforce
To start labeling, you first need to set up your labeling workforce. In this note, a private workforce is created. Team members can authenticate using Cognito or OIDC.
Once the workforce is created, an invitation email is sent to the workers. This email includes the URL for the labeling portal.
You can also retrieve the labeling portal URL by navigating to Private workforce summary > Labeling portal sign-in URL
in the SageMaker management console.
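If you prefer the AWS SDK, the same URL can be looked up programmatically with the describe_workteam API. A minimal sketch with boto3, where the workteam name is a placeholder:

```python
import boto3

sagemaker = boto3.client('sagemaker')

# Look up the private workteam; the name below is a placeholder.
workteam = sagemaker.describe_workteam(WorkteamName='<WORKTEAM_NAME>')['Workteam']

# SubDomain holds the labeling portal address for the workforce.
print(f"https://{workteam['SubDomain']}")
```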
Workers must follow the instructions in the invitation email to sign up and access the labeling portal.
An example of the invitation email is as follows:
Hi,
You are invited by jane.doe@xyz.com from <COMPANY> to work on a labeling project.
Click on the link below to log into your labeling project.
"https://<LABELING_PORTAL_URL>"
You will need the following username and temporary password provided below to login for the first time.
User name: <USER_NAME>
Temporary password: <PASSWORD>
Once you log in with your temporary password, you will be required to create a new password for your account.
After creating a new password, you can log into your private team to access your labeling project.
If you have any questions, please contact us at jane.doe@xyz.com.
After accessing the URL, workers must enter the username and temporary password from the invitation email.
They will then be prompted to change their temporary password to a new one.
Upon successful login, workers are redirected to the top page of the labeling portal. Any assigned labeling jobs will appear on this page.
Creating Labeling Job
Navigate back to the SageMaker management console and start creating a new labeling job. Fill in the necessary fields as shown in the image below. Make sure to click Complete data setup
to finalize the process.
A labeling job cannot be deleted once it has been created, so its name cannot be reused. Use a unique job name, such as one generated with the following command: `uuidgen | tr "[:upper:]" "[:lower:]"`.
For complex labeling tasks, consider specifying a longer value for the Task timeout
parameter.
Starting Labeling
After signing in to the labeling portal, the labeling job you just created should appear. Click the Start working
button to begin.
It may take some time for the labeling job to appear in the list.
Follow the job instructions to label the dataset. Below is an example of a labeled dataset.
Once all workers have completed their tasks, stop the labeling job.
Checking Labeling Output
After the labeling job is stopped, the final output will be available in the specified S3 bucket. For object detection tasks, the file manifests/output/output.manifest
is important. Refer to the official documentation for more details.
The output S3 prefix for the labeling job contains the following structure:
```
annotation-tool/
annotations/
    consolidated-annotation/
    worker-response/
manifests/
    intermediate/
    output/
        output.manifest
temp/
```
Ground Truth produces labeling results in Augmented Manifest format. For additional details, check the official documentation.
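For reference, each line of output.manifest is a standalone JSON object. For a bounding box job it looks roughly like the sketch below; the label attribute name matches your labeling job name, and all values here are illustrative placeholders:

```json
{
  "source-ref": "s3://<BUCKET>/images/image-001.jpg",
  "<LABELING_JOB_NAME>": {
    "image_size": [{"width": 1280, "height": 720, "depth": 3}],
    "annotations": [
      {"class_id": 0, "left": 100, "top": 120, "width": 240, "height": 180}
    ]
  },
  "<LABELING_JOB_NAME>-metadata": {
    "class-map": {"0": "<CLASS_NAME>"},
    "type": "groundtruth/object-detection",
    "human-annotated": "yes",
    "job-name": "labeling-job/<LABELING_JOB_NAME>"
  }
}
```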
Training with SageMaker
Once the labeling process is complete, proceed to train your model using the SageMaker console. Configure the training job as follows:
- Job settings
  - Job name: Use a unique value (e.g., `uuidgen | tr "[:upper:]" "[:lower:]"`).
  - Algorithm source: SageMaker built-in algorithm
  - Choose an algorithm:
    - Algorithm: Vision - Object Detection (MXNet)
    - Input mode: `Pipe`
- Resource configuration
  - Instance type: Use a GPU instance such as `ml.p2.xlarge`. The SageMaker object detection algorithm supports training on GPU instances only.
- Hyperparameters
  - `num_classes`: Set to the number of object classes (e.g., `1` in this post).
  - `num_training_samples`: Set to the number of lines in the output manifest file (see the snippet after this list).
- Input data configuration
  - Training channel
    - Channel name: `train`
    - Input mode: `Pipe`
    - Content type: `application/x-recordio`
    - Record wrapper: `RecordIO`
    - Data source: S3 (Augmented Manifest File)
    - Attribute names: Include `source-ref` and the label attribute key that holds the bounding box data.
    - S3 location: Specify the S3 URI of the training data manifest file.
  - Validation channel
    - Channel name: `validation`
- Output data configuration
  - S3 location: Specify the S3 URI for storing model artifacts.
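To get the value for `num_training_samples`, you can simply count the lines of the output manifest. A minimal sketch, assuming the manifest has been downloaded locally (the path is a placeholder):

```python
# Count the labeled samples in the Augmented Manifest file.
# Each non-empty line is one JSON object describing a single image.
with open('/path/to/output.manifest') as f:
    num_training_samples = sum(1 for line in f if line.strip())

print(num_training_samples)
```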
By using the Augmented Manifest format, you can use `Pipe` input mode and the `RecordIO` record wrapper without creating separate RecordIO files. Refer to the official documentation for more details.
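For reference, if you script the training job instead of using the console, the training channel above maps to an InputDataConfig entry for the create_training_job API roughly like the following sketch; the S3 URI and the label attribute name are placeholders:

```python
# Sketch of the training channel for boto3's create_training_job.
# The S3 URI and the label attribute name are placeholders.
train_channel = {
    'ChannelName': 'train',
    'InputMode': 'Pipe',
    'ContentType': 'application/x-recordio',
    'RecordWrapperType': 'RecordIO',
    'DataSource': {
        'S3DataSource': {
            'S3DataType': 'AugmentedManifestFile',
            'S3Uri': 's3://<BUCKET>/<PREFIX>/manifests/output/output.manifest',
            'S3DataDistributionType': 'FullyReplicated',
            # source-ref points to the image; the second attribute is the
            # label attribute key produced by the labeling job.
            'AttributeNames': ['source-ref', '<LABELING_JOB_NAME>'],
        }
    },
}
```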
Inference
Creating Model from Training Job
To create a model from the completed training job, click Create model
in the SageMaker console.
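The same step can also be scripted. A rough sketch using boto3's create_model, where the container image URI (region specific for the built-in object detection algorithm), the model artifact location, and the role ARN are placeholders:

```python
import boto3

sagemaker = boto3.client('sagemaker')

# Create a model from the training job's output artifacts.
# All values below are placeholders.
sagemaker.create_model(
    ModelName='<MODEL_NAME>',
    PrimaryContainer={
        # Region-specific image URI of the built-in object detection algorithm
        'Image': '<OBJECT_DETECTION_IMAGE_URI>',
        # Model artifact produced by the training job
        'ModelDataUrl': 's3://<BUCKET>/<PREFIX>/output/model.tar.gz',
    },
    ExecutionRoleArn='<SAGEMAKER_EXECUTION_ROLE_ARN>',
)
```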
Deploying the Model
After creating the model, deploy it by clicking Create endpoint
. For cost-efficient and infrequent usage, consider using a serverless endpoint.
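A serverless endpoint can also be created programmatically. A minimal sketch with boto3, where the names and capacity settings are placeholders to tune for your workload:

```python
import boto3

sagemaker = boto3.client('sagemaker')

# Endpoint configuration with serverless capacity (placeholder values).
sagemaker.create_endpoint_config(
    EndpointConfigName='<ENDPOINT_CONFIG_NAME>',
    ProductionVariants=[{
        'VariantName': 'AllTraffic',
        'ModelName': '<MODEL_NAME>',
        'ServerlessConfig': {
            'MemorySizeInMB': 2048,
            'MaxConcurrency': 5,
        },
    }],
)

# Create the endpoint from the configuration above.
sagemaker.create_endpoint(
    EndpointName='<ENDPOINT_NAME>',
    EndpointConfigName='<ENDPOINT_CONFIG_NAME>',
)
```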
Making Requests
Locate the SageMaker inference endpoint on the endpoint detail page. This endpoint can be accessed using tools like curl
, Postman, or custom applications.
Directly using SageMaker inference endpoints for production workloads is not recommended. This example demonstrates endpoint usage for testing purposes only.
Example: Postman Configuration
Use the following parameters for authentication with AWS Signature V4:
- AccessKey
- SecretKey
- Session Token: Use a temporary credential instead of a permanent one.
- AWS Region: The region of your SageMaker endpoint.
- Service Name: `sagemaker`
Set the Accept: application/json
header in the request.
Since the trained model expects binary image input, ensure your image is provided in binary format.
Example: Using AWS SDK (boto3)
You can also perform inference programmatically using the boto3 invoke_endpoint
API. Below is an example script:
```python
import json

import boto3

# Initialize SageMaker runtime client
runtime = boto3.client('sagemaker-runtime')

# Define endpoint and input details
endpoint_name = '<YOUR_ENDPOINT_NAME>'
content_type = 'application/x-image'
payload = None

# Read the image file in binary mode
with open('/path/to/image.jpg', 'rb') as f:
    payload = f.read()

# Invoke the endpoint
response = runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType=content_type,
    Body=payload
)

# Parse and display the response
body = response['Body'].read()
predictions = json.loads(body.decode())
print(json.dumps(predictions, indent=2))

# Save the response to a file
with open('./response.json', 'w') as f:
    json.dump(predictions, f, indent=2)
```
Checking Response
The response is returned in JSON format. Each detection is an array of the form [class label ID, confidence score, xmin, ymin, xmax, ymax], so the response contains:
- Class label IDs
- Confidence scores
- Bounding box coordinates
The bounding box coordinates are normalized to the image dimensions (values between 0 and 1); multiply them by the image width and height to obtain pixel positions. For more details, refer to the official documentation.
{ "prediction": [ [ 0.0, 0.9953756332397461, 0.3821756839752197, 0.007661208510398865, 0.525381863117218, 0.19436971843242645 ], [ 0.0, 0.9928023219108582, 0.3435703217983246, 0.23781903088092804, 0.5533013343811035, 0.6385164260864258 ], [ 0.0, 0.9911478757858276, 0.15510153770446777,... 0.9990172982215881 ] ]}
Visualizing Response
To interpret the results of the inference visually, you can use Jupyter Notebook along with matplotlib.
The following Python script demonstrates how to overlay bounding boxes and annotations on the input image.
```python
import json

import matplotlib.patches as patches
import matplotlib.pyplot as plt
from PIL import Image

# Configure plot
plt.figure()
axes = plt.axes()

# Read an image
im = Image.open('/path/to/image.jpg')
# Display the image
plt.imshow(im)

# Read SageMaker inference predictions
with open('response.json') as f:
    predictions = json.loads(f.read())['prediction']

# Set initial count
count = 0

# Create rectangles
for prediction in predictions:
    score = prediction[1]
    if score < 0.2:
        continue

    # Count up
    count += 1

    x = prediction[2] * im.width
    y = prediction[3] * im.height
    width = prediction[4] * im.width - x
    height = prediction[5] * im.height - y

    rect = patches.Rectangle((x, y), width, height,
                             linewidth=1, edgecolor='r', facecolor='none')
    axes.annotate(count, (x + width / 2, y + height / 2),
                  color='yellow', weight='bold',
                  fontsize=18, ha='center', va='center')
    axes.add_patch(rect)

# Display the rectangles
plt.show()
```
The script reads the prediction response in JSON format, extracts the bounding box coordinates, and draws rectangles around the detected objects. Each bounding box is annotated with its corresponding object count.