Real-Time Face Detection with Raspberry Pi, Kinesis Video Streams, and AWS Rekognition

Takahiro Iwasa

In this note, we will implement a real-time face detection system: a USB camera connected to a Raspberry Pi streams video to Kinesis Video Streams, and Rekognition Video analyzes the stream to detect and match faces against a face collection.

Requirements

Hardware Requirements

  • Raspberry Pi 4B (4GB RAM)
  • USB camera

Software Requirements

  • Ubuntu 23.10 (on the Raspberry Pi)
  • AWS CLI and AWS SAM CLI (on a machine used for deployment)
  • GStreamer and the Kinesis Video Streams producer plugin (built later in this note)

Building AWS Resources

AWS SAM Template

The following template defines the Lambda function that consumes detection results, the Kinesis Video Stream and Kinesis Data Stream, the Rekognition face collection and stream processor, and the IAM roles they require.

template.yaml
AWSTemplateFormatVersion: 2010-09-09
Transform: AWS::Serverless-2016-10-31
Description: face-detector-using-kinesis-video-streams

Resources:
  Function:
    Type: AWS::Serverless::Function
    Properties:
      FunctionName: face-detector-function
      CodeUri: src/
      Handler: app.lambda_handler
      Runtime: python3.11
      Architectures:
        - arm64
      Timeout: 3
      MemorySize: 128
      Role: !GetAtt FunctionIAMRole.Arn
      Events:
        KinesisEvent:
          Type: Kinesis
          Properties:
            Stream: !GetAtt KinesisStream.Arn
            MaximumBatchingWindowInSeconds: 10
            MaximumRetryAttempts: 3
            StartingPosition: LATEST

  FunctionIAMRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: face-detector-function-role
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
        - arn:aws:iam::aws:policy/service-role/AWSLambdaKinesisExecutionRole
      Policies:
        - PolicyName: policy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - kinesisvideo:GetHLSStreamingSessionURL
                  - kinesisvideo:GetDataEndpoint
                Resource: !GetAtt KinesisVideoStream.Arn

  KinesisVideoStream:
    Type: AWS::KinesisVideo::Stream
    Properties:
      Name: face-detector-kinesis-video-stream
      DataRetentionInHours: 24

  RekognitionCollection:
    Type: AWS::Rekognition::Collection
    Properties:
      CollectionId: FaceCollection

  RekognitionStreamProcessor:
    Type: AWS::Rekognition::StreamProcessor
    Properties:
      Name: face-detector-rekognition-stream-processor
      KinesisVideoStream:
        Arn: !GetAtt KinesisVideoStream.Arn
      KinesisDataStream:
        Arn: !GetAtt KinesisStream.Arn
      RoleArn: !GetAtt RekognitionStreamProcessorIAMRole.Arn
      FaceSearchSettings:
        CollectionId: !Ref RekognitionCollection
        FaceMatchThreshold: 80
      DataSharingPreference:
        OptIn: false

  KinesisStream:
    Type: AWS::Kinesis::Stream
    Properties:
      Name: face-detector-kinesis-stream
      StreamModeDetails:
        StreamMode: ON_DEMAND

  RekognitionStreamProcessorIAMRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: face-detector-rekognition-stream-processor-role
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service: rekognition.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AmazonRekognitionServiceRole
      Policies:
        - PolicyName: policy
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - kinesis:PutRecord
                  - kinesis:PutRecords
                Resource:
                  - !GetAtt KinesisStream.Arn

Lambda Function

The function below decodes stream processor events arriving on the Kinesis Data Stream and, whenever faces are detected, generates an HLS URL for playing back the corresponding segment of the video stream.

src/app.py
import base64
import json
import logging
from datetime import datetime, timedelta, timezone
from functools import cache

import boto3

JST = timezone(timedelta(hours=9))
kvs_client = boto3.client('kinesisvideo')

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)


def lambda_handler(event: dict, context: dict) -> dict:
    for record in event['Records']:
        base64_data = record['kinesis']['data']
        stream_processor_event = json.loads(base64.b64decode(base64_data).decode())

        # See https://docs.aws.amazon.com/rekognition/latest/dg/streaming-video-kinesis-output.html
        # for details on the event structure.
        if not stream_processor_event['FaceSearchResponse']:
            continue

        logger.info(stream_processor_event)

        url = get_hls_streaming_session_url(stream_processor_event)
        logger.info(url)

    return {
        'statusCode': 200,
    }


@cache
def get_kvs_am_client(api_name: str, stream_arn: str):
    # Retrieve the data endpoint for the stream.
    # See https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kinesisvideo/client/get_data_endpoint.html
    endpoint = kvs_client.get_data_endpoint(
        APIName=api_name.upper(),
        StreamARN=stream_arn
    )['DataEndpoint']
    return boto3.client('kinesis-video-archived-media', endpoint_url=endpoint)


def get_hls_streaming_session_url(stream_processor_event: dict) -> str:
    # Generate an HLS streaming URL covering one minute of video from the event timestamp.
    # See https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kinesis-video-archived-media/client/get_hls_streaming_session_url.html
    kinesis_video = stream_processor_event['InputInformation']['KinesisVideo']
    stream_arn = kinesis_video['StreamArn']
    kvs_am_client = get_kvs_am_client('get_hls_streaming_session_url', stream_arn)

    start_timestamp = datetime.fromtimestamp(kinesis_video['ServerTimestamp'], JST)
    end_timestamp = start_timestamp + timedelta(minutes=1)

    return kvs_am_client.get_hls_streaming_session_url(
        StreamARN=stream_arn,
        PlaybackMode='ON_DEMAND',
        HLSFragmentSelector={
            'FragmentSelectorType': 'SERVER_TIMESTAMP',
            'TimestampRange': {
                'StartTimestamp': start_timestamp,
                'EndTimestamp': end_timestamp,
            },
        },
        ContainerFormat='FRAGMENTED_MP4',
        Expires=300,
    )['HLSStreamingSessionURL']
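
Before deploying, you can sanity-check the handler locally with a synthetic Kinesis record. The harness below is a hypothetical sketch, not part of the original stack; the file name, the region value, and the empty event are assumptions:

local_test.py
import base64
import json
import os

# The module-level boto3 client in app.py needs a region to be created.
os.environ.setdefault('AWS_DEFAULT_REGION', 'ap-northeast-1')  # assumed region

import app  # run this from the src/ directory

# A stream processor event with an empty FaceSearchResponse; the handler
# skips it, so no AWS API call is made.
stream_processor_event = {'FaceSearchResponse': []}
payload = base64.b64encode(json.dumps(stream_processor_event).encode()).decode()
event = {'Records': [{'kinesis': {'data': payload}}]}

print(app.lambda_handler(event, None))  # expected: {'statusCode': 200}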

Deploying the Stack

Build and deploy the SAM application using the following commands. Because the template creates IAM roles with explicit names, the deployment must acknowledge the CAPABILITY_NAMED_IAM capability (for example, via sam deploy --guided or sam deploy --capabilities CAPABILITY_NAMED_IAM):

Terminal window
sam build
sam deploy

Indexing Faces

To detect faces using the USB camera, index (register) faces into a Rekognition face collection beforehand. The IndexFaces API is used for this purpose.

Replace the following with the actual values:

  • <YOUR_BUCKET>
  • <YOUR_OBJECT>
  • <PERSON_ID>
Terminal window
aws rekognition index-faces \
--image '{"S3Object": {"Bucket": "<YOUR_BUCKET>", "Name": "<YOUR_OBJECT>"}}' \
--collection-id FaceCollection \
--external-image-id <PERSON_ID>
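
The same indexing can also be done from Python with boto3. The following sketch uses the same placeholders as the CLI example:

import boto3

rekognition = boto3.client('rekognition')

response = rekognition.index_faces(
    CollectionId='FaceCollection',
    Image={'S3Object': {'Bucket': '<YOUR_BUCKET>', 'Name': '<YOUR_OBJECT>'}},
    ExternalImageId='<PERSON_ID>',
)

# Print the IDs Rekognition assigned to the indexed faces.
for face_record in response['FaceRecords']:
    print(face_record['Face']['FaceId'], face_record['Face']['ExternalImageId'])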

Rekognition does not store actual images in the face collection. Instead, it extracts and saves facial features as metadata.

https://docs.aws.amazon.com/rekognition/latest/dg/add-faces-to-collection-procedure.html

For each face detected, Amazon Rekognition extracts facial features and stores the feature information in a database. In addition, the command stores metadata for each face that’s detected in the specified face collection. Amazon Rekognition doesn’t store the actual image bytes.

Setting Up the Video Producer

This example uses the Raspberry Pi 4B with 4GB RAM running Ubuntu 23.10 as the video producer.

Raspberry Pi and USB Camera Setup

Building GStreamer Plugin

AWS provides the Amazon Kinesis Video Streams CPP Producer, GStreamer Plugin and JNI. This SDK facilitates video streaming from the Raspberry Pi to Kinesis Video Streams.

ℹ️ Note

While AWS offers a Docker image for the GStreamer plugin, the image may not work on Raspberry Pi due to architecture limitations.

Run the following commands. Depending on your system’s specifications, the build may take 20 minutes or more.

Terminal window
sudo apt update
sudo apt upgrade
sudo apt install \
make \
cmake \
build-essential \
m4 \
autoconf \
default-jdk
sudo apt install \
libssl-dev \
libcurl4-openssl-dev \
liblog4cplus-dev \
libgstreamer1.0-dev \
libgstreamer-plugins-base1.0-dev \
gstreamer1.0-plugins-base-apps \
gstreamer1.0-plugins-bad \
gstreamer1.0-plugins-good \
gstreamer1.0-plugins-ugly \
gstreamer1.0-tools
git clone https://github.com/awslabs/amazon-kinesis-video-streams-producer-sdk-cpp.git
mkdir -p amazon-kinesis-video-streams-producer-sdk-cpp/build
cd amazon-kinesis-video-streams-producer-sdk-cpp/build
sudo cmake .. -DBUILD_GSTREAMER_PLUGIN=ON -DBUILD_JNI=TRUE
sudo make

Once the build completes, verify the result with the following commands:

Terminal window
cd ~/amazon-kinesis-video-streams-producer-sdk-cpp
export GST_PLUGIN_PATH=`pwd`/build
export LD_LIBRARY_PATH=`pwd`/open-source/local/lib
gst-inspect-1.0 kvssink

The output should display details similar to this:

Factory Details:
  Rank                     primary + 10 (266)
  Long-name                KVS Sink
  Klass                    Sink/Video/Network
  Description              GStreamer AWS KVS plugin
  Author                   AWS KVS <kinesis-video-support@amazon.com>
...

To avoid resetting environment variables every time, add the following exports to your ~/.profile:

Terminal window
echo "" >> ~/.profile
echo "# GStreamer" >> ~/.profile
echo "export GST_PLUGIN_PATH=$GST_PLUGIN_PATH" >> ~/.profile
echo "export LD_LIBRARY_PATH=$LD_LIBRARY_PATH" >> ~/.profile

Running GStreamer

After building the plugin, connect your USB camera to the Raspberry Pi and run the following command to stream video data to Kinesis Video Streams. The pipeline captures raw frames from the camera (v4l2src), converts them to 320x240 I420 at 5 fps, encodes them as H.264 (x264enc), and uploads the result to the stream (kvssink).

Be sure to replace the following with the actual values:

  • <KINESIS_VIDEO_STREAM_NAME>
  • <YOUR_ACCESS_KEY>
  • <YOUR_SECRET_KEY>
  • <YOUR_AWS_REGION>
🔥 Caution

Enhancing video quality (e.g., increasing resolution or frame rate) may result in higher AWS costs.

Terminal window
gst-launch-1.0 -v v4l2src device=/dev/video0 \
! videoconvert \
! video/x-raw,format=I420,width=320,height=240,framerate=5/1 \
! x264enc bframes=0 key-int-max=45 bitrate=500 tune=zerolatency \
! video/x-h264,stream-format=avc,alignment=au \
! kvssink stream-name=<KINESIS_VIDEO_STREAM_NAME> storage-size=128 access-key="<YOUR_ACCESS_KEY>" secret-key="<YOUR_SECRET_KEY>" aws-region="<YOUR_AWS_REGION>"

You can verify the live stream by navigating to the Kinesis Video Streams management console.

Kinesis Video Streams Management Console
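
Alternatively, you can confirm ingestion from any machine with AWS credentials by listing recently received fragments. This is a sketch; the file name is hypothetical, and the stream name is taken from the SAM template:

check_ingestion.py
from datetime import datetime, timedelta, timezone

import boto3

# Stream name from the SAM template; assumes AWS credentials and region are configured.
STREAM_NAME = 'face-detector-kinesis-video-stream'

kvs = boto3.client('kinesisvideo')
endpoint = kvs.get_data_endpoint(
    StreamName=STREAM_NAME,
    APIName='LIST_FRAGMENTS',
)['DataEndpoint']
kvs_am = boto3.client('kinesis-video-archived-media', endpoint_url=endpoint)

# Count the fragments received in the last five minutes.
now = datetime.now(timezone.utc)
fragments = kvs_am.list_fragments(
    StreamName=STREAM_NAME,
    FragmentSelector={
        'FragmentSelectorType': 'SERVER_TIMESTAMP',
        'TimestampRange': {
            'StartTimestamp': now - timedelta(minutes=5),
            'EndTimestamp': now,
        },
    },
)['Fragments']
print(f'{len(fragments)} fragments ingested in the last 5 minutes')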

Testing

Starting the Rekognition Video Stream Processor

Start the Rekognition Video stream processor. This service subscribes to the Kinesis Video Stream, detects faces using the face collection, and streams the results to the Kinesis Data Stream.

Run the following command to start the stream processor:

Terminal window
aws rekognition start-stream-processor \
--name face-detector-rekognition-stream-processor

Verify the status of the stream processor to ensure it is running:

Terminal window
aws rekognition describe-stream-processor \
--name face-detector-rekognition-stream-processor | grep "Status"

The expected output should show "Status": "RUNNING".
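
The same steps can be scripted with boto3, polling until the processor leaves the STARTING state (a sketch):

import time

import boto3

rekognition = boto3.client('rekognition')
NAME = 'face-detector-rekognition-stream-processor'

rekognition.start_stream_processor(Name=NAME)

# Poll until the processor has finished starting.
while True:
    status = rekognition.describe_stream_processor(Name=NAME)['Status']
    print(status)
    if status != 'STARTING':
        break
    time.sleep(5)

The final status should be RUNNING; a FAILED status usually points to an IAM or stream configuration problem.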

Capturing Faces

Once the USB camera captures video, the Rekognition Video stream processor analyzes the video stream and detects faces based on the face collection.

To check the results, view the Lambda function logs with the following command:

Terminal window
sam logs -n Function \
--stack-name face-detector-using-kinesis-video-streams \
--tail

The log records include detailed information about the stream processor events, such as the following example:

{
  "InputInformation": {
    "KinesisVideo": {
      "StreamArn": "arn:aws:kinesisvideo:<AWS_REGION>:<AWS_ACCOUNT_ID>:stream/face-detector-kinesis-video-stream/xxxxxxxxxxxxx",
      "FragmentNumber": "91343852333181501717324262640137742175000164731",
      "ServerTimestamp": 1702208586.022,
      "ProducerTimestamp": 1702208585.699,
      "FrameOffsetInSeconds": 0.0
    }
  },
  "StreamProcessorInformation": {
    "Status": "RUNNING"
  },
  "FaceSearchResponse": [
    {
      "DetectedFace": {
        "BoundingBox": {
          "Height": 0.4744676,
          "Width": 0.29107505,
          "Left": 0.33036956,
          "Top": 0.19599175
        },
        "Confidence": 99.99677,
        "Landmarks": [
          {"X": 0.41322955, "Y": 0.33761832, "Type": "eyeLeft"},
          {"X": 0.54405355, "Y": 0.34024307, "Type": "eyeRight"},
          {"X": 0.424819, "Y": 0.5417343, "Type": "mouthLeft"},
          {"X": 0.5342691, "Y": 0.54362005, "Type": "mouthRight"},
          {"X": 0.48934412, "Y": 0.43806323, "Type": "nose"}
        ],
        "Pose": {"Pitch": 5.547308, "Roll": 0.85795176, "Yaw": 4.76913},
        "Quality": {"Brightness": 57.938313, "Sharpness": 46.0298}
      },
      "MatchedFaces": [
        {
          "Similarity": 99.986176,
          "Face": {
            "BoundingBox": {
              "Height": 0.417963,
              "Width": 0.406223,
              "Left": 0.28826,
              "Top": 0.242463
            },
            "FaceId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
            "Confidence": 99.996605,
            "ImageId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
            "ExternalImageId": "iwasa"
          }
        }
      ]
    }
  ]
}
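
To act on such an event, for example inside the Lambda handler, you can pull the matched identities out of FaceSearchResponse. The helper below is a hypothetical sketch written against the structure shown above:

def summarize_matches(stream_processor_event: dict) -> list[dict]:
    # Collect one summary per (detected face, matched face) pair.
    summaries = []
    for response in stream_processor_event['FaceSearchResponse']:
        confidence = response['DetectedFace']['Confidence']
        for match in response['MatchedFaces']:
            summaries.append({
                'person': match['Face']['ExternalImageId'],
                'similarity': match['Similarity'],
                'detection_confidence': confidence,
            })
    return summaries

# For the event above:
# [{'person': 'iwasa', 'similarity': 99.986176, 'detection_confidence': 99.99677}]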

HLS URL for Video Playback

The logs also include the generated HLS URL for on-demand video playback, such as:

https://x-xxxxxxxx.kinesisvideo.<AWS_REGION>.amazonaws.com/hls/v1/getHLSMasterPlaylist.m3u8?SessionToken=xxxxxxxxxx

Open the HLS URL using a supported browser like Safari or Edge.

ℹ️ Note

Chrome does not natively support HLS playback. You can use a third-party extension, such as Native HLS Playback.

HLS Playback Example

Cleaning Up

Stop the stream processor, then delete the stack to clean up all the AWS resources provisioned in this example:

Terminal window
aws rekognition stop-stream-processor \
--name face-detector-rekognition-stream-processor
sam delete
Takahiro Iwasa

Software Developer
Involved in the requirements definition, design, and development of cloud-native applications using AWS. Japan AWS Top Engineers 2020-2023.