Integrating SageMaker Inference Endpoint with API Gateway REST API

This note describes how to set up the integration using API Gateway’s integration request feature, eliminating the need for AWS Lambda functions.
Finding SageMaker Inference Endpoint
Navigate to the SageMaker console and go to Endpoint summary > URL.
The endpoint format is as follows:
https://runtime.sagemaker.<ENDPOINT_REGION>.amazonaws.com/endpoints/<ENDPOINT_NAME>/invocations
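The URL pattern above can be assembled programmatically. The following is a minimal Python sketch; the function name `sagemaker_invocation_url` and the example region and endpoint name are illustrative assumptions, not part of any AWS SDK.

```python
from urllib.parse import quote


def sagemaker_invocation_url(region: str, endpoint_name: str) -> str:
    """Build the SageMaker Runtime invocation URL for an endpoint."""
    # Percent-encode the name defensively; names made of letters,
    # digits, and hyphens pass through unchanged.
    return (
        f"https://runtime.sagemaker.{region}.amazonaws.com"
        f"/endpoints/{quote(endpoint_name, safe='')}/invocations"
    )


# Hypothetical region and endpoint name, for illustration only.
print(sagemaker_invocation_url("ap-northeast-1", "my-endpoint"))
```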
Requests must include a valid Authorization header for the endpoint to work correctly. Read more in the official documentation.
Endpoints are scoped to an individual account, and are not public. The URL does not contain the account ID, but Amazon SageMaker determines the account ID from the authentication token that is supplied by the caller.
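To illustrate how the Authorization header ties a request to the caller's credentials, the sketch below derives an AWS Signature Version 4 signing key with the Python standard library. It covers only the key-derivation step; a complete signature also needs a canonical request and a string to sign, and in practice an AWS SDK handles all of this. The function names, the placeholder secret key, and the date are assumptions for illustration. The signing service name `sagemaker` is what SageMaker Runtime is understood to use.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of msg under key."""
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def derive_sigv4_signing_key(
    secret_key: str, date_stamp: str, region: str, service: str = "sagemaker"
) -> bytes:
    """Derive the SigV4 signing key (date_stamp is YYYYMMDD in UTC)."""
    # Each step keys the next HMAC with the previous digest.
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


# Placeholder secret key (not a real credential) and an arbitrary date.
key = derive_sigv4_signing_key("EXAMPLE_SECRET_KEY", "20240101", "ap-northeast-1")
print(len(key))  # SHA-256 digests are 32 bytes long
```

Because the region and service are mixed into the key, a signature produced for one region or service is not valid for another.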
Building REST API Integrated with SageMaker Inference Endpoint
Select REST API in the API Gateway Console.
Assign a name to your API.
Select Actions -> Create Method.
Choose the HTTP method type (POST is used in this example).
Configure integration request:
- Integration type: AWS Service
- AWS Service: SageMaker Runtime (NOT SageMaker)
- HTTP method: POST
- Action Type: Use path override
- Path override: endpoints/<ENDPOINT_NAME>/invocations
- Execution role: IAM role for API (must include the sagemaker:InvokeEndpoint action)
- Content Handling: Passthrough
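The execution role in the settings above needs two policy documents: a permissions policy that allows sagemaker:InvokeEndpoint, and a trust policy that lets API Gateway assume the role. The following Python sketch builds both as plain dictionaries. The function name, the placeholder account ID, and the choice to scope the permission to a single endpoint ARN are assumptions for illustration; adjust them to your own setup.

```python
import json


def build_execution_role_policies(region: str, account_id: str, endpoint_name: str):
    """Return (permissions_policy, trust_policy) for the API's execution role."""
    permissions_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sagemaker:InvokeEndpoint",
                # Scoped to one endpoint rather than "*" (an assumption).
                "Resource": f"arn:aws:sagemaker:{region}:{account_id}:endpoint/{endpoint_name}",
            }
        ],
    }
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # API Gateway must be allowed to assume the role.
                "Principal": {"Service": "apigateway.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }
    return permissions_policy, trust_policy


# Placeholder account ID and endpoint name, for illustration only.
permissions, trust = build_execution_role_policies(
    "ap-northeast-1", "123456789012", "my-endpoint"
)
print(json.dumps(permissions, indent=2))
print(json.dumps(trust, indent=2))
```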
If your model accepts binary input (e.g., images), add the relevant MIME type (e.g., image/*) to the API's Binary Media Types.
Without this configuration, you may encounter the following error:
{
  "ErrorCode": "CLIENT_ERROR_FROM_MODEL",
  "LogStreamArn": "arn:aws:logs:ap-northeast-1:xxxxxxxxxxxx:log-group:/aws/sagemaker/Endpoints/<ENDPOINT_NAME>",
  "Message": "Received client error (400) from primary with message \"unable to evaluate payload provided\". See https://ap-northeast-1.console.aws.amazon.com/cloudwatch/home?region=ap-northeast-1#logEventViewer:group=/aws/sagemaker/Endpoints/<ENDPOINT_NAME> in account xxxxxxxxxxxx for more information.",
  "OriginalMessage": "unable to evaluate payload provided",
  "OriginalStatusCode": 400
}
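A client can recognize this failure by inspecting the JSON error body. The sketch below is one possible way to do that in Python; the function name `summarize_model_error` and the abbreviated sample body are assumptions, while the field names come from the error response shown above.

```python
import json


def summarize_model_error(body: str) -> dict:
    """Extract the key fields from a SageMaker model error response body."""
    err = json.loads(body)
    return {
        "error_code": err.get("ErrorCode"),
        "status": err.get("OriginalStatusCode"),
        "message": err.get("OriginalMessage"),
        # True when the model container itself rejected the payload.
        "is_client_error": err.get("ErrorCode") == "CLIENT_ERROR_FROM_MODEL",
    }


# Abbreviated sample based on the error shown above.
sample = json.dumps(
    {
        "ErrorCode": "CLIENT_ERROR_FROM_MODEL",
        "OriginalMessage": "unable to evaluate payload provided",
        "OriginalStatusCode": 400,
    }
)
print(summarize_model_error(sample))
```

If `is_client_error` is true for a binary payload, the Binary Media Types setting is a reasonable first thing to check.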
Select Deploy API in the API Gateway console.
Assign a stage for the deployment.
Once deployed, the API endpoint will be available for testing.
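The deployed API's invoke URL combines the REST API ID, the region, the stage name, and the resource path. The following Python sketch assembles it, assuming the default execute-api endpoint (no custom domain); the function name and the example values are hypothetical.

```python
def api_gateway_invoke_url(
    rest_api_id: str, region: str, stage: str, resource_path: str = "/"
) -> str:
    """Build the default invoke URL for a deployed REST API stage."""
    base = f"https://{rest_api_id}.execute-api.{region}.amazonaws.com/{stage}"
    # Drop the leading slash so the path joins cleanly onto the stage.
    path = resource_path.lstrip("/")
    return f"{base}/{path}" if path else base


# Hypothetical API ID, stage, and resource path, for illustration only.
print(api_gateway_invoke_url("abc123", "ap-northeast-1", "dev", "/predict"))
```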
Testing
You can test the deployed API using the following curl command:
curl --location '<API_ENDPOINT>' \
  --header 'Content-Type: image/jpeg' \
  --header 'Accept: application/json' \
  --data-binary '@/path/to/image.jpg'
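The same request can be sent from Python with only the standard library. This is a rough equivalent of the curl command rather than an exact one; the function names are assumptions, and the endpoint URL and image path are placeholders you supply.

```python
import urllib.request


def build_inference_request(api_endpoint: str, image_bytes: bytes) -> urllib.request.Request:
    """Build a POST request carrying a JPEG payload, mirroring the curl headers."""
    return urllib.request.Request(
        api_endpoint,
        data=image_bytes,
        headers={"Content-Type": "image/jpeg", "Accept": "application/json"},
        method="POST",
    )


def send_inference_request(api_endpoint: str, image_path: str) -> str:
    """Read the image from disk, send it to the API, and return the response body."""
    with open(image_path, "rb") as f:
        payload = f.read()
    request = build_inference_request(api_endpoint, payload)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

For example, `send_inference_request("<API_ENDPOINT>", "/path/to/image.jpg")` would return the model's JSON response as a string, with `<API_ENDPOINT>` replaced by your deployed URL.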