This guide is the final part of a series covering Amazon SageMaker Studio Lab.

As we mentioned in previous posts, Amazon SageMaker Studio Lab is a standalone service that lets users experiment with building machine learning models. It has no dependencies on Amazon Web Services itself, and the environment is based on the popular and familiar JupyterLab notebooks. JupyterLab is the only commonality between Studio Lab and the SageMaker Studio available from the AWS Console. Anyone with an email address can sign up for the service.

The service is completely free. Amazon has opened up an IDE and environment for building machine learning models with no strings attached. This may be the first AWS service that lives outside of the IAM realm and comes with unlimited free tier hours.

Except for the branding, the service has almost nothing to do with SageMaker.

In previous posts, we explored the basics of SageMaker Studio Lab and SageMaker Serverless Inference. This tutorial takes the next step and shows how to publish serverless inference endpoints for TensorFlow models.


When you have a model trained within SageMaker Studio Lab or any other environment, you can host that model with Amazon SageMaker for inference at scale. If you have followed the steps to train the image classification model based on the cats vs. dogs dataset, you can extend that scenario to deploy the same model with the SageMaker Serverless Inference service.

SageMaker architecture.

Prerequisites

You need the following to complete this tutorial:

  1. AWS account
  2. Access Key and Secret Key of your AWS account
  3. SageMaker Execution Role

Follow the steps mentioned in the Amazon SageMaker documentation to create the SageMaker IAM role with the appropriate permissions required to deploy the model.

Step 1: Preparing the Environment

Amazon SageMaker Studio Lab comes with the AWS CLI, which can be used to configure the environment. For this tutorial, we will configure the credentials expected by the AWS SDK for Python (Boto3) directly from the Jupyter notebook.

Run the commands below in a new notebook based on the tf2:python kernel created in the previous tutorial.
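Here is a minimal sketch of what those cells might look like, assuming the credentials are exported as environment variables that Boto3 picks up automatically (the exact cells may differ from the earlier posts):

```python
import os

# Placeholder credentials -- replace with the access key and secret key of your AWS account
os.environ["AWS_ACCESS_KEY_ID"] = "<YOUR_ACCESS_KEY_ID>"
os.environ["AWS_SECRET_ACCESS_KEY"] = "<YOUR_SECRET_ACCESS_KEY>"
os.environ["AWS_DEFAULT_REGION"] = "eu-west-1"  # Dublin, used throughout this tutorial
```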

Don’t forget to replace the credentials with your own keys.

These commands configure the AWS environment expected by Boto3.

Let’s prepare the model by archiving it into a tarball. This will later be uploaded to an Amazon S3 bucket to register the model with SageMaker.
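As a sketch, assuming the trained model was exported as a TensorFlow SavedModel to a local directory named dogs-vs-cats (a hypothetical path from the training tutorial), the tarball can be created like this:

```python
import tarfile

# Hypothetical directory containing the exported TensorFlow SavedModel
model_dir = "dogs-vs-cats"

# Package the SavedModel into model.tar.gz for upload to Amazon S3
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add(model_dir, arcname=".")
```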

Finally, set the variables used to configure the inference endpoint.
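A sketch of these variables follows; the bucket, model, and endpoint names are hypothetical, and the role ARN and container image URI are placeholders you must replace with your own values:

```python
region = "eu-west-1"                       # Dublin; any region that supports serverless inference works
sagemaker_role_arn = "SAGEMAKER_ROLE_ARN"  # replace with the role ARN created in the prerequisites
bucket = "dogs-vs-cats-models"             # hypothetical S3 bucket name
model_name = "dogs-vs-cats"
endpoint_config_name = "dogs-vs-cats-serverless-epc"
endpoint_name = "dogs-vs-cats-serverless-ep"

# Illustrative deep learning container image with TensorFlow Serving for inference;
# take the exact URI for your region and TensorFlow version from the list of available images
image_uri = "763104351884.dkr.ecr.eu-west-1.amazonaws.com/tensorflow-inference:2.8-cpu"
```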

Don’t forget to replace SAGEMAKER_ROLE_ARN with the ARN created as part of the prerequisites. We are setting the AWS region to Dublin (eu-west-1); feel free to replace it with any of the regions that support the serverless inference feature. The last line points to the container image that SageMaker will use during the creation and registration of the model. The TensorFlow SavedModel will be mounted within this container, which already has the code for inference. If you choose a region other than eu-west-1, update the image accordingly. You can access the list of available images here.


Step 2: Creating the Amazon SageMaker Model

In this step, we will upload the model tarball to an S3 bucket and associate it with the deep learning container image for inference.
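A minimal sketch of that workflow with Boto3, reusing the variables defined in Step 1 and assuming the S3 bucket already exists:

```python
import boto3

s3 = boto3.client("s3", region_name=region)
sm = boto3.client("sagemaker", region_name=region)

# Upload the model tarball to the S3 bucket (assumed to exist already)
s3.upload_file("model.tar.gz", bucket, "dogs-vs-cats/model.tar.gz")
model_data_url = f"s3://{bucket}/dogs-vs-cats/model.tar.gz"

# Register the model with SageMaker, associating the tarball with the inference container image
sm.create_model(
    ModelName=model_name,
    ExecutionRoleArn=sagemaker_role_arn,
    PrimaryContainer={
        "Image": image_uri,
        "ModelDataUrl": model_data_url,
    },
)
```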

The last code snippet has everything SageMaker needs to create a model with the name dogs-vs-cats.

If you access the S3 bucket used by Amazon SageMaker, you will find the model tarball.

The model tarball in the S3 bucket.

If you navigate to the Models section of SageMaker in the AWS Console, you will see the registered model.

The model registered with SageMaker.

Step 3: Defining SageMaker Serverless Inference Endpoint Configuration

This is the most crucial step where we configure the endpoint for serverless inference.
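A sketch of the endpoint configuration call, reusing the variables from the previous steps:

```python
import boto3

sm = boto3.client("sagemaker", region_name=region)

# Request serverless compute for this endpoint: 2 GB of memory, up to 20 concurrent invocations
sm.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            "VariantName": "AllTraffic",
            "ModelName": model_name,
            "ServerlessConfig": {
                "MemorySizeInMB": 2048,
                "MaxConcurrency": 20,
            },
        }
    ],
)
```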

The ServerlessConfig attribute tells the SageMaker runtime to provision serverless compute resources that are autoscaled based on the specified parameters: 2GB of RAM and a maximum of 20 concurrent invocations.

When you finish executing this, you can see the new endpoint configuration in the AWS Console.

Step 4: Creating the Serverless Inference Endpoint

We are ready to create the endpoint based on the configuration defined in the previous step.
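A sketch of the endpoint creation, with an optional waiter that blocks until the endpoint is in service:

```python
import boto3

sm = boto3.client("sagemaker", region_name=region)

# Create the serverless endpoint from the configuration defined in Step 3
sm.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name,
)

# Wait until the endpoint transitions to the InService state
sm.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)
```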

This results in the final inference endpoint being ready to accept requests.


The serverless inference endpoint in the SageMaker console.

Step 5: Invoking the Serverless Inference Endpoint

Let’s go ahead and test the endpoint by sending it an image of a dog.
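Below is a sketch of one way to invoke the endpoint; the image size, pixel scaling, and the local file name dog.jpg are assumptions, so adjust the preprocessing to match how the model was trained:

```python
import json

import boto3
import numpy as np
from PIL import Image

runtime = boto3.client("sagemaker-runtime", region_name=region)

# Load and preprocess the image; the 128x128 size and 0-1 scaling are assumptions --
# use the same preprocessing that was applied during training
img = Image.open("dog.jpg").convert("RGB").resize((128, 128))
instances = [(np.array(img) / 255.0).tolist()]

# The TensorFlow Serving container accepts a JSON payload with an "instances" key
response = runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="application/json",
    Body=json.dumps({"instances": instances}),
)

print(json.loads(response["Body"].read()))
```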

You should see the endpoint classifying the image correctly.

The endpoint classifying the image correctly.

This concludes the tutorial on publishing serverless inference endpoints for TensorFlow models. Hope you found it useful.