
Dr. Emmanuel Tsukerman

Build Your First Deep Learning Solution with AWS SageMaker


  • May 28, 2020
  • 13 Min read


As of February 2020, Canalys reports that Amazon Web Services (AWS) is the clear cloud-computing market leader with a 32.4% share, followed by Azure at 17.6%, Google Cloud at 6%, Alibaba Cloud close behind at 5.4%, and all other clouds with a combined 38.5%. This guide is here to help you get onboarded with Deep Learning on Amazon SageMaker at lightning speed, and it will be especially useful to you if:

  • you have been entrusted to determine which cloud provider is best suited to solve your organization's needs,

  • you need to ramp up quickly on AWS SageMaker to follow a talk on the subject, or

  • you are looking to quickly get initiated into an existing AWS infrastructure.

Amazon Web Services (AWS) is the leading cloud-compute provider

This guide will cover the most important pieces you need to build a Deep Learning Solution with SageMaker. In particular, we are going to cover training a model, converting and deploying the model, and using it for predictions.

Table of Contents

  • About Amazon SageMaker

  • How To Train Your Own TensorFlow Model on SageMaker

  • How to Convert a Model Into a SageMaker-Readable Format

  • How to Deploy a Pre-Trained Model Using SageMaker

  • How to Predict Using the Model You Deployed on SageMaker

The sections in this guide are self-contained and will allow you to utilize your skills on pre-existing solutions.

About Amazon SageMaker

As organizations grow and scale up, they face the important decision of which cloud provider to migrate their machine learning (ML) ecosystem to. Naturally, one of the top candidates is AWS. Amazon SageMaker is one of the services available on AWS: a cloud ML platform that enables developers to create, train, and deploy ML models in the cloud. You can think of SageMaker as striking a carefully architected balance between the flexibility of a low-level service, such as spinning up EC2 Virtual Machines (VMs) in an ad-hoc manner, and the convenience of high-level AWS services like Rekognition and Comprehend. Some of SageMaker's advantages include:

  • Providing pre-trained ML models for deployment

  • Providing a number of built-in ML algorithms that you can train on your own data

  • Providing managed instances of TensorFlow and Apache MXNet which you can leverage for creating your own ML algorithms

How To Train Your Own TensorFlow Model on SageMaker

This section will walk you through training a traditional Keras Convolutional Neural Network (CNN) on Amazon SageMaker. The training in this section could optionally be performed outside of AWS, but it serves to illustrate how to run training on SageMaker.

Start by going to the AWS Management Console and typing Amazon SageMaker into the search box. Click the link for SageMaker:

Accessing Amazon SageMaker

You will be brought to the SageMaker dashboard:

SageMaker Dashboard

You can see there are a lot of tasks SageMaker can help you manage, such as labeling jobs and purchasing models. Our goal is to train a model from scratch, so click on Notebook instances.

SageMaker Notebook Instances Dashboard

You can see all of your SageMaker notebooks here. Now click Create notebook instance.

Creating a SageMaker Notebook Instance

Give your notebook a name, such as my-first-sagemaker-notebook. Proceed to Permissions and encryption, where you will click Create a new role in the dropdown menu.

Creating a New Role

To avoid getting bogged down in security and permissions details, for this guide select Any S3 bucket and hit Create role.

Setting the Permissions to Your Notebook Instance

Once you are finished with the configurations, hit Create notebook instance. Wait for the notebook to load, and once it's ready, click Open Jupyter and then create a new notebook using the conda_tensorflow_p36 kernel. We are finally ready to code.

For compatibility and consistency, pin the TensorFlow version; this guide uses version 1.12.

```python
!pip install tensorflow==1.12
```

Next, train a basic Keras CNN on MNIST. If you already know the details, great. If you're not familiar, don't worry: the details are tangential to this guide. At a high level, you download the MNIST data, reshape it, and then train a TensorFlow CNN.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.losses import categorical_crossentropy
from tensorflow.keras import backend as K

img_rows, img_cols = 28, 28
num_classes = 10
batch_size = 128
epochs = 1

(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Arrange the channel axis according to the backend's data format
if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

# Scale pixel values to [0, 1] and one-hot encode the labels
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=categorical_crossentropy,
              optimizer='adam',
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
```

Run this block. The computation is performed on the ml.t2.medium machine you specified when creating this notebook instance. When that training is finished, save the model.

```python
import os
!mkdir "keras_model"
save_path = "./keras_model/"

# Save the weights (the .h5 file name is arbitrary; it just has to match
# the name used when loading the model back later)
model.save_weights(os.path.join(save_path, "model-weights.h5"))

# Save the architecture as JSON
model_json = model.to_json()
with open(os.path.join(save_path, "model.json"), "w") as json_file:
    json_file.write(model_json)
```

What you're doing is creating a folder named keras_model and saving both the weights and the architecture of the model to this folder.

To summarize this relatively straightforward section, you have used SageMaker to train a Keras model and then saved it to AWS. Alternatively, you could have uploaded a pre-trained model and then saved it to AWS. Now that you have a trained model, proceed to prepare it for deployment.

How to Convert a Model Into a SageMaker-Readable Format

This section is a little trickier. We will be covering some AWS-specific requirements for deploying a TensorFlow model as well as converting our Keras model to the TensorFlow ProtoBuf format. Prepare to be challenged, but if you stick with me, you should be OK.

Start a new notebook. Again, to avoid difficult-to-debug compatibility errors, run

```python
!pip install tensorflow==1.12
```

Now, load up the model you trained in the previous section.

```python
import tensorflow as tf
from keras.models import model_from_json

json_file = open('/home/ec2-user/SageMaker/keras_model/' + 'model.json', 'r')
loaded_model_json = json_file.read()
json_file.close()

loaded_model = model_from_json(loaded_model_json,
                               custom_objects={"GlorotUniform": tf.keras.initializers.glorot_uniform})

# Restore the weights saved alongside the architecture
loaded_model.load_weights('/home/ec2-user/SageMaker/keras_model/model-weights.h5')
```

Next, convert the model into the TensorFlow ProtoBuf format.

```python
from tensorflow.python.saved_model import builder
from tensorflow.python.saved_model.signature_def_utils import predict_signature_def
from tensorflow.python.saved_model import tag_constants

model_version = '1'
export_dir = 'export/Servo/' + model_version

builder = builder.SavedModelBuilder(export_dir)

signature = predict_signature_def(
    inputs={"inputs": loaded_model.input}, outputs={"score": loaded_model.output})

from keras import backend as K
with K.get_session() as sess:
    builder.add_meta_graph_and_variables(
        sess=sess, tags=[tag_constants.SERVING], signature_def_map={"serving_default": signature})
    # Write the SavedModel (ProtoBuf plus variables) to export_dir
    builder.save()
```

The details here will take us too far away from our goal of deploying the model, so let's proceed. The last thing to do is upload the model, now in ProtoBuf form, to S3, the AWS service for cloud storage.

```python
import tarfile
with tarfile.open('model.tar.gz', mode='w:gz') as archive:
    archive.add('export', recursive=True)

import sagemaker
sagemaker_session = sagemaker.Session()
inputs = sagemaker_session.upload_data(path='model.tar.gz', key_prefix='model')
```
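Before uploading, you can sanity-check the archive: the SageMaker TensorFlow serving container expects the export/Servo/<version>/ directory from the previous step at the top level of the tarball. Below is a self-contained sketch of such a check that uses a stand-in directory tree (the empty saved_model.pb is only a placeholder) rather than the real export:

```python
import os
import tarfile
import tempfile

# Build a stand-in export/Servo/1/ tree mimicking the converter's output
tmp = tempfile.mkdtemp()
os.makedirs(os.path.join(tmp, 'export/Servo/1'))
open(os.path.join(tmp, 'export/Servo/1/saved_model.pb'), 'wb').close()

# Archive it the same way as above, then list the members to verify the layout
archive_path = os.path.join(tmp, 'model.tar.gz')
cwd = os.getcwd()
os.chdir(tmp)
try:
    with tarfile.open(archive_path, mode='w:gz') as archive:
        archive.add('export', recursive=True)
finally:
    os.chdir(cwd)

with tarfile.open(archive_path) as archive:
    names = archive.getnames()
print(names)
```

Running the same getnames() inspection on your real model.tar.gz should likewise show paths beginning with export/Servo/1/.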

To summarize this section, you started by loading a pre-trained Keras model and converted it to a TensorFlow ProtoBuf format. You then uploaded it to S3 for improved accessibility. Next, you'll deploy this model to an endpoint.

How to Deploy a Pre-Trained Model Using SageMaker

Start by retrieving your IAM role, which determines your user identity and permissions:

```python
import sagemaker
from sagemaker import get_execution_role

role = get_execution_role()
sagemaker_session = sagemaker.Session()
```

To deploy, you must provide an entry point, a requirement that is expected to be removed in a future release. The entry point can be an empty file.


Now take the model, saved on S3, and use it to instantiate a SageMaker model.

```python
from sagemaker.tensorflow.serving import Model

sagemaker_model = Model(model_data='s3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                        role=role,
                        entry_point='')
```

Finally, deploy the model.

```python
predictor = sagemaker_model.deploy(initial_instance_count=1,
                                   instance_type='ml.m4.xlarge')
```

In summary, in this section you have taken an existing tar.gz file containing the TensorFlow ProtoBuf model and deployed it on an endpoint. In the next section, we will see how to use the endpoint to make predictions.

How to Predict Using the Model You Deployed on SageMaker

Finally, the fun part: enjoying the fruit of our labor. All we have to do is get a few samples to test against the model. So load up the MNIST dataset once more, using the same code you used in training:

```python
# Rerun the training code from the first section here, i.e., everything from
#   from tensorflow.keras.models import Sequential
# down to
#   validation_data=(x_test, y_test))
```

Reshape it to the appropriate size.

```python
data = x_test[0]
data = data.reshape(1, 28, 28, 1)
```
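The reshape adds two axes: a batch dimension in front and a channel dimension at the end, since the model was trained on inputs of shape (batch, 28, 28, 1). A quick stand-alone check, with a dummy image in place of x_test[0]:

```python
import numpy as np

# Stand-in for x_test[0]: a single 28x28 grayscale image
data = np.zeros((28, 28), dtype='float32')

# The model expects (batch, height, width, channels)
data = data.reshape(1, 28, 28, 1)
print(data.shape)  # (1, 28, 28, 1)
```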

And now it is ready to be sent to the endpoint. Note the format in which the input is passed:

```python
inp = {'instances': data}
```
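To round things off, you pass inp to the predictor and read the class out of the returned scores. As a sketch (the {'predictions': ...} response shape assumed below is TensorFlow Serving's usual format, and this particular payload is made up for illustration):

```python
import numpy as np

# On the notebook instance you would call the live endpoint:
#   response = predictor.predict(inp)
# Here we use an example payload in the TensorFlow Serving response format,
# one score per digit class:
response = {'predictions': [[0.01, 0.02, 0.01, 0.05, 0.02,
                             0.01, 0.03, 0.80, 0.03, 0.02]]}

# The predicted class is the index of the largest score
predicted_class = int(np.argmax(response['predictions'][0]))
print(predicted_class)  # 7 for this example payload
```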

And that’s all!


We have trained a Deep Learning model from scratch, converted it to a SageMaker-readable format, deployed it, and used it to make predictions. Congratulations! You're well on your way to skillfully using Amazon SageMaker to accomplish your and your organization's goals.