Training a model is tedious, especially when the training scripts eat up all of your compute power and you can do nothing but wait. It happens to me all the time, whether I am starting the model-building process or finalizing the model with parameter tuning. Luckily, the big cloud vendors provide solutions for training and deploying your model in the cloud without touching your local capacity. AWS, Azure, and GCP all have similar offerings, and I am using AWS Sagemaker here to show a way of training your model in the cloud with your own Docker container.

Cloud training is usually a nightmare for me when it comes to configuration. Different cloud vendors have different structures for storage, instances, and APIs, which means we have to read through manuals and dev guides to make things work. I felt the same way when I started to use Sagemaker. But instead of digging through the console and trying to find a solution in the UI, I found the Sagemaker SDK pretty powerful. The typical, advertised way to use Sagemaker is through its pre-built algorithms. But unless you are just shopping for a baseline model, you will have to use your own model code. Of course, you can study the manuals and learn how to tune or modify their algorithm APIs, yet I believe there are more efficient ways.

So I created this beginner guide to show a way of utilizing Sagemaker training instances while keeping the ability to train your own code with Docker. I hope the solution helps those who need it!

Prerequisites

This exercise needs some prerequisite setup. If you have used the Sagemaker SDK before, you may skip this part. Otherwise, please follow the setup carefully.

  • AWS CLI Installation & Setup

Check this link and follow the instructions to download and install the AWS CLI for your operating system. The examples in this tutorial use macOS.

If you don’t have an account with AWS, you can sign up for one for free (note: it will require your credit card info, so be sure to read the free-tier offer in case any charges are incurred).

Log in to the console and navigate to IAM. Create a user and attach the AmazonSageMakerFullAccess policy. Also, create an access key under Security credentials and download the credential .csv file.

[Image: AWS IAM Console]

Once the AWS CLI is installed, you can configure it with the credential .csv file you just downloaded. In the terminal, type the following:

aws configure

If you are not using MFA with your user, simply fill in the info you acquired from the credential file, and you are done. Otherwise, please also add your MFA profile to the config file; one common pattern is sketched below.
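For reference, the prompts look roughly like this (all values shown are placeholders), and the second snippet is one way to wire up an MFA-protected named profile in ~/.aws/config by assuming a role — the profile name, role ARN, and MFA serial are examples only, so substitute your own:

$ aws configure
AWS Access Key ID [None]: AKIAXXXXXXXXXXXXXXXX
AWS Secret Access Key [None]: ****************************************
Default region name [None]: us-east-1
Default output format [None]: json

# ~/.aws/config
[profile sagemaker-mfa]
source_profile = default
role_arn = arn:aws:iam::123456789012:role/example-sagemaker-role
mfa_serial = arn:aws:iam::123456789012:mfa/example-user
region = us-east-1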

Check your configuration. If your CLI is set up correctly, you should see your buckets listed under your account:

aws s3 ls

  • Pipenv Setup

This step is optional since you may set up your environment in other ways. However, I will briefly walk through how to set up the Pipenv environment so you can easily follow along.

  1. Clone my repo here.
  2. Open up the project in your preferred IDE. I am using VSCode.
  3. Install pipenv with pip: pip install pipenv
  4. Initialize the pipenv environment with a Python interpreter. I am using version 3.7: pipenv --python 3.7
  5. Install the dependencies with pipenv install.

If everything is set up correctly, you should see something like this:

[Image: Pipenv setup output]

Model Building

This example uses the House Prices: Advanced Regression Techniques dataset, which contains 79 explanatory variables for predicting the final price of each home. After comparing a bunch of models, the xgboost regressor stood out, and our job is to fine-tune that regressor in the training scripts. RMSE will be the metric for model selection.
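To give a feel for what the training script is doing, here is a minimal sketch of fitting an xgboost regressor and scoring it with RMSE. The file path, the naive one-hot encoding, and the hyperparameter values are placeholders for illustration, not the exact code in the repo:

import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Load the Kaggle training data (path is a placeholder).
data = pd.read_csv("train.csv")
X = pd.get_dummies(data.drop(columns=["Id", "SalePrice"]))  # naive encoding, for illustration only
y = data["SalePrice"]

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Hyperparameter values are illustrative; tuning them is what the training job is for.
model = xgb.XGBRegressor(n_estimators=500, max_depth=4, learning_rate=0.05)
model.fit(X_train, y_train)

rmse = mean_squared_error(y_val, model.predict(X_val)) ** 0.5
print(f"Validation RMSE: {rmse:,.0f}")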

Wrap-up in Docker

If you are not familiar with Docker, check this guide by AWS. What we are going to do here is build our own Docker image and push it to AWS ECR. Elastic Container Registry is a managed AWS service for storing and managing Docker images. To use it with the Sagemaker SDK, we need to push the Docker image to ECR so that our training instance can pull the image from there.

  • Dockerfile

[Image: Dockerized Training Workflow (Image by Author)]

The diagram above illustrates how we set up the Dockerfile. It is recommended to test your Docker image locally before pushing it to ECR.
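As a rough sketch, the Dockerfile can look like the following. Sagemaker launches a custom training container with the argument train, so the image needs an executable train entry point on the PATH; the base image, the installed libraries, and the assumption that train.py starts with a #!/usr/bin/env python shebang are mine, not necessarily what the repo uses:

FROM python:3.7-slim

# Libraries needed by the training script (versions left unpinned for brevity).
RUN pip install --no-cache-dir pandas scikit-learn xgboost

# Copy the training script in as an executable called "train",
# because Sagemaker runs the container as: docker run <image> train
COPY train.py /opt/program/train
RUN chmod +x /opt/program/train

ENV PATH="/opt/program:${PATH}"
WORKDIR /opt/program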

  • Local Dockerized Training

This example uses the Sagemaker training API to test the aforementioned Docker image with a local training run.
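A minimal sketch with the Sagemaker Python SDK (v2) looks like this; setting instance_type to "local" runs the training container on your own machine via local mode (which requires Docker to be installed), and the image tag, role ARN, and data path are placeholders:

from sagemaker.estimator import Estimator

# Image tag, role ARN, and data location below are placeholders.
estimator = Estimator(
    image_uri="sagemaker-xgb-houseprice:latest",  # local image built from the Dockerfile
    role="arn:aws:iam::123456789012:role/example-sagemaker-role",
    instance_count=1,
    instance_type="local",  # "local" = run the training container on this machine
)

# file:// inputs keep the data on local disk instead of pulling from S3.
estimator.fit({"train": "file://./data/train.csv"})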

  • Push Docker to ECR

AWS ECR provides a console for setting up a Docker repository in a few clicks. Check this link if you want to use the console. Alternatively, you can create the repository and push the image from the command line with the AWS CLI, roughly as follows.
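The sequence below is a sketch; the repository name, account ID, and region are placeholders that you will need to replace with your own:

# Create the ECR repository (one-time setup).
aws ecr create-repository --repository-name sagemaker-xgb-houseprice

# Authenticate Docker against your ECR registry.
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

# Tag the local image with the repository URI and push it.
docker tag sagemaker-xgb-houseprice:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/sagemaker-xgb-houseprice:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/sagemaker-xgb-houseprice:latest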

Now our image is set and ready to use.
