Sagify: Training and Deploying ML/DL models on AWS SageMaker made simple
As a Data Scientist or Machine Learning engineer, I’m sure you have faced the following situation: you are working on a Machine Learning or Deep Learning project, and you need to train and deploy your models in production. But you have had enough of writing your own scripts to train models in the cloud (AWS, Google Cloud, etc.), or of waiting for ages while they train locally on your workstation.
This is why you started using AWS SageMaker! However, it is still not straightforward to deploy your own training/prediction code on SageMaker. You know, AWS is like IKEA! They provide you with all the required parts, screws and screwdrivers, but at the end of the day it is still your responsibility to follow the instructions correctly and connect all the bits and pieces in the right way. What if you don’t want to go through all this hassle?
For those of you who are not looking to assemble your next IKEA bookcase, but to deploy your next Machine Learning or Deep Learning project on SageMaker, Sagify comes to the rescue!
So, wouldn’t it be AWESOME to code your own training logic, as you usually do, and then call a simple command from the terminal to execute it on AWS like this:
sagify cloud train -d local-src-dir/ -i s3://my-bucket/training-data/ -o s3://my-bucket/model-output/ -e ml.m4.xlarge
There are clear benefits in using the above command:
- No need to move data to your code; the code goes to the data.
- Trained models are saved in timestamped subfolders under s3://my-bucket/model-output/.
- An easy way to specify the EC2 instance type.
- Code is packaged in a Docker image and pushed to AWS ECR (Elastic Container Registry).
- Training data from S3 are made available on the EC2 instance’s EBS storage seamlessly.
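If you ever want to trigger that command from a script rather than typing it, a minimal sketch could look like the one below. It only reuses the flags shown above (-d, -i, -o, -e); the helper name build_train_command and the s3:// prefix check are my own assumptions for illustration, not part of Sagify.

```python
import shlex


def build_train_command(source_dir, input_s3, output_s3, instance_type):
    """Assemble the `sagify cloud train` call shown above as an argv list.

    Hypothetical helper: it only arranges the flags from the example command.
    """
    # Assumed sanity check: both data locations should be S3 URIs.
    for uri in (input_s3, output_s3):
        if not uri.startswith("s3://"):
            raise ValueError(f"expected an s3:// URI, got {uri!r}")
    return [
        "sagify", "cloud", "train",
        "-d", source_dir,
        "-i", input_s3,
        "-o", output_s3,
        "-e", instance_type,
    ]


command = build_train_command(
    "local-src-dir/",
    "s3://my-bucket/training-data/",
    "s3://my-bucket/model-output/",
    "ml.m4.xlarge",
)
# Print the command as it would be typed in a terminal.
print(shlex.join(command))
```

The argv list can be handed straight to subprocess.run once Sagify is installed and your AWS credentials are configured.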
Let’s dive into an example of how to use Sagify. Please follow the Getting Started section in the docs for a complete walkthrough. Here is the gist of it:
Step 1:
Clone a Deep Learning codebase that evaluates arithmetic additions on integers of up to 3 digits:
git clone https://github.com/Kenza-AI/deep-learning-addition.git
Step 2:
Initialize Sagify by executing the following command in your terminal:
sagify init -d src
Step 3:
Call your training logic in the train(…) function in the sagify/training/train file like:
except Exception as e:
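To give a feel for the shape of this step, here is a minimal sketch of a train function that delegates to your own code and surfaces failures. Everything specific in it is an assumption on my side: the two-argument signature, the fit_addition_model placeholder (a stand-in for your real training logic) and the exit code 255. Check the file that sagify init generated for the real signature.

```python
import json
import os
import sys
import tempfile
import traceback


def fit_addition_model(input_data_path, model_save_path):
    """Hypothetical stand-in for your own training logic."""
    examples = sorted(os.listdir(input_data_path))
    # Write a dummy "model" so there is an artifact to save.
    with open(os.path.join(model_save_path, "model.json"), "w") as f:
        json.dump({"trained_on": examples}, f)


def train(input_data_path, model_save_path):
    """Assumed entry point: call your training logic and surface failures."""
    try:
        fit_addition_model(input_data_path, model_save_path)
        print("Training complete.")
    except Exception as e:
        # Print the full traceback so the failure shows up in the logs,
        # then exit with a non-zero code so the run counts as failed.
        traceback.print_exc()
        print(f"Exception during training: {e}", file=sys.stderr)
        sys.exit(255)


# Local smoke test with temporary folders instead of S3-backed paths.
with tempfile.TemporaryDirectory() as data_dir, \
        tempfile.TemporaryDirectory() as model_dir:
    open(os.path.join(data_dir, "additions.txt"), "w").close()
    train(data_dir, model_dir)
    print(os.listdir(model_dir))
```

Keeping the real work in a separate function means you can run and test it on your laptop without touching the Sagify wrapper.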
Step 4:
Build the Docker image that will contain your code:
sagify build -d src -r requirements.txt
Step 5:
Push the Docker image to AWS ECR (Elastic Container Registry):
sagify push -d src
Step 6 (Optional):
Upload the data to S3, if they are not already available there:
sagify cloud upload-data -d src -i data/processed/ -s s3://my-dl-addition/training-data
Step 7:
Finally, train your model on AWS SageMaker:
sagify cloud train -d src/ -i s3://my-dl-addition/training-data/ -o s3://my-dl-addition/output/ -e ml.m4.xlarge
I’m pretty sure that some of you have spotted something important here. The above commands can be orchestrated by tools such as Airflow to automate your training pipeline! Why is this important?
- No more manual execution of training.
- Keep track of training code, hyperparameters, trained models, etc. in storage such as S3.
- Avoid situations like “it works on my laptop”.
- Catch issues sooner rather than later.
- Increase visibility, enabling better communication.
- Spend less time debugging and more time adding features.
Do these points remind you of anything?
Well, these are some of the benefits of Continuous Integration/Continuous Delivery! At the end of the day, you want your Machine Learning and Deep Learning models to be part of a software system, so you still have to follow Software Engineering Best Practices alongside Machine Learning Best Practices!
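To make the orchestration idea a bit more concrete, here is a minimal plain-Python sketch that runs the walkthrough commands in order and stops at the first failure. It is only a stand-in for a real scheduler such as Airflow, where each step would become its own task. The names PIPELINE, run_pipeline and dry_run are hypothetical; the commands themselves are copied from the steps above.

```python
import shlex
import subprocess

# The commands from the walkthrough, in the order they must run.
PIPELINE = [
    ("build", "sagify build -d src -r requirements.txt"),
    ("push", "sagify push -d src"),
    ("upload-data",
     "sagify cloud upload-data -d src -i data/processed/ "
     "-s s3://my-dl-addition/training-data"),
    ("train",
     "sagify cloud train -d src/ -i s3://my-dl-addition/training-data/ "
     "-o s3://my-dl-addition/output/ -e ml.m4.xlarge"),
]


def run_pipeline(steps, execute=None):
    """Run each step in order and stop at the first failure.

    `execute` takes an argv list and returns an exit code. By default it
    runs the command for real through subprocess.
    """
    if execute is None:
        execute = lambda argv: subprocess.run(argv).returncode
    completed = []
    for name, command in steps:
        exit_code = execute(shlex.split(command))
        if exit_code != 0:
            raise RuntimeError(
                f"step {name!r} failed with exit code {exit_code}")
        completed.append(name)
    return completed


def dry_run(argv):
    """Print the command instead of executing it."""
    print("would run:", shlex.join(argv))
    return 0


print(run_pipeline(PIPELINE, execute=dry_run))
```

Passing the executor in as a parameter keeps the sketch testable without AWS access: the dry run only prints the commands, while calling run_pipeline(PIPELINE) would execute them for real.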
Sagify is open sourced by the team behind Kenza, one of the winners of the Product Hunt Global Hackathon. We are currently working on a Continuous Integration solution for Machine Learning on top of open source projects! Sagify is one of them, as we believe in openness, sharing and passion for Machine Learning!