For smaller workloads, serverless platforms such as AWS Lambda can be a fast and low-cost option for deploying machine learning models. As the application grows, pieces can then be moved to dedicated servers, or to PaaS options such as Amazon SageMaker, if necessary.
Although it is instructive to first use Lambda by uploading code directly, it is best to select a deployment framework so that you can leverage additional integrations, such as API Gateway on AWS. For this example we will use TensorFlow as the machine learning library, so we will look for frameworks that can deploy Python applications.
Zappa is well known for making it easy to deploy existing Flask or Django apps. However, since we are building with serverless in mind from the start, we will select the ubiquitous and powerful Serverless Framework.
When treating infrastructure configuration as a first-class citizen it is advisable to first create a shell of the application and deploy it, and then write the actual code. This allows for rapid iterations that are close to the end-state, and avoids costly surprises down the road.
Structuring the project
For machine learning, most of the work can be categorized into three critical steps:
- Retrieving, cleaning, and uploading the input data
- Training the model and saving the results
- Inferring (i.e. predicting) a new result based on a new set of data
At its core, designing for serverless platforms means thinking about how to segment your code into individually deployable functions. In light of the categories above, we will structure our project like so:
│ ├── upload.py
│ ├── train.py
│ └── infer.py
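To make the "shell first" approach concrete, here is a minimal sketch of what one of these files might contain before any real code is written. The function name handler, the echoed payload, and the statusCode/body return shape (chosen on the assumption that the function will sit behind API Gateway) are illustrative choices, not something prescribed by the project; upload.py and train.py could start out the same way.

```python
# infer.py (hypothetical placeholder; no model is loaded yet)
import json


def handler(event, context):
    """Stub entry point for the inference step.

    Lambda passes in the triggering event and a runtime context object.
    """
    # Echo the input back so a deployment can be checked end to end
    # before any TensorFlow code exists.
    body = {"step": "infer", "received": event}
    # Assumed response shape for an API Gateway proxy integration.
    return {"statusCode": 200, "body": json.dumps(body)}
```

Calling handler({"x": 1}, None) locally is a quick way to check the stub before deploying it.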
Be sure to also create a new virtualenv:
$ pyenv virtualenv 3.6.5 tflambdademo
$ pyenv activate tflambdademo
Adding Lambda handlers
A “handler” is the term used for the function that Lambda actually invokes. It is always called with two parameters, event and context. From the docs:
event– AWS Lambda uses this parameter to pass in event data…