Machine Learning with DigitalOcean’s New Serverless Functions
Yesterday, DigitalOcean released their new serverless functions capability for public use along with some examples from their closed beta. I am writing this article to help you get started as you try out machine learning projects on this new computing approach.
The new DigitalOcean functions work similarly to AWS Lambda and Azure Functions. They offer limited compute, storage, and RAM, but they are cheap, fast, and highly scalable.
I’ve boiled it down to 10 easy steps, and put it all into a Jupyter notebook in Google Colab, just for you at this link.
The example I prepared for this article is a regression example. Specifically, a machine learning model is trained to fit an equation that predicts the value of a target variable from a multivariate dataset. This regression approach is described in more detail in this past article.
All ten steps to get your machine learning model going are run from the command line:
- Install doctl
- Authenticate to DigitalOcean using doctl
- Install serverless support in doctl, so we can make functions
- Connect to your DigitalOcean sandbox
- Create a DigitalOcean Python function from a template
- Create a requirements file
- Fit a regression model on some data
- Create a build script
- Create the inference pipeline
- Build, deploy, and test your DigitalOcean function
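To make the "fit a regression model" and "create the inference pipeline" steps more concrete, here is a minimal sketch, not the notebook's actual code. It assumes a plain least-squares fit with numpy, a hypothetical coefficients file named `model_coefficients.npy`, and a hypothetical `features` field in the request. The `main(args)` entry point returning a dictionary with a `body` key follows the usual shape of a DigitalOcean Python function, but check the generated template for the exact signature.

```python
import numpy as np

# Hypothetical file name for the saved coefficients; not taken from the article.
MODEL_PATH = "model_coefficients.npy"


def fit_and_save(X, y, path=MODEL_PATH):
    """Fit ordinary least squares with an intercept and save the coefficients."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Append a column of ones so the last coefficient acts as the intercept.
    design = np.hstack([X, np.ones((X.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    np.save(path, coef)
    return coef


def main(args):
    """Assumed function entry point: expects args["features"], a list of numbers."""
    features = args.get("features")  # "features" is an illustrative field name
    if features is None:
        return {"statusCode": 400, "body": {"error": "missing 'features'"}}
    coef = np.load(MODEL_PATH)
    # Add the constant 1.0 that multiplies the intercept term.
    x = np.append(np.asarray(features, dtype=float), 1.0)
    if x.shape[0] != coef.shape[0]:
        return {"statusCode": 400, "body": {"error": "wrong number of features"}}
    return {"body": {"prediction": float(x @ coef)}}
```

Fitting happens once, offline, and only the small coefficients file ships with the function. That keeps the deployed package tiny, which matters given the resource limits of functions.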
What you end up with is a nice API that can be called from the command line like this:
You can also call the model from Postman or another notebook like you usually would:
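As a rough illustration of calling the deployed function from another notebook, the sketch below uses only the Python standard library. The URL is a placeholder and the `features` payload field is an assumption. Substitute the real URL of your deployed function and whatever input your own function expects.

```python
import json
import urllib.request

# Placeholder only: replace with the URL of your own deployed function.
FUNCTION_URL = "https://example.invalid/your-function-url"


def build_request(features, url=FUNCTION_URL):
    """Build a JSON POST request; the "features" field name is an assumption."""
    payload = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features, url=FUNCTION_URL, timeout=10):
    """Send the request and decode the JSON response returned by the function."""
    request = build_request(features, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `predict([1.0, 2.0], url=...)` with your function's URL should return whatever JSON body your function produces.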
Unfortunately, most machine learning models will not be appropriate for serverless deployment with functions. This is because machine learning libraries like to use GPUs (which are not available in this instance type), occupy plenty of RAM (which is limited for functions), and take a long time to install (which is not allowed for functions). I was able to get numpy and joblib working within functions, but TensorFlow, PyTorch, ONNX, XGBoost, and others failed to load into functions as dependencies. This is by design. Functions are meant to be transient and quick to pop in and out of existence. Big ML models scale very differently. For these reasons, small or tiny models are a good fit for serverless functions on DigitalOcean, but even medium-sized ML models are not.
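One quick sanity check before deploying is to look at how large the serialized model is. The sketch below is an illustration only. The byte budget is a number you choose yourself, not a documented DigitalOcean limit, and serialized size is only a rough proxy for how much RAM the model needs once loaded.

```python
import pickle


def serialized_size_bytes(model):
    """Return the size in bytes of the pickled model object."""
    return len(pickle.dumps(model, protocol=pickle.HIGHEST_PROTOCOL))


def fits_budget(model, budget_bytes):
    """Check the pickled size against a caller-chosen budget (an assumption)."""
    return serialized_size_bytes(model) <= budget_bytes


# A tiny linear model stored as plain Python data, for illustration.
tiny_model = {"coef": [2.0, -3.0], "intercept": 0.5}
print(serialized_size_bytes(tiny_model), "bytes")
```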
Running out of memory or compilation time generates a wide variety of errors such as the following gems.
Please have a look through the code and try it out in your account. Feel free to mess around, like renaming the project to your own project name, and putting in your own regression model. The notebook opens directly in Google Colab.
Special thanks to the DigitalOcean team for inviting me to the closed beta and to their Slack channel digitalocean-beta, and for responding quickly to my questions.
Until next time!