Deploy it: Using Heroku to continuously build and deploy a deep-learning-powered web application.

7 min read · Jul 29, 2019


We’re not going to cover training a deep neural network in this article. If you need an intro to the DNN/ML/Fastai basics you could check out: Clean it, Train it, Deploy it, Make ML, Harder Better Stronger Faster*.

I have trained models in the past and experimented with deep learning frameworks/libraries like Keras, TensorFlow, Fastai and PyTorch. I've found that you can spend endless time gathering data, testing architectures (or, if you're getting adventurous, designing your own architecture or adapting academic papers) and training different models. But often, at the end of the process, all you have for your efforts is a number (a set of numbers if you're particularly lucky) and the vague promise that this could be useful one day.

What has prevented me in the past from moving from trained model to deployed application is a perceived learning curve. I don't have any background in web design or web technologies, and although I'm sure I could learn, I would rather spend that time working on other parts of the process. It turns out it is not actually difficult at all. In an afternoon(ish) I took the results from Part 1 and created my latest masterpiece, hand-or-foot:

Special thanks to Me for the graphic design. 1999-noir.
Another web application solving a problem nobody had.
1. We need to export our trained model plus some extra metadata; we are going to use this to run inference on the data supplied by the users of our app.
learn.export()

This will produce a file called 'export.pkl'. Keep it handy; we're going to need it later.
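For context, that export call sits at the very end of training. Here is a minimal sketch of where it might fit with fastai v1; the folder layout, architecture and epoch count are placeholders, not the exact settings from Part 1:

from fastai.vision import *  # ImageDataBunch, cnn_learner, models, get_transforms, accuracy

# Hypothetical dataset layout: data/hands-feet/{train,valid}/{hand,foot}/*.jpg
data = ImageDataBunch.from_folder('data/hands-feet', ds_tfms=get_transforms(), size=224)
learn = cnn_learner(data, models.resnet34, metrics=accuracy)
learn.fit_one_cycle(4)  # train for a few epochs
learn.export()          # writes export.pkl next to the data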

2. Code up a webpage.

I'm not going to admit to cheating; all I'll say is that my code ended up looking suspiciously like the Cougar-or-Not project (but worse), with an honourable mention to Fastai Model to Production. The minimum viable product needs to accept an HTTP POST carrying the user-supplied image and return the calculated prediction.

from io import BytesIO

import torch
from fastai.vision import *  # provides defaults, load_learner, open_image
from starlette.applications import Starlette
from starlette.responses import JSONResponse

app = Starlette()
classes = ['foot', 'hand']
defaults.device = torch.device('cpu')  # run inference on CPU only
learn = load_learner('models')  # path containing export.pkl

@app.route("/upload", methods=["POST"])
async def upload(request):
    data = await request.form()
    img_bytes = await (data["file"].read())
    return predict_image_from_bytes(img_bytes)

def predict_image_from_bytes(input_bytes):
    img = open_image(BytesIO(input_bytes))
    pred_class, pred_idx, losses = learn.predict(img)
    return JSONResponse({"prediction": str(pred_class),
                         "scores": sorted(zip(learn.data.classes, map(float, losses)),
                                          key=lambda p: p[1], reverse=True)})
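Before worrying about Heroku at all, it's worth sanity-checking the endpoint locally. A quick sketch using the requests library (the image filename is a stand-in, and port 5000 assumes the local fallback used later in server.py):

import requests

# Any local image of a hand or foot will do; 'test_hand.jpg' is hypothetical
with open('test_hand.jpg', 'rb') as f:
    resp = requests.post('http://localhost:5000/upload', files={'file': f})

print(resp.json())  # e.g. {"prediction": "hand", "scores": [...]}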

I'm not going to go through exactly what and how for this section, because my work is heavily borrowed from others; you can get the original explanation from the source. What I will do is explain how I got everything working on Heroku, since there are a few spots where I got stuck that I can navigate you through. The other choice is to go self-guided and jump straight into the code available here.

3. Get your code onto the net

I chose Heroku. Heroku describes itself as a Platform as a Service (PaaS), and if for no other reason it should be boycotted. Unfortunately it turned out to be perfect for my use case: you get a free instance with a 500 MB slug size limit, it interfaces with GitHub, and it can be configured to automatically build and deploy code when new commits are made (CI/CD goodness). Although Heroku is working perfectly well, given the time again I would probably choose one of the larger cloud providers (AWS, GCP, Azure), because knowing how to use those services seems like a more useful skill long term.

If you decide to go with Heroku their getting started guide is great and I recommend following all the steps here, then coming back to this article when you get stuck (because you will — there are a few idiosyncrasies).

4. Once you have your code working locally, I recommend pushing it to GitHub, then linking Heroku with your GitHub account.

I gave up on having multiple branches and just ended up connecting master straight into Heroku. Dev is Prod, the future is now.
Continuous Deployment — Mostly for bragging rights, but we can also easily automate time-consuming steps.

5. Finished! If everything worked. Mine didn’t. We want it to look like this:

v14: Embarrassing; please do better than this

[ISSUE] Slug size > 500 MB

Our slug must be below 500 MB; if yours is too big, you're including too much stuff in your project. Is it all necessary? It turns out the base fastai install is pretty huge, but a lot of that bulk comes from the CUDA GPU support. Heroku doesn't offer a GPU, and inference works fine on CPU anyway, so modify your 'requirements.txt' file to include the following:

https://download.pytorch.org/whl/cpu/torch-1.1.0-cp37-cp37m-linux_x86_64.whl
https://download.pytorch.org/whl/cpu/torchvision-0.3.0-cp37-cp37m-linux_x86_64.whl
fastai

Make sure you go to the PyTorch homepage, scroll down to ‘Quick Start Locally’ and select the options that suit your environment. I chose ‘stable, linux, pip, python 3.7, CUDA=none’ and PyTorch will give you links to the .whl files you should be including in your requirements file.

[ISSUE] ImportError: libcudart.so.9.0: cannot open shared object file: No such file or directory

This caused me angst.

I had my app working perfectly well locally, and it was building successfully on Heroku, but then the logs showed the application crashing at runtime. libcudart.so.9.0 is the CUDA runtime library; we should not even need it, because we are running inference on CPU only. That was the whole point of the previous section. Honestly, the answer was infuriating. Heroku uses some sort of caching between builds to save resources. When I first tried to build the app the slug size was too big, so I introduced the CPU-only requirements to stop fastai pulling in the large, unneeded libraries. It turned out there were some remnant references to the old packages floating around in the build cache (probably by design; I just didn't know it worked like this, as I assumed each build started from scratch).

If this happens to you, try creating a 'runtime.txt' file in the base of your application's working directory. Heroku was defaulting to Python 3.6.x and I wanted Python 3.7.x; this requires only a single line in the runtime file:

python-3.7.3

Changing the runtime causes the build cache to be flushed, so adding this line fixed my libcudart.so.9.0 error.

If you are tempted to think 'it says a file is missing, why don't I just add it?', that would be the wrong approach. The installation process for the CUDA toolkit is non-trivial; if you decide to fork the Heroku Python buildpack and modify it to get those dependencies into the slug, you're going to have a bad time. (More accurately, there is a significant list of things to learn, and when I find myself getting distracted like this it can be hard to pull back; if you go down every rabbit hole you'll find you make slow progress.)

CANCELED

[ISSUE] Web process failed to bind to $PORT within 60 seconds of launch

In server.py

import os
import uvicorn

if __name__ == '__main__':
    port1 = int(os.environ.get('PORT', 5000))  # Heroku sets $PORT at runtime; 5000 is a local fallback
    uvicorn.run(app, host='0.0.0.0', port=port1)

If you pick a port yourself and try to bind to it, it will fail, because Heroku allocates a port at runtime. You need to use os.environ.get() to retrieve the selected port from the $PORT environment variable. I've included a fallback port of 5000, which is used if $PORT isn't set; this makes it a lot easier to test locally.

In theory. That was the fix suggested by the internet. It didn't work for me, probably because a little knowledge is a dangerous thing. If you find it doesn't work for you either, then the issue is most likely with your Procfile (case sensitive!). The Procfile contains the command Heroku uses to start your application; by feeding the $PORT value in here I solved my port binding issues:

web: uvicorn server:app --host 0.0.0.0 --port $PORT
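For reference, by this point the repository only needs a handful of files. Roughly (the names match the ones used above; 'server.py' and the 'models' folder are just how I happened to lay mine out):

hand-or-foot/
├── Procfile          # web: uvicorn server:app --host 0.0.0.0 --port $PORT
├── requirements.txt  # the CPU-only wheel lines + fastai, plus whatever else your app imports
├── runtime.txt       # python-3.7.3
├── server.py         # the Starlette app from step 2
└── models/
    └── export.pkl    # the learner exported in step 1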

Working Application:

http://hand-or-foot.herokuapp.com/

Give it a moment to spin up; the free tier hibernates the application, so if it hasn't been used for a while it will take a little bit to respond (either that, or the whole thing is dead and will never work again, in which case you'll just have to take my word that it was working, robust and stable, with LTS).

Misc and Wrap-up

I'm keen to explore other lightweight ways of deploying ML models for inference. Please reach out and let me know if there are good methods you feel strongly about.

If you skimmed this far, I highly recommend you go away and do everything here yourself, from start to end. I spent a while reading other people's stories and approaches, but it wasn't until I jumped in and tried it for myself that I really started to consolidate my knowledge and skills (there is sometimes a gap between what we think we know and what we can demonstrate we know). No doubt I ran into problems I haven't documented, and you will too; one of the most important things you can learn/develop is persistence. Good luck!

FAQ

Q. What Next?

A. My model is still pretty rubbish; I have found examples of feet that are obviously feet that it still gets wrong. Why? I want to find out: examine my dataset more closely for bias, make sure it's clean, and make sure I'm using all the latest training tips to get the most out of the little data I have.
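One way to start that investigation is fastai v1's interpretation tools. A rough sketch, assuming 'learn' is the trained Learner from before the export, with its validation set still attached:

from fastai.vision import *

# learn: the trained Learner (not the one reloaded via load_learner)
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()               # how often hands and feet get confused
interp.plot_top_losses(9, figsize=(10, 10))  # the validation images the model is most wrong about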

Q. So more hands and feet?

A. Yeah — I think there is value in getting this model working as well as possible, then when I tackle a more complex problem I will have built up a better toolkit to hit it with.
