Productionising Machine Learning Models

Joris Coppieters
Published in Trade Me Blog · 6 min read · Jul 24, 2017

One thing I’ve noticed recently is the style of Machine Learning tutorials, and how more often than not they leave you with a final step somewhere along the lines of:

and now you have your finished model, and you’re done…

But you’re not really done after that. What if you want to expose this model to friends or even potential customers? How do you go about making your awesome model accessible to others, in an easy-to-follow way?

One of the main reasons most tutorials won’t go beyond this step is simple: productionisation is hard.

What is productionisation? Well, I’m not referring to the original industrial meaning, but rather to the process of taking a piece of software and making it robust enough to be used in production. In the case of software, production refers to a real-world environment where real users interact with your product or service.

The required robustness depends heavily on how you intend your model to be used: what kind of system you want it to integrate into, how many users will be using it, and so on.

So there isn’t really a generic catch-all solution. But there are some good practices!

A good start to productionising your model is treating it like a software service. It will likely have inputs, outputs and a main function that gets executed when you run your model.

Following from that, it is a good idea to wrap your service inside an API. If you don’t know what an API is or why you would want to use one, I would suggest looking that up — and then coming back here to pick up where you left off ;)

Depending on the programming language you used, there may be some existing libraries out there to easily wrap your model as an API service.
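To make that concrete, here is one way to wrap a model behind a tiny HTTP endpoint using only Python’s standard library — a minimal sketch, where `predict` is a hypothetical stand-in for your real model:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    # Hypothetical stand-in for your trained model's predict function;
    # swap in your real model here.
    return sum(features) * 0.5

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body and run it through the model.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": predict(payload["features"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request console noise

# To serve for real:
# HTTPServer(("", 8080), PredictHandler).serve_forever()
```

In practice you would likely reach for a small web framework instead, but the shape is the same: JSON in, prediction out.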

At this point you could copy the necessary files to a production server and just run the API service from there. But in the majority of cases this actually won’t be robust enough for production.

You should consider all of the aspects of a solid production environment and figure out where you want to sit on the scale with your service.

A good production environment, even a small scale one, will have an answer to all of these questions (and more):

  • How do I log the inputs and outputs of my service?
  • How do I monitor if the service is operating normally?
  • How do I guarantee my service is secure?
  • Can my service handle the anticipated capacity at all times?
  • Can I scale my service if I need to handle more capacity?
  • How will I test my service?
  • How will I deploy my service?
  • Does my service need to be up all the time? What happens if it is down?
  • How does my service operate with other systems?
  • How performant is my service and how do I measure it?
  • How reliable is my service?

So there are a bunch more considerations to take into account…

To easily deploy your service, another good practice is building it into container technology, such as Docker. Using this kind of technology you can build & test an entire environment with your service in it. This produces an image which you can run on any other machine running the same container technology, i.e. your production server.
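As a sketch of what that can look like, here is a minimal Dockerfile for a Python model API — the file names are hypothetical placeholders for whatever your project actually contains:

```dockerfile
# Hypothetical build for a Python model API service.
FROM python:3.6-slim
WORKDIR /app
COPY requirements.txt ./
RUN pip install -r requirements.txt
# model.pkl and api.py stand in for your serialised model and API code.
COPY model.pkl api.py ./
EXPOSE 8080
CMD ["python", "api.py"]
```

Building this once gives you an image you can run identically on your laptop and on your production server.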

Depending on how often your service API will get called, a single machine may not be able to handle the load and ideally your service should always respond within an acceptable time.

You may need to look into load balancing and horizontal scaling. This is where you have multiple copies of your service running (potentially on multiple machines) and a load-balancing server directs each incoming request to an appropriate copy. This is made a lot easier if you use container technology, since you can quickly spin up additional copies of your service.
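As an illustration, a round-robin setup in nginx can be as small as this — the backend addresses are hypothetical copies of the service:

```nginx
# Two identical copies of the model API behind one entry point.
upstream model_api {
    server 10.0.0.2:8080;
    server 10.0.0.3:8080;
}

server {
    listen 80;
    location / {
        # nginx round-robins requests across the upstream servers by default.
        proxy_pass http://model_api;
    }
}
```

Adding capacity then becomes a matter of starting another container and adding one line to the upstream block.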

Another option for dealing with load is vertically scaling a machine so that it has more resources to accommodate more load. This is more of a one-off action and generally more difficult, as it would likely require re-configuring the machine.

Securing your traffic is almost always important, but easy to forget or do poorly. At any scale, security should be one of your main considerations. A good start is to host all your services securely through HTTPS (and deny HTTP). But securing a service is a whole field in its own right, so I won’t go into much more detail here.

Logging your service’s inputs & outputs becomes really useful when you have no direct way of getting feedback from your users, and very important for diagnosing or debugging what your service is producing. There are multiple logging practices and tools out there, from simply writing to a local file to shipping logs off to a central aggregator. It really depends on the scale of your production environment.
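A minimal version of this is a thin wrapper around your predict call that emits one structured line per request. In this sketch `predict` is a hypothetical stand-in, and the log goes to an in-memory stream — in production you would point the handler at a file or an aggregator instead:

```python
import io
import json
import logging

def predict(features):
    # Hypothetical stand-in for your model.
    return sum(features) * 0.5

# One JSON line per call keeps entries easy to parse and aggregate later.
log_stream = io.StringIO()
logger = logging.getLogger("model_api")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(log_stream))

def predict_logged(features):
    prediction = predict(features)
    logger.info(json.dumps({"input": features, "output": prediction}))
    return prediction
```

With inputs and outputs captured like this, you can replay problem requests later instead of guessing what the model saw.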

As soon as you need to know the state of your service and how it is operating, you will need to think about monitoring, which is basically a process for constantly determining the health of your service and giving you a live status. You really don’t want to be blind if you can help it.
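Even a tiny self-reporting endpoint is better than nothing. A minimal sketch — the `predict` stand-in and the canary input are hypothetical — that reports uptime plus a canary prediction on a known input:

```python
import time

def predict(features):
    # Hypothetical stand-in for your model.
    return sum(features) * 0.5

START_TIME = time.time()

def health_check():
    """Report liveness plus a canary prediction on a known input."""
    report = {"status": "ok", "uptime_seconds": time.time() - START_TIME}
    try:
        # A canary call catches a model that loads but can no longer predict.
        canary = predict([1.0, 1.0])
        report["canary_ok"] = abs(canary - 1.0) < 1e-9
    except Exception as exc:
        report["status"] = "degraded"
        report["error"] = str(exc)
    return report
```

Expose something like this on its own route and point your monitoring tool of choice at it for a live status.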

Then you should also think about how you want to test your service. You may want to test for various input types and input edge cases. You may want to do performance testing, or even integration testing if your service is part of a system.
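To make the input edge cases concrete, here is a sketch using Python’s built-in unittest — the `predict` function and its validation rule are hypothetical:

```python
import unittest

def predict(features):
    # Hypothetical model stand-in with basic input validation.
    if not features:
        raise ValueError("empty feature vector")
    return sum(features) * 0.5

class PredictTests(unittest.TestCase):
    def test_typical_input(self):
        self.assertAlmostEqual(predict([1, 2, 3]), 3.0)

    def test_empty_input_rejected(self):
        # Edge case: the service should fail loudly, not return garbage.
        with self.assertRaises(ValueError):
            predict([])

# In a real project: run with `python -m unittest`.
```

Performance and integration tests follow the same idea, just at the HTTP or system boundary instead of the function boundary.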

How will you handle maintenance to your service, or if it goes down due to errors? You could set up redundancy plans, where another backup service kicks in if this is the case. Or you could make sure your larger system can handle down time of your service.

How will you handle a bad deploy? Or if your model suddenly consumes all of the available machine resources due to an issue? You’ll need to look into applying constraints to your environment. Again, using container technology makes this really easy.
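With Docker, for instance, resource caps are just flags on `docker run` (the image name here is hypothetical):

```shell
# Cap the container at 512 MB of RAM and one CPU core.
docker run --memory=512m --cpus=1.0 my-model-api
```

A runaway model then hits its own ceiling instead of starving everything else on the machine.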

One of my favourite tests is to build a deliberately inefficient service that takes over all the CPU, memory and hard drive resources it can.
You then release that into production and see how your system behaves. If everything else still operates normally, you’ve done well.

As the scale of your production environment grows, you will start entering the realm of micro-services and service-oriented architecture: two quite substantial styles of software development. It is at this point you probably want to fully understand these areas or talk to people who do. Maybe even have an expert on your team if you are thinking of doing this for a business.

So you can see, there isn’t really a simple way to productionise something without going through a bunch of considerations first. You’ll need to figure out what exactly is required of your production environment and what may be overkill for your particular scale.

My advice is: if you want to productionise your model, make sure you consider the best practices of a good production environment and make sure it is appropriate to the intended scale of your service. But most importantly — make sure you do your research.
