Industrializing AI: How to move from lab to factory #Enterprise AI

More often than not, the literature and media coverage about AI is limited to the sexy stories: the algorithms, the use cases and the millions of dollars in savings. Of course, at this stage of the AI lifecycle no one likes to talk about the dirty details, the nuts and bolts (leaky faucets and plumbing) of how to build a solution that is complete end to end.

What follows in this article are, again, notes from the field: a distillation of our learnings and experience working in this area, and of discussions with thought leaders, programmers and businesses.

This may be best described as a how-to for an “implementation first approach for an AI solution”, or a summary of what it takes to build an AI solution in the “laboratory” and deploy it in an “industrial” or “factory” setting.

Working in the “lab”

The lab is where data science happens, or at least most of it. A data scientist (the person who helps a computer identify patterns and create a machine learning model) teaches a computer to establish a relationship between having jaundice and observable symptoms such as the color of the nails, skin and so on. This kind of model can answer questions like “does this kid have jaundice, and how sure is the model (say, 93.2%)?”

Data scientists are often the ones translating academic research into the algorithms that are the heart of any AI solution. Python and R are the two major languages data scientists work with; they are good at statistics and programming. For now we can skip the debate about tools and skillsets; there is an almost endless stream of innovation in that area.

Let us take the example of building a predictive model. At a very broad level the steps are:

  • Identification of suitable historical data
  • Preprocessing: this step involves preparing the data in a format suitable for applying algorithms and identifying which data points are to be used for learning. It may include steps like normalization, dimension reduction, image processing and so on. Another important task is to segregate the data into training and test data sets. Typically 60–70% of the time gets spent on creating the right data set to use in subsequent steps.
  • Learning: involves applying a suitable technique (or multiple techniques) to the training data so that a model is created. Terms like supervised learning and unsupervised learning are used in this context.
  • Prediction: the trained model is used to generate predictions.
  • Error analysis: the model’s predictions on the test data are compared with the known outcomes to measure its accuracy. Precision/recall, over-fitting and test/cross-validation are terms usually used in this context.
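The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data is synthetic (two made-up "yellowness" scores per child), the model is a simple nearest-centroid classifier chosen for brevity, and all function names are hypothetical. A real project would most likely use an established machine learning library.

```python
import random
from statistics import mean

def make_historical_data(n, seed=0):
    """Synthetic stand-in for historical data: ([nail, skin] yellowness, label)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = int(rng.random() < 0.5)
        center = 0.7 if label else 0.3  # assumed: jaundice cases look more yellow
        features = [rng.gauss(center, 0.1) for _ in range(2)]
        rows.append((features, label))
    return rows

def train_test_split(rows, test_fraction=0.3, seed=0):
    """Preprocessing: shuffle, then segregate into training and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def train(rows):
    """Learning: a nearest-centroid model, one mean feature vector per class."""
    model = {}
    for label in (0, 1):
        members = [features for features, y in rows if y == label]
        model[label] = [mean(column) for column in zip(*members)]
    return model

def predict(model, features):
    """Prediction: return the class whose centroid is closest."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))

def evaluate(model, rows):
    """Error analysis: accuracy, precision and recall on held-out test data."""
    pairs = [(predict(model, features), y) for features, y in rows]
    tp = sum(1 for p, y in pairs if p == 1 and y == 1)
    fp = sum(1 for p, y in pairs if p == 1 and y == 0)
    fn = sum(1 for p, y in pairs if p == 0 and y == 1)
    correct = sum(1 for p, y in pairs if p == y)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

train_rows, test_rows = train_test_split(make_historical_data(200))
model = train(train_rows)
metrics = evaluate(model, test_rows)
print(metrics)
```

Note that the test rows are never seen during training; measuring accuracy on the training rows instead would hide over-fitting.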

Another view of the “lab” process is as shown below:

AI Model Development Process

The key point is that at the end of the exercise the model is ready to make predictions.

So we have a model now? What can we do with it?

There are two broad cases where this model can be used:

  1. Ad-hoc predictions: in this case the model is just an experiment or a proof of concept. It is a standalone program which will probably not go to the “factory”.
  2. Factory application: the model is to be incorporated into an existing or a new piece of software. This is the real-world use case, ready to be deployed into a business process and production.

Moving from “lab” to “factory”

Once the data science team is convinced that the model is ready, it needs to be deployed into the production environment, the “factory”. Usually, the original idea behind creating a prediction model is either to embed it into an existing software/app where it can be used to predict outcomes, or to create a new app/solution around it. This is the forte of the data engineers and application developers.

Some of the key questions from the architecture perspective are:

How is the model going to be used? Will it be used in real time in a low-latency application, or in batch mode?

Take our jaundice example: will there be an application (a mobile camera app) or a dedicated scanner which takes a picture/video of the hand and then uses the model to make a prediction? If yes, this is a real-time use case.

Alternatively, suppose data for many such kids is collected and then handed over to a program which predicts jaundice using the same model. In that case the example qualifies as a batch-mode, offline use case.
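The difference between the two modes can be sketched as two thin wrappers around the same model. Everything here is hypothetical: `predict_jaundice` is a stand-in for the trained model, and the single "yellowness" score and the 0.5 threshold are assumptions made only for illustration.

```python
from typing import Iterable, List

def predict_jaundice(yellowness: float, threshold: float = 0.5) -> dict:
    """Hypothetical stand-in for the trained model: score one sample."""
    return {"jaundice": yellowness >= threshold, "score": yellowness}

def predict_realtime(sample: float) -> dict:
    """Real-time path: one request in, one answer out, while the user waits."""
    return predict_jaundice(sample)

def predict_batch(samples: Iterable[float]) -> List[dict]:
    """Batch path: score a previously collected data set offline in one pass."""
    return [predict_jaundice(sample) for sample in samples]

# One child scanned by the mobile app versus a file of collected scans
single_result = predict_realtime(0.82)
batch_results = predict_batch([0.12, 0.55, 0.91])
```

The model is the same in both cases; what changes is the architecture around it (latency budget, how data arrives, where results are stored).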

How is the model going to be deployed?

There are multiple options here and the choice depends on your use case. The key requirement is easy embedding, so that existing applications do not need extensive changes and development teams can integrate the model easily and quickly.

Microservices (with scalability among numerous other advantages) are an easy option: imagine your models, irrespective of whether they were developed in R or Python, having the flexibility to be deployed as an API. With a microservices architecture and the model exposed as an API, application developers can integrate with it just as they would with any other web API.
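As a rough illustration of exposing a model as an API, the sketch below uses only Python’s standard library `http.server`. The `/predict` path, the JSON payload shape, port 8000 and the `predict` stand-in are all assumptions made for this example; a production service would typically sit behind a proper web framework and container setup.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    """Hypothetical stand-in for a trained model loaded at startup."""
    score = sum(features) / len(features)
    return {"jaundice": score >= 0.5, "confidence": round(score, 3)}

def handle_predict(body: bytes):
    """Request logic kept separate from HTTP plumbing: JSON in, (status, JSON) out."""
    try:
        payload = json.loads(body)
        features = [float(x) for x in payload["features"]]
        if not features:
            raise ValueError("empty feature list")
    except (ValueError, KeyError, TypeError) as exc:
        return 400, json.dumps({"error": str(exc)}).encode()
    return 200, json.dumps(predict(features)).encode()

class ModelHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, response = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)

def serve(port: int = 8000):
    """Start the service; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), ModelHandler).serve_forever()
```

Calling `serve()` starts the service, after which any application can POST a body such as `{"features": [0.9, 0.7]}` to `/predict`, regardless of the language the model itself was built in.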

One solution fits all? Not really. It is also not uncommon for developers to take the implementation in R or Python and re-implement it on another framework of their choice (e.g. Apache Spark). The driver for this approach is that the rewrite creates an agile and scalable deployment. The downside is that every time the data scientist iterates on or re-creates the underlying model, the deployment process becomes longer.

For us, the microservices model with RESTful containers has been working very well and has allowed us to shorten deployment cycles: agile, quick, easy and scalable, with a clear segregation of responsibilities and roles.

How to improve the AI solution?

Each query and each decision suggested by the model needs to be captured and fed back. This makes it possible to score the performance of the model and leaves room for the AI solution to evolve. It is a critical process without which the true value of the solution may not be realized.

Additionally, as more and more data becomes available (think of this as new training data), the model needs to go through the learning process again. This continuous and regular re-training allows the model to stay relevant and grow.
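A minimal sketch of such a feedback loop is shown below, assuming an in-memory log; the class and method names are hypothetical, and a real deployment would persist the records in a database or event stream.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class PredictionRecord:
    request_id: str
    features: list
    predicted: int
    actual: Optional[int] = None  # filled in once the true outcome is known

@dataclass
class FeedbackLog:
    records: dict = field(default_factory=dict)

    def log_prediction(self, request_id, features, predicted):
        """Capture every query and the decision the model suggested."""
        self.records[request_id] = PredictionRecord(request_id, features, predicted)

    def log_outcome(self, request_id, actual):
        """Feed back the real outcome when it becomes available."""
        self.records[request_id].actual = actual

    def labeled(self):
        return [r for r in self.records.values() if r.actual is not None]

    def accuracy(self):
        """Score the model on the predictions whose outcome is known."""
        rows = self.labeled()
        if not rows:
            return None
        return sum(r.predicted == r.actual for r in rows) / len(rows)

    def new_training_rows(self):
        """Labeled production data that can be added to the next training run."""
        return [(r.features, r.actual) for r in self.labeled()]

log = FeedbackLog()
log.log_prediction("a1", [0.9, 0.8], predicted=1)
log.log_prediction("a2", [0.4, 0.6], predicted=1)
log.log_prediction("a3", [0.1, 0.2], predicted=0)
log.log_outcome("a1", actual=1)
log.log_outcome("a2", actual=0)
```

Here the third prediction has no confirmed outcome yet, so it is excluded from both the score and the new training rows.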

Is there something else?

Of course, yes. From an approach perspective, the following are important pointers to guide your deployment:

  1. How will you measure performance? Establish performance requirements: key indicators which will help you track things like prediction accuracy, latency, baselines from the lab and so on. This ties in with the improvement approach suggested above.
  2. What if you want to go back to a previous version? Handle coefficients as properties: keep them outside the code, serialized and version controlled.
  3. How will you test the model before deployment? Have an automated and integrated validation strategy: think of a validation suite that ensures your model is working fine. This is age-old wisdom: anything which is not tested will fail. Needless to say, this must be automated and part of your deployment process. Keep adding new cases as unhandled cases emerge in production.
  4. How will you back-test and forward-test? Let us assume for a second that we are building an algorithmic trading platform based on AI. Backtesting lets users apply the platform to historical data to see how the system would have performed. Forward testing is the next phase of evaluation and provides users with an additional set of out-of-sample data on which to test the system. In trading terms this is sometimes called paper trading: forward testing is a simulation of actual trading. The system’s logic is applied to a live market, but all trades are executed on paper only; trade entries and exits are recorded, but no real trades are initiated. This kind of testing requires data to be maintained smartly so that backtesting and forward testing can be performed.
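Points 2 and 3 can be illustrated together. The sketch below assumes a logistic-regression style model whose coefficients are stored as serialized JSON keyed by version; the `MODEL_STORE` dictionary stands in for files kept under version control, and the coefficient values and validation cases are invented for the example.

```python
import json
import math

# Hypothetical stand-in for coefficient files kept under version control
MODEL_STORE = {
    "1.0.0": json.dumps({"version": "1.0.0", "intercept": -1.0, "weights": [1.0, 1.0]}),
    "2.0.0": json.dumps({"version": "2.0.0", "intercept": -4.0, "weights": [5.0, 3.0]}),
}

def load_model(version: str) -> dict:
    """Deserialize coefficients; rolling back is just loading an older version."""
    return json.loads(MODEL_STORE[version])

def predict_proba(params: dict, features) -> float:
    """Logistic model: the code stays fixed, only the coefficients change."""
    z = params["intercept"] + sum(w * x for w, x in zip(params["weights"], features))
    return 1.0 / (1.0 + math.exp(-z))

# Validation suite: (features, expected label); grows as new cases emerge in production
VALIDATION_CASES = [
    ([0.9, 0.8], 1),
    ([0.1, 0.2], 0),
    ([0.0, 0.0], 0),
]

def validate(params: dict, cases, threshold: float = 0.5):
    """Return the failing cases; an empty list means the model may be deployed."""
    failures = []
    for features, expected in cases:
        got = int(predict_proba(params, features) >= threshold)
        if got != expected:
            failures.append((features, expected, got))
    return failures

candidate = load_model("2.0.0")
failures = validate(candidate, VALIDATION_CASES)
print("deploy" if not failures else f"blocked: {failures}")
```

Run as a step of the deployment pipeline, a non-empty `failures` list would block the release, and `load_model("1.0.0")` gives a path back to the previous version without touching the code.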

The following picture shows what an “industrialized” or “factory” ready solution looks like:

How often should you re-train and enhance your model?

I am afraid I don’t have a definitive answer. The good news is that I can suggest an approach. Keep an eye on the performance of the model and start with a fixed period (once or twice a month); usually the initial days need more cycles, and the frequency drops gradually as the model settles down.

Create a joint team of your data scientists and application developers and review at a higher frequency at first (maybe daily, or at least once a week), then gradually reduce it if all is well. Unless you see a dramatic decline in performance, stick to the plan. A human in the loop also plays an important role here, and the model should not be expected to do everything by itself from day 1.
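One way to make this review routine concrete is sketched below. The thresholds (a 5-point drop from the lab baseline, a three-review window, a 30-day cap) and both function names are assumptions for illustration, not recommendations; the right values depend on your use case.

```python
from statistics import mean

def should_retrain(accuracy_history, baseline, max_drop=0.05, window=3):
    """Flag a retrain when recent average accuracy falls well below the lab baseline."""
    if len(accuracy_history) < window:
        return False  # not enough reviews yet to judge
    recent = mean(accuracy_history[-window:])
    return (baseline - recent) > max_drop

def next_review_interval_days(current_days, stable, max_days=30):
    """Crawl-walk-run: review often at first, then back off while all is well."""
    if not stable:
        return 1  # performance declined: go back to daily reviews
    return min(current_days * 2, max_days)

history = [0.93, 0.92, 0.80, 0.78, 0.79]
retrain_now = should_retrain(history, baseline=0.93)
interval = next_review_interval_days(7, stable=not retrain_now)
```

The output of such a check is an input to the human review, not a replacement for it.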

To summarize, crawl-walk-run may be the best words to describe the approach. Do a regular review and ensure that model re-training and enhancements happen periodically.


Running an end-to-end AI-enabled solution requires an understanding of more than just data science or artificial intelligence. The architects must be comfortable with Big Data and, to some extent, with enterprise applications. This end-to-end perspective is a must to get the real advantage of AI.

Additionally, the laboratory and the factory are two different environments; yet with the approach suggested above it is possible to move between them efficiently (in terms of both scale and agility). With the right kind of architecture this kind of orchestration is possible, provided it is not an “afterthought”. We at Crisp Analytics call this the “implementation first strategy”.

Let us know your thoughts.

References and further readings:

A Framework for Machine Learning Competitions