Learn What Lies Beneath Our Ready-to-Run Deep Learning Models
A Set of Pre-Trained Models That Can Make it Easy for You to Integrate Machine Learning Into Your Web Applications.
Data science is awesome and machine learning is the wave of the future! Extracting a solution from data to solve a business problem makes you feel like a wizard. But, no one is born a data science wizard. You have all the magical tools around you and all you need is to learn how to use them. The Model Asset eXchange (MAX) is one set of tools that will help significantly in your journey to becoming a data science wizard.
So, What Is the Model Asset Exchange (MAX)?
The data science life cycle has five components: data extraction, data cleaning, data exploration, modeling, and deployment. Let’s say you want to create an application that detects cats and dogs in an image. You need to go through a time-consuming process of collecting (and classifying) images of cats and dogs, cleaning the collected data, finding a model that can perform object detection, and training it.
To make your life easier, we’ve established MAX. It’s a one-stop shop for open source deep learning models, covering common application domains such as audio, image, text, or video processing. When you pick a model from MAX, you:
- can be sure that the model code and intellectual property (IP) have been vetted and tested
- can deploy the model as a microservice in minutes
- can consume the model using a simple REST API
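To make the last point concrete, here is a minimal sketch of what calling a deployed model might look like from Python, using only the standard library. The host, port, endpoint path, and JSON payload shape are assumptions for illustration and are not taken from this post; image models would more likely expect a file upload, so check the API description of the model you deploy.

```python
import json
import urllib.request

# Assumptions for illustration only: host, port, endpoint path, and the
# payload shape are hypothetical. Check the deployed model's own API docs.
BASE_URL = "http://localhost:5000"
PREDICT_PATH = "/model/predict"


def build_predict_request(payload, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for the inference endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + PREDICT_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(payload, base_url=BASE_URL, timeout=10):
    """Send the request and decode the JSON response from the model service."""
    request = build_predict_request(payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a model container running locally, `predict({"text": ["some input"]})` would return the decoded JSON response. Building the request separately from sending it keeps the request logic easy to inspect without a running service.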
Looks fascinating, right? Now, I am going to take you through what’s happening behind the scenes.
What it Takes to Create a Model in the Model Asset Exchange (MAX)
Before getting into the intricacies, I feel that the process is easier to understand when it is diagrammed. So, here is the big picture of the process behind creating MAX assets.
Let’s walk through what it takes to publish a ready-to-use object detector model on MAX. The tasks that I’m describing in this post are not specific to any model currently on MAX.
Phase 1: Model Literature Search
This phase starts with a literature search to understand the methodology and trade-offs between various open source deep learning models for the application domain of interest. From the ocean of deep learning models, we take our pick from a set of widely trusted and cited models. There is no framework restriction: models can be implemented using any deep learning framework, such as TensorFlow, PyTorch, or Keras.
Object detection is one of the more interesting application domains for deep learning. The search results below give you a glimpse into the number of deep learning models available for object detection.
Take note, we got 406,000 results! After careful examination of the quality and performance of widely trusted models, I’ve decided to go with the SSD Mobilenet V1 object detection model for TensorFlow.
Phase 2: IP Evaluation
Shortlisted models are subjected to IP evaluation. In this phase, we evaluate licensing terms for the model code, model weights, training data set, and supporting scripts. We prefer to use subject material that is available under a permissive open-source license (e.g. Apache 2.0, MIT, BSD, or CC0). In some cases, pre-trained models may be released under permissive licenses even though the underlying data set has a more restrictive license (e.g. a model pre-trained on ImageNet).
The TensorFlow implementation of SSD Mobilenet V1 object detection is published under the Apache 2.0 license.
Phase 3: Testing and Code Cleanup
This phase is critical as we dive deep into the intricate details of the chosen model by analyzing the code. The goal here is to transform research-level code into production-ready code. Imagine the model as a straight pipeline. Now we are going to attach a head and a tail to this model pipeline, so that data can be sent and predictions can be received conveniently.
- Figure out the input format that the model accepts. For example, some models that deal with images require the input image to be a specific size (e.g. 64 * 64).
- If necessary, implement pre-processing code that transforms raw input data into a model-compatible format.
2. Model code health check
- Debug the model code and remove unwanted or unusable parts of the code.
- Approach the repository owner for additional information about the model or bug fixes.
- Determine how to test the model, if no instructions were provided.
- Test the model with our own data to get a better understanding of the model's responses.
- Provide feedback to the repository owners if there are any performance issues.
- Provide solutions to either improve the model or fix any bugs.
- Implement code that converts the model output into an application-friendly JSON format and optionally applies filtering (the tail part in the above diagram).
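The head and tail steps in the list above can be sketched roughly as follows. This is a simplified illustration, not the actual MAX code: the nearest-neighbour resize stands in for whatever pre-processing a given model needs, and the JSON field names (`label`, `probability`, `detection_box`) and the 0.5 threshold are assumptions made for this example.

```python
import json


def resize_nearest(pixels, target_height, target_width):
    """Head: resize a 2-D grid of pixel values with nearest-neighbour sampling."""
    source_height = len(pixels)
    source_width = len(pixels[0])
    return [
        [
            pixels[row * source_height // target_height][col * source_width // target_width]
            for col in range(target_width)
        ]
        for row in range(target_height)
    ]


def to_json_response(labels, scores, boxes, threshold=0.5):
    """Tail: drop low-confidence detections and serialise the rest as JSON.

    The field names and the default threshold are hypothetical.
    """
    predictions = [
        {"label": label, "probability": round(score, 4), "detection_box": list(box)}
        for label, score, box in zip(labels, scores, boxes)
        if score >= threshold
    ]
    return json.dumps({"status": "ok", "predictions": predictions})
```

For example, `resize_nearest(image, 64, 64)` would bring an arbitrary grid to the 64 * 64 size mentioned above, and `to_json_response` turns parallel lists of labels, scores, and boxes into a single JSON string an application can parse.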
Phase 4: Wrap pre-processing, model processing, and post-processing code
Our framework for ready-to-run (deployable) MAX models is written in Python. It exposes a RESTful API, which typically includes at least three endpoints: a metadata endpoint (describing the model), an inference endpoint (invoking the pre-processing code, the model execution code, and the post-processing code), and a Swagger endpoint (describing the model endpoints).
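As a rough sketch of this wrapping, the example below chains pre-processing, model execution, and post-processing behind a small routing function with three endpoints. It is not the MAX framework itself: the endpoint paths, metadata fields, `ModelWrapper` class, and `fake_model` stand-in are all hypothetical, and a real service would sit behind a web framework rather than a plain function.

```python
# Hypothetical metadata; a real model would describe itself here.
METADATA = {"id": "example-model", "name": "Example Model", "license": "Apache-2.0"}


class ModelWrapper:
    """Chains pre-processing, model execution, and post-processing."""

    def __init__(self, model_fn, threshold=0.5):
        self.model_fn = model_fn
        self.threshold = threshold

    def _pre_process(self, raw_input):
        # Placeholder head: scale 0-255 pixel values to the 0-1 range.
        return [value / 255.0 for value in raw_input]

    def _post_process(self, raw_output):
        # Placeholder tail: keep only confident (label, score) pairs.
        return [
            {"label": label, "probability": score}
            for label, score in raw_output
            if score >= self.threshold
        ]

    def predict(self, raw_input):
        return self._post_process(self.model_fn(self._pre_process(raw_input)))


def handle_request(wrapper, method, path, body=None):
    """Route a request to one of three endpoints; returns (status, payload)."""
    if method == "GET" and path == "/model/metadata":
        return 200, METADATA
    if method == "POST" and path == "/model/predict":
        return 200, {"status": "ok", "predictions": wrapper.predict(body)}
    if method == "GET" and path == "/swagger.json":
        return 200, {"paths": {"/model/metadata": {}, "/model/predict": {}}}
    return 404, {"status": "error", "message": "unknown endpoint"}


def fake_model(scaled_pixels):
    """Stand-in for real inference: score 'bright' by mean pixel intensity."""
    mean = sum(scaled_pixels) / len(scaled_pixels)
    return [("bright", mean), ("dark", 1.0 - mean)]
```

Keeping the three stages inside one `predict` method means the inference endpoint only has to hand over the raw request body and return whatever comes back.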
We package and compress model artifacts (such as the persisted model graph and weights) and store them on cloud storage.
A Dockerfile takes care of the assembly when the Docker container image is built:
- the model artifacts are downloaded, extracted, and validated
- the framework code is copied
- the prerequisite libraries are installed.
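The extract-and-validate step in the list above happens inside the Docker build, and this post does not say how validation is done. As one plausible illustration, the Python sketch below extracts a `.tar.gz` archive and compares SHA-256 checksums; the function names, archive format, and checksum approach are assumptions, not the actual build logic.

```python
import hashlib
import tarfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def extract_and_validate(archive_path, target_dir, expected_checksums):
    """Extract a .tar.gz archive, then verify each expected file's SHA-256.

    `expected_checksums` maps file names inside the archive to hex digests.
    Only use this on archives from a trusted source.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(target)
    for relative_name, expected in expected_checksums.items():
        actual = sha256_of(target / relative_name)
        if actual != expected:
            raise ValueError(f"checksum mismatch for {relative_name}")
    return sorted(expected_checksums)
```

Failing loudly on a mismatch means a corrupted or incomplete download stops the build instead of producing an image with broken weights.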
The image below gives you a glimpse into how the wrapped object detection model looks. The image on the left is the one used for testing. The image in the middle is the inference endpoint where the user can test the image. The image on the right is the JSON response of the model.
Phase 5: Review, Continuous Integration, and Version Control
Once a model has been wrapped, we enter the final phase.
First, we run the model on the test images we used in “Phase 3: Testing and Code Cleanup” to check that the same responses are being generated. The goal here is to verify that no regressions were introduced.
Next, we create a public GitHub repository, document the model metadata, references, and licensing information, and write deployment instructions for Docker and Kubernetes.
Last but not least, everything is peer-reviewed: the model code, the wrapper code, and the supporting documentation.
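A regression check of this kind could look something like the sketch below, which compares predictions from the wrapped model against a stored baseline. The prediction format (dictionaries with `label` and `probability`) and the tolerance value are assumptions for illustration.

```python
def find_regressions(expected, actual, tolerance=1e-3):
    """Return human-readable differences between two prediction lists.

    Each prediction is assumed to be a dict with "label" and "probability".
    An empty result means the wrapped model matches the baseline.
    """
    problems = []
    if len(expected) != len(actual):
        problems.append(f"expected {len(expected)} predictions, got {len(actual)}")
    for index, (want, got) in enumerate(zip(expected, actual)):
        if want["label"] != got["label"]:
            problems.append(
                f"prediction {index}: label {got['label']!r} != {want['label']!r}"
            )
        elif abs(want["probability"] - got["probability"]) > tolerance:
            problems.append(
                f"prediction {index}: probability drifted for {want['label']!r}"
            )
    return problems
```

A small tolerance on probabilities avoids false alarms from harmless floating-point differences while still catching changed labels or missing detections.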
Once approved by all reviewers, the Docker image is published on Docker Hub and the model is released on the MAX website.
What Happens After a Model Is Published
We strive to keep models current and therefore monitor the source repository for any updates. When there are updates, they are tested and added to the MAX repository.
For some models we also create sample applications to illustrate common use-cases.
Artificial Intelligence is for Everyone
AI is for everyone and we strongly believe in making it available to everyone. That’s why we handpick every single model in the Model Asset Exchange and make it available in a way that lets you create your own AI application in minutes.