Ready-to-Use Deep-Learning Models

Getting started with open source model assets from IBM

--

Have you ever wanted to classify images, recognize faces or places in images, process natural language or text, or create recommendations based on time-series data in one of your applications? With deep learning (machine learning using deep neural networks) you can do all this, and much more.

Object detection at work. Photo by Alexis Chloe on Unsplash https://unsplash.com/photos/dD75iU5UAU4

In order to apply deep learning to your data (text, images, video, audio, …), you need a pre-trained model, a runtime environment, input that has been prepared for processing, and likely some post-processing logic that translates the model output into the desired artifacts.

Let’s take a brief look at the steps you’ll generally have to complete to consume a deep learning model:

  • Obtain a trained deep learning model that suits your needs. Deep learning models tend to be (very) large and (very) complex, and some are not yet well understood. Training these models is often time- and resource-consuming; it requires lots of data, a healthy dose of ML expertise, and knowledge of frameworks such as TensorFlow, Caffe, PyTorch, or Keras.
  • Implement wrapper code that pre-processes the inputs, invokes the framework to produce the model output, and converts that output into an application-friendly format, as sketched below.
Pre-trained model + wrapper = deep-learning ready for application use
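
To make that second step concrete, here is a minimal, purely hypothetical sketch of the wrapper pattern in Python. None of these functions come from MAX; real wrappers implement the same three stages around an actual framework such as TensorFlow.

from typing import Dict, List

def preprocess(text: str) -> List[str]:
    # Convert the raw input into the form the model expects (here: tokens).
    return text.lower().split()

def run_model(tokens: List[str]) -> List[float]:
    # Stand-in for a framework call (e.g. a TensorFlow session.run);
    # returns one fake score per token purely for illustration.
    return [min(len(token) / 10.0, 1.0) for token in tokens]

def postprocess(tokens: List[str], scores: List[float]) -> Dict[str, float]:
    # Convert the raw model output into an application-friendly format.
    return dict(zip(tokens, scores))

def predict(text: str) -> Dict[str, float]:
    tokens = preprocess(text)
    return postprocess(tokens, run_model(tokens))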

To make this process simpler for developers like you and me, we’ve set up the Model Asset Exchange (MAX). The exchange provides ready-to-use deep learning models that have been well tested, are free to use, and include provenance information.

Assuming you have Docker installed on your local machine (or can deploy a Docker container to the cloud), you’ll have a basic, ready-to-use deep learning “service” running in less than five minutes.

Note: The model asset repositories also include a starter configuration file for deployment to Kubernetes.
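
If you want to try that path, the deployment boils down to a couple of kubectl commands. The configuration file name below is hypothetical; check the model asset’s repository for the actual file:

$ kubectl apply -f max-image-caption-generator.yaml
$ kubectl get pods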

Getting started

Choose the desired model from the MAX website, clone the referenced GitHub repository (it contains all you need), and build and run the Docker image.

Note: The Docker images are also published on Docker Hub.
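
If you’d rather skip the local build, you can pull a pre-built image instead. At the time of writing, the images were published under the codait namespace; verify the exact image name on Docker Hub:

$ docker run -it -p 5000:5000 codait/max-image-caption-generator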

For example, if you want to annotate images with a caption describing their content, choose the image caption model and run the following commands in a terminal window:

$ git clone https://github.com/IBM/MAX-Image-Caption-Generator.git
$ cd MAX-Image-Caption-Generator
$ docker build -t max-im2txt .
$ docker run -it -p 5000:5000 max-im2txt
...
Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)

Once the container is running, you can use the exposed REST API to explore the Swagger specification or invoke the model.

Docker containers provide all the functionality you need to explore and consume deep learning models from the Model Asset Exchange.

Open the displayed URL in your web browser (e.g. http://localhost:5000) to access the Swagger specification and review the available API endpoints. Note that the set of API endpoints varies by model; some Docker images may expose a training endpoint, for example, whereas others do not.
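
For instance, most MAX model services expose a metadata endpoint you can query from the command line (this endpoint name reflects the common MAX convention; confirm it in your model’s Swagger specification):

$ curl http://localhost:5000/model/metadata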

Note that the service hides a lot of the complexity. For starters, you don’t have to be proficient in the framework that runs the model. You also don’t have to convert your input into something the framework understands or convert the model output into an application-friendly format.

Test-drive the API

The quickest way to test-drive the service is through the generated Swagger UI. Provide the requested input (in this example, the location of an image) and send a prediction request:

Running a quick test using the Swagger UI. Note the low probability in this example; the generated image caption might not accurately reflect the image content.

Or, use your favorite command-line utility or API development environment to send the request, providing the required input:

$ curl -F "image=@t.jpg" -X POST http://localhost:5000/model/predict

If the request was processed successfully, a model-specific JSON response is returned that your application can consume as desired.
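
For the image caption model, the response is shaped roughly like the following. The values are illustrative only; consult the model’s Swagger specification for the exact schema:

{
  "status": "ok",
  "predictions": [
    {
      "index": "0",
      "caption": "a person standing on a beach next to the ocean .",
      "probability": 0.002
    }
  ]
}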

Consume the API

To leverage the service, invoke the desired REST API endpoint, providing the required input(s). For some of the models we’ve created a sample web application, such as this Python app for the image caption model. To find out whether a sample application exists for the model you are interested in, refer to the model asset’s README file in GitHub.

Annotate images with a description of their content.
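
If you’d rather call the endpoint directly from your own code, a minimal Python sketch using the requests library might look like this (t.jpg mirrors the file used in the curl example above):

import requests

# Send the image as multipart form data, mirroring the curl example.
with open('t.jpg', 'rb') as image_file:
    response = requests.post(
        'http://localhost:5000/model/predict',
        files={'image': image_file},
    )
response.raise_for_status()

# The JSON payload is model-specific; for the image caption model it
# contains the generated caption(s) and their probabilities.
print(response.json())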

A couple of parting thoughts:

  • Keep in mind that your data is unique, and models can produce unexpected results if the data they were trained on was very different from yours. One size does not fit all. You might sometimes have to train models using your own data to achieve acceptable accuracy.
  • The model assets are provided as-is. Refer to each model asset’s README for details about its origin, training data set, licensing terms, etc.
  • You are welcome to customize the Docker images to suit your needs; there are many ways they can be enhanced. Our goal is to provide a foundation that you can build on, so if you would like to restrict access using an API token or need a different output format, go ahead and hack away.
  • If running a Docker image is not a suitable option for your use case, stay tuned. As Maureen McElaney outlined in her blog post, we have started to look at other ways to make these models available in a browser near you.

Curious whether MAX already has what you need? Take a look …
