Inception as a Service

Roger Stark
Nov 20, 2017 · 3 min read

OK, I admit the title is a bit clichéd, but following the previous post about image recognition with Owl, I would like to briefly introduce the API that the InceptionV3 service provides to programmers.

Prerequisite

Please refer to the previous post for the installation of OCaml, Owl, etc. If you have not tried Owl yet, I recommend using the Owl Docker image. As a prerequisite, please make sure that the ImageMagick tool is installed:

sudo apt-get install imagemagick

Also, prepare an image on your computer. It can be of any common format (jpg, png, gif, etc.) and any size. If you are not sure which image to use, here is one option (the panda image from the previous post):

wget -O panda.png https://goo.gl/dnyjh7

API

infer: Performs image recognition on a client's image. It accepts a string that specifies the location of a local image. Its return value is a 1x1000 ndarray; each element is a float between 0 and 1, indicating the probability that the image belongs to the corresponding one of the 1000 classes from ImageNet.

to_json: Converts the inference result to a raw JSON string.

Parameter:

  • top: an int value to specify the top-N likeliest labels to return. Default value is 5.

to_tuples: Converts the inference result to an array of tuples; each tuple contains the label name (“class”, string) and the probability (“prop”, float, between 0 and 1) of the target image belonging to that class.

Parameter:

  • top: an int value to specify the top-N likeliest labels to return. Default value is 5 (see the sketch below).
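
For instance, assuming labels already holds the ndarray returned by infer and that top is passed as an OCaml labeled argument, requesting the ten likeliest labels would look like this:

(* Ask for the ten likeliest labels instead of the default five. *)
let json10   = InceptionV3.to_json ~top:10 labels
let tuples10 = InceptionV3.to_tuples ~top:10 labels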

Example

Here is a simple example of using this API.

The code is simple: 1) deploy the InceptionV3 service to your machine with Zoo, 2) choose an image on your machine, and 3) run the image recognition. Below is a minimal sketch of that code; the Zoo Gist id is a placeholder (the actual id can be found in the Gist linked at the end of this post), and I assume top is passed as a labeled argument:
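
(* In utop, with Owl's Zoo system loaded, deploy the InceptionV3
   service; the Gist id below is a placeholder for the id of the
   InceptionV3 Gist linked at the end of this post. *)
#zoo "inceptionv3-gist-id"

(* Step 2: choose an image on your machine. *)
let img = "panda.png"

(* Step 3: run image recognition; labels is a 1x1000 ndarray. *)
let labels = InceptionV3.infer img

(* Convert the result for easier consumption (both default to top 5). *)
let labels_json   = InceptionV3.to_json ~top:5 labels
let labels_tuples = InceptionV3.to_tuples ~top:5 labels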

To satisfy different user requirements, the service supports three kinds of return values. By default, the return value is a 1x1000 ndarray in Owl:

val labels : Owl_algodiff.S.arr =

   C0          C1          ...  C998        C999
R0 2.72124E-05 2.91834E-05 ...  3.38798E-05 5.04353E-05

By using to_json, you can get the top-N inference results as a raw JSON string:

[
  {
    "class": "giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca",
    "prop": 0.961965441704
  },
  {
    "class": "lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens",
    "prop": 0.00117377145216
  },
  {
    "class": "space shuttle",
    "prop": 0.000592212367337
  },
  {
    "class": "soccer ball",
    "prop": 0.000403530168114
  },
  {
    "class": "indri, indris, Indri indri, Indri brevicaudatus",
    "prop": 0.000263019668637
  }
]
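
If you want to consume this string on the OCaml side, here is a small sketch using the third-party yojson library (my assumption; the service itself does not require it), with labels_json taken from the example above:

(* Parse the JSON returned by to_json and print each class with
   its probability, using the yojson library. *)
let () =
  let open Yojson.Basic.Util in
  Yojson.Basic.from_string labels_json
  |> to_list
  |> List.iter (fun item ->
       let cls  = item |> member "class" |> to_string in
       let prop = item |> member "prop"  |> to_float in
       Printf.printf "%s: %f\n" cls prop)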

You can also get an array of tuples using to_tuples:

val labels : (string * float) array =
[|
("giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca", 0.961965441703796387);
("lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens", 0.0011737714521586895);
("space shuttle", 0.000592212367337197065);
("soccer ball", 0.000403530168114230037);
("indri, indris, Indri indri, Indri brevicaudatus", 0.000263019668636843562)
|]
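
This tuple form is handy for ordinary OCaml post-processing. For instance, the following sketch prints each label with its probability, much like the Python loop in the next section (labels_tuples comes from the example above):

(* Print each (class, probability) pair on its own line. *)
Array.iter
  (fun (cls, prop) -> Printf.printf "%s: %.6f\n" cls prop)
  labels_tuples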

Comparison with the Google Vision API

With this API, a user can write code quite similar to code that uses the Google Vision API:

import io
import os

# Imports the Google Cloud client library
from google.cloud import vision
from google.cloud.vision import types

# Instantiates a client
client = vision.ImageAnnotatorClient()

# The name of the image file to annotate
file_name = os.path.join(
    os.path.dirname(__file__),
    'resources/panda.jpg')

# Loads the image into memory
with io.open(file_name, 'rb') as image_file:
    content = image_file.read()

image = types.Image(content=content)

# Performs label detection on the image file
response = client.label_detection(image=image)
labels = response.label_annotations

print('Labels:')
for label in labels:
    print(label.description)

Of course, there is still a lot to be done to enhance the current API, such as support for more features and RESTful request/response (which will be introduced in future posts). The current service also relies on ImageMagick to convert the provided image to the PPM format, and the converted file is stored in the same directory as the source image, which may not be ideal in some cases. But I hope this example has already shown some of the design principles we keep in mind: expressiveness and usability for users.

The code of the InceptionV3 service is published as a Gist, and there is also a web-based demo of this image classification service powered by Owl. Please feel free to check them out!
