Object Detection with MXNet Scala Inference API

Qing Lan
Published in Apache MXNet
3 min read · Jun 7, 2018

Author: Qing Lan

MXNet 1.2.0 introduced the new MXNet Scala Inference API, which focuses on optimizing the developer experience for inference applications written in Scala. Scala is a general-purpose programming language that supports functional programming and has a strong static type system, and it is used for large-scale distributed processing on platforms such as Apache Spark.

We recommend you check out the other posts in our MXNet Scala Inference API series if you haven’t already seen them.

In this last post, we work through an example of using the MXNet Scala Inference API for object detection, which identifies objects and their locations in an image. The example uses a Single-Shot Detector (SSD) model. You can learn more about object detection here.

As in the image classification example in our previous post, you need to define the following paths. You can download our sample model using the script here.

val modelPathPrefix = "/absolute/path/to/your/model"
val inputImagePath = "/absolute/path/to/your/image"

Then set up the model's input description and the execution context.

val dType = DType.Float32
val inputShape = Shape(1, 3, 224, 224)
// Named inputDescriptors to match the ObjectDetector constructor call below
val inputDescriptors = IndexedSeq(DataDesc("data", inputShape, dType, "NCHW"))
val topK = Some(3) // Number of results
val context = Context.cpu()
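
For readers new to the "NCHW" layout string: it orders the four dimensions as batch size, channels, height, and width, so Shape(1, 3, 224, 224) describes one three-channel 224×224 image. The sketch below is plain Scala with no MXNet dependency. NchwShape is a hypothetical helper written for this illustration, not part of the MXNet API, and it assumes a row-major flat buffer. It shows how many values such an input holds and where a given element lands in that buffer.

```scala
// Hypothetical helper for illustration only; not part of the MXNet API.
object NchwLayout {
  final case class NchwShape(batch: Int, channels: Int, height: Int, width: Int) {
    // Total number of elements the input holds
    def size: Int = batch * channels * height * width

    // Flat row-major index of element (n, c, h, w)
    def offset(n: Int, c: Int, h: Int, w: Int): Int =
      ((n * channels + c) * height + h) * width + w
  }

  // Same dimensions as Shape(1, 3, 224, 224) in the setup above
  val input: NchwShape = NchwShape(1, 3, 224, 224)
}
```

Because the channel dimension comes before height and width, all values of one color channel sit next to each other in memory before the next channel starts.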

After this initial setup, load the image with ImageClassifier. You don't need to pre-process it yourself; that is handled for you.

val img = ImageClassifier.loadImageFromFile(inputImagePath)

Then you initialize a new ObjectDetector using the model location, the input descriptors, and the execution context.

val objectDetector = new ObjectDetector(modelPathPrefix, inputDescriptors, context)

After this step, you can use ObjectDetector for inference on the input image. With topK set to 3, you will get three results.

val output = objectDetector.imageObjectDetect(img, topK)
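
The result is nested: one sequence per input image, each holding that image's detections. The sketch below assumes each detection arrives as a (String, Array[Float]) pair whose array holds the probability followed by four box coordinates (xmin, ymin, xmax, ymax). That layout is an assumption inferred from the output printed later in this post, so check it against the API version you use. Detection, fromRaw, and confident are hypothetical names, not MXNet classes or methods.

```scala
object DetectionModel {
  // Hypothetical typed view of one detection; not an MXNet class.
  final case class Detection(label: String, probability: Float,
                             xmin: Float, ymin: Float, xmax: Float, ymax: Float)

  // Convert one raw pair into a Detection, or None if the array is too short.
  def fromRaw(raw: (String, Array[Float])): Option[Detection] = raw match {
    case (label, Array(p, x1, y1, x2, y2, _*)) => Some(Detection(label, p, x1, y1, x2, y2))
    case _ => None
  }

  // Flatten the per-image nesting and keep detections at or above a threshold.
  def confident(output: IndexedSeq[IndexedSeq[(String, Array[Float])]],
                minProbability: Float): IndexedSeq[Detection] =
    output.flatten.flatMap(r => fromRaw(r).toList).filter(_.probability >= minProbability)
}
```

Wrapping the raw arrays in a small case class like this keeps index arithmetic in one place, which may be easier to maintain than reading positions such as arr(1) throughout the application.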

The default output is a bit hard to read, so the example provides this code to help clean it up. If you use the following image, you will see output similar to that shown after the image.

Class: car
Probabilties: 0.98847263
Coord:312.21335,72.0291,456.01443,150.66176
Class: bicycle
Probabilties: 0.94833825
Coord:155.95807,149.96362,383.8369,418.94513
Class: dog
Probabilties: 0.8281818
Coord:83.82353,179.13998,206.63783,476.7875
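
A clean-up step that prints text in a similar shape might look like the sketch below. It assumes the raw coordinates are fractions of the image size in the order xmin, ymin, xmax, ymax, and scales them to pixels using the image width and height. formatDetections is a hypothetical helper written for this post, not the code from the MXNet example.

```scala
object DetectionFormatter {
  // Hypothetical helper: renders detections as readable text.
  // Assumes each array is (probability, xmin, ymin, xmax, ymax) with
  // coordinates expressed as fractions of the image width and height.
  def formatDetections(output: IndexedSeq[IndexedSeq[(String, Array[Float])]],
                       width: Int, height: Int): String = {
    val blocks = for {
      detections <- output
      (label, values) <- detections
      if values.length >= 5 // skip malformed entries
    } yield {
      // Scale x values by the width and y values by the height
      val coords = Seq(values(1) * width, values(2) * height,
                       values(3) * width, values(4) * height)
      s"Class: $label\nProbability: ${values(0)}\nCoord: ${coords.mkString(",")}"
    }
    blocks.mkString("\n")
  }
}
```

You would call it with the detector output and the dimensions of the loaded image, for example formatDetections(output, img.getWidth, img.getHeight).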

This is the generated image based on the result:
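
Boxes like the ones in the result can be drawn with the standard java.awt API. The sketch below is not the code used to produce the image in this post. drawBoxes is a hypothetical helper that takes pixel coordinates in the order xmin, ymin, xmax, ymax and outlines each box in red on a copy of the image, leaving the original untouched.

```scala
import java.awt.Color
import java.awt.image.BufferedImage

object BoxDrawer {
  // Hypothetical helper: returns a copy of `img` with each box outlined in red.
  // Each box is (xmin, ymin, xmax, ymax) in pixel coordinates.
  def drawBoxes(img: BufferedImage, boxes: Seq[(Int, Int, Int, Int)]): BufferedImage = {
    val copy = new BufferedImage(img.getWidth, img.getHeight, BufferedImage.TYPE_INT_RGB)
    val g = copy.createGraphics()
    try {
      g.drawImage(img, 0, 0, null) // start from the original pixels
      g.setColor(Color.RED)
      for ((x1, y1, x2, y2) <- boxes) {
        // drawRect takes the top-left corner plus a width and a height
        g.drawRect(x1, y1, x2 - x1, y2 - y1)
      }
    } finally g.dispose()
    copy
  }
}
```

The coordinates printed above are floating-point values, so they would need rounding (for example with math.round) before being passed to a helper like this one.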

Call for Contribution

If you’re a Scala user and you like where this is going, join the project, provide feedback, or pitch in on a feature you want to see. Because MXNet is an open source project, these features are free to use, and they are shaped and improved by the open source community’s involvement. We are actively developing Scala training features with new type-safe APIs and creating more examples that use Scala on MXNet. Also, make sure you follow Apache MXNet to stay posted on new features.
