Build and deploy ML models using Maximo Visual Inspection

Published in IBM Data Science in Practice · 6 min read · Mar 21, 2023

Deep learning models built using Maximo Visual Inspection (MVI) are used for a wide range of applications, including image classification and object detection. These models train on large datasets and learn complex patterns that are difficult for humans to recognize.

Building such models can be a daunting task for someone who is not a data scientist, or for a data scientist with little to no experience. The best part about MVI is that it offers a user-friendly low-code/no-code interface to train and deploy complex models with ease!

This article will walk you through how to approach deep learning modeling on the MVI platform, from data preparation to your first deployment.

The figure above depicts the required steps to build a successful ML model with Maximo Visual Inspection.

Let’s discuss Deep Learning and Image Processing

Before we dive into the platform, it is important to understand how deep learning is leveraged for image processing. Deep learning is a subset of machine learning, in which programs learn to solve tasks without being explicitly programmed. More specifically, deep learning trains artificial neural networks, which is especially useful when dealing with images, video, or even sound. A deep network consists of numerous layers that progressively transform an input into an output classification.

Each layer is made up of multiple nodes, and the outputs of one layer become the inputs for the following layer. For images, we can have a dataset of images with associated labels and categories. During training, the network passes each image through its layers, transforming it from one layer to the next. In this way the model learns patterns from the data itself, without hand-written rules. Once a model is trained, it can be deployed to provide an output category for a given image.
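
To make the layer-by-layer idea concrete, here is a minimal sketch of a forward pass in plain Python. It is not how MVI implements its models: the layer sizes, the random (untrained) weights, and the four-number "image" are all illustrative assumptions, so the predicted category is meaningless until the weights are learned from labeled data.

```python
import math
import random

def dense(inputs, weights, biases):
    """One fully connected layer: each output node is a weighted sum of all inputs."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Non-linearity applied between layers."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn raw scores into confidence scores that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def random_layer(n_in, n_out, rng):
    """Untrained layer with random weights (training would adjust these)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def classify(pixels, layers, categories):
    """Pass the input through every layer, then pick the highest-scoring category."""
    activations = pixels
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    scores = softmax(dense(activations, weights, biases))
    best = max(range(len(scores)), key=scores.__getitem__)
    return categories[best], scores

rng = random.Random(0)
categories = ["Rose", "Sunflower", "Daisy"]
layers = [random_layer(4, 5, rng), random_layer(5, 3, rng)]  # 4 inputs -> 5 nodes -> 3 categories
label, scores = classify([0.2, 0.9, 0.4, 0.1], layers, categories)
print(label, [round(s, 3) for s in scores])
```

Training is the process of adjusting the weights so that the highest score lands on the correct category for each labeled image.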

The figure below briefly visualizes the image processing deep learning progression for an image classification model.

What are the types of image processing ML models?

There are three different training types: Image Classification, Object Detection, and Action Detection.

Image Classification: Best used to recognize and categorize an image as a whole. For example, classifying a flower image either as a Rose or a Sunflower.
Object Detection: Recommended to detect and box multiple parts of an image. For example, predicting a leaf, stem, and/or petal on a flower image.
Action Detection: This is a video-based ML model that detects and tags actions in a video. This type of training takes into account spatial and time data between objects and movement. For example, predicting the life stage of a flower’s growth over time.

For the purposes of this article, we will focus on Image Classification and Object Detection, both of which use static images.

Step 1: Preparing your dataset in Maximo Visual Inspection

I’m going to walk you through how to prepare your dataset to train either an image classification model or an object detection model in the Maximo Visual Inspection platform.

Categorizing and Labeling images before Training your models

In the Maximo Visual Inspection platform, there is a ‘Data sets’ page that serves as all-in-one data management for your image assets. It holds your data sets in a structure similar to a file system. Each data set is used to train a single model.

By navigating into your chosen data set, images can be easily uploaded. In this case, the “Flowers” data set contains a variety of different flower images: Roses, Sunflowers, and Daisies.

To train an image classification model, you must ‘Assign category’ for each of the images.

For an object detection model, you must ‘Label’ each of the images.

Image Classification: Assign Category

Assigning an image to a category is very straightforward: after clicking “Assign category”, a custom category name can be added and assigned.

Object Detection: Add Label

Labeling objects in an image is streamlined in the platform, so that an object detection model can be trained. When labeling, it is important to label the entire object in the image. As a best practice, label consistently across all images to ensure a better-performing model. The steps to create those annotations are detailed below.
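
Separately from the in-app annotation steps, consistency can also be sanity-checked offline. The sketch below assumes you have your annotations available as simple records; the `BoxLabel` structure, field names, and file names are hypothetical and do not reflect MVI's export format. It flags label names that differ only by case or whitespace, and boxes whose corners are inverted.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass

@dataclass
class BoxLabel:
    """Hypothetical annotation record: one labeled box on one image."""
    image: str
    name: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int

def find_inconsistent_names(labels):
    """Group label names that differ only by case or surrounding whitespace."""
    variants = defaultdict(Counter)
    for label in labels:
        variants[label.name.strip().lower()][label.name] += 1
    return {key: dict(seen) for key, seen in variants.items() if len(seen) > 1}

def find_invalid_boxes(labels):
    """Return boxes with zero or negative width or height."""
    return [l for l in labels if l.x_min >= l.x_max or l.y_min >= l.y_max]

labels = [
    BoxLabel("rose_01.jpg", "petal", 40, 30, 120, 110),
    BoxLabel("rose_01.jpg", "stem", 70, 110, 90, 300),
    BoxLabel("rose_02.jpg", "Petal", 35, 20, 130, 115),   # same object, different spelling
    BoxLabel("daisy_01.jpg", "leaf", 200, 150, 180, 220),  # x_min is larger than x_max
]

print(find_inconsistent_names(labels))  # {'petal': {'petal': 1, 'Petal': 1}}
print([l.image for l in find_invalid_boxes(labels)])  # ['daisy_01.jpg']
```

Two spellings of the same label would otherwise be treated as two separate objects, which splits the training examples between them.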

Step 2: Training your Data set

After labeling and categorizing your images, you’re all set to begin training!

In the top right corner of the Data set page, you’ll observe a “Train model” button. The button will redirect you to the Train model landing page.

There are three different training types: Image Classification, Object Detection, and Action Detection. As discussed above, choose the most appropriate model type for your dataset and modeling preferences. Whichever you choose, the Adjust settings toggle lets you change the values of the model hyperparameters instead of using the defaults.

After tuning your parameters and choosing your model type, you’re ready to click “Train model”.

Step 3: Deploy your Trained model

Under the “Models” tab, you will be able to view the trained model performance as well as the numerous data science metrics that will be important to evaluate your model.

Once the model has trained, you can view its performance over time through the Loss vs Iteration graph. You can also view the performance of each category in the dataset. This can provide insights into how you can improve the model further and which categories have the weakest performance.
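
As an illustration of what per-category performance means, here is a small sketch that computes precision and recall for each category from pairs of true and predicted categories. This is a generic calculation, not MVI's own code, and the example pairs are made up for the Flowers data set.

```python
from collections import Counter

def per_category_metrics(pairs):
    """pairs: iterable of (true_category, predicted_category) tuples."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, predicted in pairs:
        if truth == predicted:
            tp[truth] += 1
        else:
            fp[predicted] += 1  # predicted this category, but it was wrong
            fn[truth] += 1      # missed the true category
    metrics = {}
    for category in sorted(set(tp) | set(fp) | set(fn)):
        predicted_total = tp[category] + fp[category]
        actual_total = tp[category] + fn[category]
        metrics[category] = {
            "precision": tp[category] / predicted_total if predicted_total else 0.0,
            "recall": tp[category] / actual_total if actual_total else 0.0,
        }
    return metrics

# Made-up evaluation results: (true category, predicted category)
pairs = [
    ("Rose", "Rose"), ("Rose", "Rose"), ("Rose", "Daisy"),
    ("Sunflower", "Sunflower"), ("Sunflower", "Sunflower"),
    ("Daisy", "Daisy"), ("Daisy", "Sunflower"), ("Daisy", "Daisy"),
]

for category, values in per_category_metrics(pairs).items():
    print(category, round(values["precision"], 2), round(values["recall"], 2))
```

A category with low recall is one the model often misses, which is usually a hint to add more labeled images of that category.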

In order to inference and test your model, you must deploy the model.

Step 4: Test your Deployed model

Testing your deployed model is a straightforward step. You can see how it performs on new images outside of the dataset to truly test its performance. Inferencing in machine learning is the process of running data points through a model to calculate an output score. In the MVI platform, you can inference a single image to get a confidence score. Deployment also creates an API endpoint, so you can run many inferences programmatically rather than one image at a time in the platform. Either way, you will gain important insights that can be used to improve your model further!

The two situations below show inferencing for either an image classification model or an object detection model.
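
For the programmatic route, a rough sketch of posting one image to a deployed model's endpoint with Python's standard library might look like the following. Everything specific to MVI here is an assumption to check against the official API documentation: the endpoint URL is a placeholder, and the `files` form field and `X-Auth-Token` header are assumed names. The demo at the bottom only builds the request and does not contact any server.

```python
import json
import os
import urllib.request
import uuid

def build_multipart(field, filename, payload, content_type="image/jpeg"):
    """Encode one file as a multipart/form-data body; returns (body, header value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"

def build_request(endpoint_url, api_key, filename, payload):
    """Prepare a POST request. Field and header names are assumptions, not confirmed API."""
    body, content_type = build_multipart("files", filename, payload)
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method="POST",
        headers={"Content-Type": content_type, "X-Auth-Token": api_key},
    )

def infer(endpoint_url, api_key, image_path):
    """Send one image to the deployed model and return the parsed JSON response."""
    with open(image_path, "rb") as handle:
        payload = handle.read()
    request = build_request(endpoint_url, api_key, os.path.basename(image_path), payload)
    with urllib.request.urlopen(request) as response:
        return json.load(response)

# Placeholder values: replace with the endpoint shown for your deployed model.
demo = build_request(
    "https://mvi.example.com/api/dlapis/YOUR_DEPLOYED_MODEL_ID",
    "YOUR_API_KEY",
    "rose.jpg",
    b"fake-image-bytes",
)
print(demo.get_method(), demo.full_url)
```

Calling `infer(...)` in a loop over a folder of images is one way to run the batch inferencing that the single-image view in the platform does not cover.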

Inferencing an Image Classification Model

An image classification model returns a confidence score and a heatmap overlay. The heatmap overlay highlights the image’s areas of interest.

Inferencing an Object Detection Model

For an object detection model, the result shows a bounding box over each area of interest in the image, along with the detected category and its confidence score.
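
When working with many detections, a common follow-up is to keep only the ones above a confidence threshold. The sketch below assumes a hypothetical `Detection` record and made-up values; it does not mirror the exact response format returned by MVI.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Detection:
    """Hypothetical detection record: one predicted box with its category and score."""
    label: str
    confidence: float
    x_min: int
    y_min: int
    x_max: int
    y_max: int

def keep_confident(detections, threshold):
    """Drop low-confidence boxes and sort the rest from most to least confident."""
    kept = [d for d in detections if d.confidence >= threshold]
    return sorted(kept, key=lambda d: d.confidence, reverse=True)

def count_by_label(detections):
    """Count how many boxes were found for each category."""
    return Counter(d.label for d in detections)

detections = [
    Detection("petal", 0.92, 40, 30, 120, 110),
    Detection("stem", 0.41, 70, 110, 90, 300),
    Detection("leaf", 0.78, 10, 150, 60, 220),
    Detection("petal", 0.63, 35, 20, 130, 115),
]

confident = keep_confident(detections, threshold=0.6)
print([(d.label, d.confidence) for d in confident])
# [('petal', 0.92), ('leaf', 0.78), ('petal', 0.63)]
print(dict(count_by_label(confident)))
# {'petal': 2, 'leaf': 1}
```

The threshold is a judgment call: a higher value gives fewer but more reliable boxes, while a lower value catches more objects at the cost of more false positives.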

You’re all set!

You’ve now successfully taken a dataset of images to a fully deployed model by leveraging IBM’s Maximo Visual Inspection. As a beginner data scientist, you are now equipped with the basics to create accurate models easily and quickly.

Modeling is an iterative process, so you can keep improving your current models by adding more labeled images and inferencing for insights, and explore deep learning further!

This article discussed the user interface of Maximo Visual Inspection, but the platform can also be accessed programmatically through a REST API for batch modeling and inferencing. For more information on developer tools and tutorials for Maximo Visual Inspection, visit here. If you want to learn more about the Maximo Application Suite and other related tools, visit here.