Fast, Visual, and Explainable ML Modeling With PerceptiLabs

ODSC - Open Data Science
4 min read · Apr 19, 2021

Pure-code ML frameworks like TensorFlow have become popular for building ML models because they effectively offer a high-level grammar for describing model topologies and algorithms. This is a powerful approach, but it provides limited insight into, and explainability of, the resulting models. These limitations are magnified as models grow larger and more sophisticated, and as you need confidence in real-world deployments.

That’s why we created PerceptiLabs, a TensorFlow-based visual modeling tool, to provide a faster and easier way to build and train ML models using visualizations.

To compare approaches, let's look at transfer learning with an image classification model. We'll take an existing model trained on a general set of images from ImageNet and retrain it to classify blood cell images (e.g., for a medical or healthcare use case).

The TensorFlow code to accomplish this might look as follows:

import tensorflow as tf

train_dir = '../input/blood-cells/dataset2-master/dataset2-master/images/TRAIN'
test_dir = '../input/blood-cells/dataset2-master/dataset2-master/images/TEST'

# Create generators that apply MobileNetV2's preprocessing;
# hold out 20% of the training images for validation
train_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    preprocessing_function=tf.keras.applications.mobilenet_v2.preprocess_input,
    validation_split=0.2
)
test_gen = tf.keras.preprocessing.image.ImageDataGenerator(
    preprocessing_function=tf.keras.applications.mobilenet_v2.preprocess_input
)

# Flow image data from the directory structure (one subdirectory per class)
train_images = train_gen.flow_from_directory(
    directory=train_dir, target_size=(224, 224), color_mode='rgb',
    class_mode='categorical', batch_size=32, shuffle=True, seed=42,
    subset='training'
)
val_images = train_gen.flow_from_directory(
    directory=train_dir, target_size=(224, 224), color_mode='rgb',
    class_mode='categorical', batch_size=32, shuffle=True, seed=42,
    subset='validation'
)
test_images = test_gen.flow_from_directory(
    directory=test_dir, target_size=(224, 224), color_mode='rgb',
    class_mode='categorical', batch_size=32, shuffle=False, seed=42
)

# Load MobileNetV2 pretrained on ImageNet, excluding its classification head
pretrained_model = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3),
    include_top=False,
    weights='imagenet',
    pooling='avg'
)
pretrained_model.trainable = False  # freeze the pretrained weights

# Attach a new classification head for the four blood cell classes
inputs = pretrained_model.input
x = tf.keras.layers.Dense(128, activation='relu')(pretrained_model.output)
outputs = tf.keras.layers.Dense(4, activation='softmax')(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
model.summary()

# Train, stopping early once validation loss stops improving
history = model.fit(
    train_images,
    validation_data=val_images,
    epochs=100,
    callbacks=[
        tf.keras.callbacks.EarlyStopping(
            monitor='val_loss',
            patience=3,
            restore_best_weights=True
        )
    ]
)
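
Note that while the snippet above builds a test_images generator, it never actually uses it. A natural follow-up, assuming the model and generator defined above, is to evaluate the trained model on the held-out test set:

# Evaluate the fine-tuned model on the held-out test set
test_loss, test_acc = model.evaluate(test_images)
print(f'Test accuracy: {test_acc:.3f}')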

While this code certainly does the trick, a pure-code solution introduces a number of challenges to the modeling process. First, with this much code, it's difficult to grasp the model's architecture: you have to read through every line to build a mental model of what's going on, assuming you can read the code at all. Even after that inspection, it's hard to pick out specific details, such as whether the full model is loaded or the final layers are excluded (as in transfer learning), or which parameters are worth tuning. It's also difficult to visualize how the data is transformed by the various elements of the model, especially after the model is changed. Finally, there is little insight to help you explain the resulting predictions.
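
For example, even answering a simple question like "are the pretrained layers actually frozen?" sends you back to the code. A quick sanity check (a sketch against the model built above) might look like this:

# Count frozen vs. trainable layers in the assembled model
frozen = sum(1 for layer in model.layers if not layer.trainable)
print(f'{frozen} of {len(model.layers)} layers are frozen')

# Confirm the new head outputs four class probabilities
print(model.output_shape)  # expected: (None, 4)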

Visual Modeling with PerceptiLabs

At the heart of PerceptiLabs are pre-made components that wrap TensorFlow code, abstracting it into visual, connectable building blocks while still allowing custom code changes. This visual API lets you drag and drop components and connect them into a topology that represents your model's architecture, making it easy to add new elements like one-hot encoding and dense layers. Each component also provides visual output showing how it has transformed the data as you change the model. This instant visualization eliminates the need to run the whole model before seeing results, so you can iterate more quickly.
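
As a rough illustration (this is a hypothetical sketch, not PerceptiLabs' actual generated code), each visual component conceptually wraps a small, editable piece of TensorFlow:

import tensorflow as tf

# Hypothetical sketch of what a visual 'Dense' component might wrap;
# PerceptiLabs' real generated code will differ.
class DenseComponent(tf.keras.layers.Layer):
    def __init__(self, units=128, activation='relu'):
        super().__init__()
        self.dense = tf.keras.layers.Dense(units, activation=activation)

    def call(self, inputs):
        # The wrapped TensorFlow op; edit here for custom behavior
        return self.dense(inputs)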

Comparing PerceptiLabs to the TensorFlow code above, you can immediately see how it facilitates visualization of the image and label data. You can also see how each component transforms that data and how those transformations lead to the final classification. PerceptiLabs loads the first piece of available training data during modeling and immediately re-runs the model as you make changes, so you can instantly see how each change affects the result, without having to run the model on your entire training set first. When you are ready to train your model, PerceptiLabs displays a rich set of statistics updated in real time, showing how the learning is progressing and how the model handles the validation data.

Original post here.

Read more data science articles on OpenDataScience.com, including tutorials and guides from beginner to advanced levels! Subscribe to our weekly newsletter here and receive the latest news every Thursday. You can also get data science training on-demand wherever you are with our Ai+ Training platform.
