Real-time image recognition on a stream?
Image recognition is now a somewhat mature science, and so we can get down to the business of using it to make life better. This article shows how I've been experimenting with an InceptionV3 and a MobileNet to do just that.
Image recognition via API?
Usually we deploy data-science products via a scalable API in the cloud. This is good because the hard work stays on large machines, and the client simply has to send and receive JSON.
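To make that round trip concrete, here's a minimal sketch of the client side of the pattern: grab a frame, encode it, post it, parse the JSON reply. The endpoint URL and response shape are hypothetical, purely to illustrate the send-and-receive loop.

import base64

import cv2
import requests

API_URL = 'https://example.com/predict'  # hypothetical endpoint

def classify_frame(frame):
    # Encode a single OpenCV frame as JPEG and post it as base64 JSON
    _, jpeg = cv2.imencode('.jpg', frame)
    payload = {'image': base64.b64encode(jpeg.tobytes()).decode('utf-8')}
    return requests.post(API_URL, json=payload).json()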
Image recognition is different, because you often want to predict on the camera stream, which means sending a stream of images over the internet from a phone with limited data. Not good! Here's an example doing just that: an InceptionV3 trained on image data from Trade Me. The training was done on a GPU via Keras, first training the output layer, then fine-tuning the earlier layers. Test accuracy got a little past 70% on the first run, across 450 classes.
Ignore the start price suggestions! (These actually use different input data, but were hooked up to demonstrate chained services.) As you can see, it takes a while to send, process, and receive requests from the camera stream. Here I'm using OpenCV to parse the images and feed them to a Keras model, as sketched below. Shout out to Dat Tran for his sweet starter code, and to Joris Coppieters for his choice service delivery framework and help.
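For reference, the capture-and-predict loop looks roughly like this. It's a sketch, not the demo's actual code: it assumes a trained model saved as model.h5 and MobileNet's 224x224 input size.

import cv2
import numpy as np
from keras.models import load_model

# Hypothetical path; older Keras versions may need custom_objects
# for MobileNet's relu6 / DepthwiseConv2D layers
model = load_model('model.h5')

cap = cv2.VideoCapture(0)  # default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Resize to the network's input size and rescale as in training
    img = cv2.resize(frame, (224, 224))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) / 255.0
    preds = model.predict(np.expand_dims(img, axis=0))[0]
    label = np.argmax(preds)  # index into your class list
    cv2.putText(frame, str(label), (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    cv2.imshow('stream', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()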
The Keras code
from keras.applications.mobilenet import MobileNet
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import EarlyStopping, TensorBoard, ModelCheckpoint
from keras.optimizers import SGD

# Training data generator with light augmentation
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

train_generator = train_datagen.flow_from_directory(
    './full_classed/',
    target_size=(224, 224),  # MobileNet's native input size
    batch_size=32,
    class_mode='categorical')

# Validation data generator
val_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

val_generator = val_datagen.flow_from_directory(
    './test_class/',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical')

# Callbacks: TensorBoard logging, early stopping and checkpointing
tb = TensorBoard(log_dir='./v1/logs', histogram_freq=1, write_graph=False,
                 write_images=True)
es = EarlyStopping(monitor='val_loss', min_delta=0, patience=5, verbose=0,
                   mode='auto')
mc = ModelCheckpoint('./v1/chkpnt', monitor='val_loss', verbose=0,
                     save_best_only=False, save_weights_only=False,
                     mode='auto', period=1)

# Create the base pre-trained model
base_model = MobileNet(weights='imagenet', include_top=False)

# Add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# Let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# And a logistic layer -- let's say we have 100 classes
predictions = Dense(100, activation='softmax')(x)

model = Model(inputs=base_model.input, outputs=predictions)

# First: train only the top layers, so freeze all base-model layers
for layer in base_model.layers:
    layer.trainable = False

model.compile(optimizer='adam', loss='categorical_crossentropy')

# Train the model on the new data for a few epochs
model.fit_generator(train_generator,
                    steps_per_epoch=XXX,  # placeholder: samples // batch size
                    callbacks=[tb, es, mc],
                    validation_data=val_generator,
                    validation_steps=XX)  # placeholder

# Now make the whole model trainable and recompile;
# we use SGD with a low learning rate as per Francois' recommendation
for layer in model.layers:
    layer.trainable = True

model.compile(optimizer=SGD(lr=0.0001, momentum=0.9),
              loss='categorical_crossentropy')
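The fine-tuning pass itself is just one more fit_generator call, plus a save so the model can be shipped. A minimal sketch, reusing the placeholders above; the output file name is mine.

# Fine-tune the whole network, then save for serving or conversion
model.fit_generator(train_generator,
                    steps_per_epoch=XXX,  # same placeholders as above
                    callbacks=[tb, es, mc],
                    validation_data=val_generator,
                    validation_steps=XX)

model.save('./v1/mobilenet_finetuned.h5')  # hypothetical output path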
Bringing it on board?
Recent developments such as MobileNet and optimized TensorFlow graphs mean models are getting smaller and smaller. With MobileNet I was able to get a model down to just 4 MB: small enough to package and ship.
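To give a sense of the "optimized TensorFlow graph" part: with a TensorFlow 1.x backend you can freeze the trained Keras model's variables into constants and write the whole thing as a single .pb file. A minimal sketch under those assumptions; the output path is mine.

import tensorflow as tf
from keras import backend as K

# Freeze variables into constants so the graph ships as one small file
# (ideally set K.set_learning_phase(0) before building the model)
sess = K.get_session()
frozen = tf.graph_util.convert_variables_to_constants(
    sess, sess.graph_def, [model.output.op.name])
tf.train.write_graph(frozen, './v1', 'mobilenet_frozen.pb', as_text=False)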
Check out the follow-up post for bringing it on board a mobile phone.
