Using CRF for Image Segmentation in Python: Step 1

Recently I’ve had an application in mind where I needed multi-label image segmentation. There are a lot of techniques out there, but I chose an approach called Conditional Random Fields (CRFs).

My language of choice is Python, and after a lot of googling I’ve found what seems to be a great library for structured learning called pyStruct, written by Andreas Mueller (who also contributes to scikit-learn).

So why am I writing this step-by-step tale of how I got the program working? Well, there is a great example of multi-label image segmentation at http://pystruct.github.io/auto_examples/image_segmentation.html. However, it does not explain how to create the training data, what type of data is needed, and so on. (Andreas Mueller has actually started a tutorial for this as well: https://groups.google.com/forum/#!forum/pystruct)

So here goes:

First off, I expect you to have some knowledge of CRFs; however, I might write something up on this later as well.

  1. Prepare the image training set. I didn’t want to use a generic dataset (VOC or MSRC); I wanted to create my own. I only wanted to segment and label people and bikes in an image. I took a bunch of pictures in different settings where people and bikes were present and put them in a folder. They were all quite big, so I wrote a small Python program to shrink them all to about 640x480:
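import os
from PIL import Image  # Pillow; older PIL installs used "import Image"
import imtools

path = "bikescars"   # folder with the original photos
out_dir = "imgdata"  # folder for the resized copies
size = (640, 480)    # maximum width and height

im_list = imtools.get_imlist(path)
print(len(im_list))
print(im_list)

os.makedirs(out_dir, exist_ok=True)  # make sure the output folder exists
for idx, im_name in enumerate(im_list):
    print(idx, im_name)
    im = Image.open(im_name)
    # thumbnail() resizes in place and keeps the aspect ratio,
    # so the result fits inside 640x480
    im.thumbnail(size, Image.LANCZOS)  # LANCZOS is the current name for ANTIALIAS
    im.save(os.path.join(out_dir, os.path.basename(im_name)))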

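where imtools is a small module whose get_imlist function looks like this:

import os

def get_imlist(path):
    # return the paths of all .jpg and .gif files in the folder
    return [os.path.join(path, f) for f in os.listdir(path) if f.endswith('.jpg') or f.endswith('.gif')]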

(This snippet is from Jan Erik Solem’s book Programming Computer Vision with Python. Excellent book, google it!)

The function simply returns the paths of all files in a folder whose names end in .jpg or .gif. I guess “glob” could also be used for this.
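For what it’s worth, the glob alternative could look roughly like the sketch below. The function name get_imlist_glob and its extensions parameter are my own placeholders, not something from imtools or the book, and I sort the result only to get a stable order.

```python
import glob
import os


def get_imlist_glob(path, extensions=(".jpg", ".gif")):
    """Return a sorted list of files in `path` ending in one of `extensions`.

    Hypothetical helper for illustration; not part of imtools.
    """
    filenames = []
    for ext in extensions:
        # glob expands the wildcard pattern, e.g. "bikescars/*.jpg"
        filenames.extend(glob.glob(os.path.join(path, "*" + ext)))
    return sorted(filenames)
```

Calling get_imlist_glob("bikescars") should give the same kind of list as imtools.get_imlist("bikescars"), so it can be swapped into the resize script as-is.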

Now you have a folder with, say, ~100 images. What you need to do next is label them manually. Use a different color for each class and make sure the colors differ as much as possible. I used (255,0,0) for people, (0,0,255) for bikes and (255,255,255) for void/background (the rest). I did it in Photoshop, but you could use GIMP as well.

Create a new layer, zoom in, and outline each person and bike in your image using the specified colors. Be as accurate as you reasonably can, but there is no need to be too perfect, because you will need to do this for all ~100 images. Save the labeled image as a .jpg whose name ends in “_training.jpg”.
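To illustrate why the label colors should be far apart, here is a minimal sketch of how a labeled image could later be turned into per-pixel class indices. This is my own assumption about one reasonable way to do it, not pyStruct’s API and not necessarily what the next post does: the name colors_to_labels and the class numbering (0 = void, 1 = people, 2 = bikes) are made up for this example. Each pixel is assigned to the nearest label color, so small color shifts (for example from JPEG compression) still end up in the intended class.

```python
import numpy as np

# Label colors from this post (RGB); the row index is the (assumed) class id.
LABEL_COLORS = np.array([
    [255, 255, 255],  # 0: void/background
    [255, 0, 0],      # 1: people
    [0, 0, 255],      # 2: bikes
])


def colors_to_labels(rgb, label_colors=LABEL_COLORS):
    """Map an (H, W, 3) RGB array to an (H, W) array of class indices.

    Hypothetical helper: every pixel gets the index of the closest
    label color (squared Euclidean distance in RGB space).
    """
    # Use a wide signed integer type so the subtraction cannot overflow
    rgb = np.asarray(rgb, dtype=np.int64)
    # Difference between every pixel and every label color: shape (H, W, K, 3)
    diff = rgb[:, :, None, :] - label_colors[None, None, :, :]
    # Squared distance per pixel and label color: shape (H, W, K)
    dist = (diff ** 2).sum(axis=-1)
    return dist.argmin(axis=-1)
```

With Pillow, the input array could come from something like np.array(Image.open(filename).convert("RGB")), where filename is one of the “_training.jpg” images.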

The next post will describe how to prepare all your training images and how to extract superpixels using SLIC.


Originally published at sloblog.io.