ChiPy Mentorship Blog Part 2
(aka The Best ChiPy Mentorship Spring 2017 Blog Post Ever — Part 2!)
First a PSA
Stop whatever you’re doing and watch the first four episodes of the new Twin Peaks.
Previously on The Best ChiPy Mentorship Spring 2017 Blog Post Ever …
We looked at how Python can be used to process image files using OpenCV. We also looked at the problem that defines this project: a Kaggle competition that asks for an algorithm to count and classify sea lions in a given image file.
Today, we’ll look at how to find the coordinates of the colored pixels used to label the sea lions in the image files. Once we have the coordinates, we can then try to extract images of individual sea lions. Once we have images of the individual sea lions, then we can train a neural network to count and classify sea lions in other images.
Beware the Blob!
If you recall from the previous post, we were able to isolate the single pixels that labeled the sea lions in an image, as shown below.
Now, I want to find the coordinates of these pixels. To do this, I will use a technique called blob detection. A blob in an image is a region which contrasts sharply with the rest of the image — e.g. by contrasting intensities, different colors, or some other criterion. In the above image, there are two large blobs: one at the upper left-hand corner and the other covering most of the bottom of the image. The other blobs are the individual pixels representing the sea lions.
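Since each label here is essentially a single bright pixel on a dark background, the core idea can be previewed with a numpy-only toy example: threshold the image and read off the coordinates of the bright pixels. (The `blob_log()` function used later does this more robustly, across multiple scales.)

```python
import numpy as np

# toy 8x8 grayscale "image": dark background with two bright single-pixel dots
img = np.zeros((8, 8), dtype=np.uint8)
img[2, 3] = 255
img[6, 1] = 255

# a blob here is any pixel that contrasts sharply with the background;
# np.argwhere returns the (row, col), i.e. (y, x), coordinates of those pixels
coords = np.argwhere(img > 200)
coords.tolist()  # [[2, 3], [6, 1]]
```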
To simplify this process, I want to get rid of the two big blobs. First we’ll convert our images to grayscale. Then we’ll use these grayscale images to mask (or filter) out both big blobs using the following block of code:
import cv2

image_1 = cv2.imread("data-sets/TrainSmall2/TrainDotted/" + filename)
image_2 = cv2.imread("data-sets/TrainSmall2/Train/" + filename)

# absolute difference between Train and TrainDotted
image_3 = image_diff(image_1, image_2)

# mask out blackened regions from TrainDotted
mask_1 = cv2.cvtColor(image_1, cv2.COLOR_BGR2GRAY)
mask_1[mask_1 < 20] = 0
mask_1[mask_1 > 0] = 255

mask_2 = cv2.cvtColor(image_2, cv2.COLOR_BGR2GRAY)
mask_2[mask_2 < 20] = 0
mask_2[mask_2 > 0] = 255

image_4 = cv2.bitwise_or(image_3, image_3, mask=mask_1)
image_5 = cv2.bitwise_or(image_4, image_4, mask=mask_2)
The first two images are the corresponding pair of dotted and original images. The third image is the difference between them, i.e. the image shown above. Next we define two masks. To make mask_1, we convert image_1 to a grayscale image. Recalling that a grayscale image is a 2-dimensional array of intensities (i.e. numbers), we then set each intensity in mask_1 to 0 (i.e. black) if it is below 20, and to 255 (i.e. white) otherwise; mask_2 is built the same way from image_2. The resulting mask is shown below:

Applying mask_1 via the bitwise_or operator to image_3 (and then mask_2 to the result) gives us the desired result:
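The thresholding that builds these masks can be checked on a toy array (a numpy-only sketch; the real code does the same thing to the full grayscale image):

```python
import numpy as np

# toy grayscale "image": intensities below 20 count as blackened regions
mask = np.array([[0, 5, 19],
                 [20, 128, 255]], dtype=np.uint8)

mask[mask < 20] = 0    # anything nearly black becomes exactly 0
mask[mask > 0] = 255   # everything else becomes white

mask.tolist()  # [[0, 0, 0], [255, 255, 255]]
```

When such a mask is passed to `cv2.bitwise_or(img, img, mask=mask)`, pixels where the mask is 0 are zeroed out and the rest pass through unchanged.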
To find the coordinates of the pixels, we use the scikit-image library, which has a function blob_log() that finds blobs in grayscale images. In the following block of code, we use blob_log() to find the coordinates of each pixel. Then we look up the color of that pixel in the dotted image and, depending on the color, insert the coordinates into a data frame called coordinates_df, which has columns for adult males, subadult males, adult females, juveniles, and pups.
import skimage.feature

# convert the masked image to grayscale to be accepted by skimage.feature.blob_log
image_6 = cv2.cvtColor(image_5, cv2.COLOR_BGR2GRAY)

# detect blobs
blobs = skimage.feature.blob_log(image_6, min_sigma=3, max_sigma=4, num_sigma=1, threshold=0.02)

adult_males = []
subadult_males = []
pups = []
juveniles = []
adult_females = []

# copy of the dotted image to draw the detections on
image_circles = image_1.copy()

for blob in blobs:
    # get the coordinates for each blob
    y, x, s = blob
    # get the color of the pixel from TrainDotted in the center of the blob
    # (note: OpenCV loads images in BGR channel order)
    b, g, r = image_1[int(y)][int(x)][:]
    # decision tree to pick the class of the blob by looking at the color in TrainDotted
    if r > 200 and g < 50 and b < 50:           # RED
        adult_males.append((int(x), int(y)))
        cv2.circle(image_circles, (int(x), int(y)), 20, (0, 0, 255), 10)
    elif r > 200 and b > 200 and g < 50:        # MAGENTA
        subadult_males.append((int(x), int(y)))
        cv2.circle(image_circles, (int(x), int(y)), 20, (250, 10, 250), 10)
    elif r < 100 and b < 100 and 150 < g < 200: # GREEN
        pups.append((int(x), int(y)))
        cv2.circle(image_circles, (int(x), int(y)), 20, (20, 180, 35), 10)
    elif r < 100 and 100 < b and g < 100:       # BLUE
        juveniles.append((int(x), int(y)))
        cv2.circle(image_circles, (int(x), int(y)), 20, (180, 60, 30), 10)
    elif r < 150 and b < 50 and g < 100:        # BROWN
        adult_females.append((int(x), int(y)))
        cv2.circle(image_circles, (int(x), int(y)), 20, (0, 42, 84), 10)

coordinates_df["adult_males"][filename] = adult_males
coordinates_df["subadult_males"][filename] = subadult_males
coordinates_df["adult_females"][filename] = adult_females
coordinates_df["juveniles"][filename] = juveniles
coordinates_df["pups"][filename] = pups
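The color decision tree can also be pulled out into a small standalone helper for testing. This is a sketch assuming OpenCV's BGR channel order; `classify_dot` is a name I'm introducing, not part of the original code:

```python
def classify_dot(b, g, r):
    """Map a BGR dot color from TrainDotted to a sea lion class (or None)."""
    if r > 200 and g < 50 and b < 50:           # red dot
        return "adult_males"
    if r > 200 and b > 200 and g < 50:          # magenta dot
        return "subadult_males"
    if r < 100 and b < 100 and 150 < g < 200:   # green dot
        return "pups"
    if r < 100 and 100 < b and g < 100:         # blue dot
        return "juveniles"
    if r < 150 and b < 50 and g < 100:          # brown dot
        return "adult_females"
    return None  # e.g. white: not a label dot

classify_dot(10, 10, 255)   # 'adult_males'
classify_dot(220, 20, 230)  # 'subadult_males'
```

One caveat with these thresholds: very dark pixels (near black) also satisfy the brown branch, so the helper only makes sense on pixels that blob detection has already flagged as dots.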
Running the code on the above image (43.jpg) gives us the five coordinate lists stored in coordinates_df, one per class (output truncated):
43.jpg [(3365, 3405), (3295, 3267), (2636, 3115), (28...
43.jpg [(2999, 3583), (3801, 3125), (1817, 2212), (27...
43.jpg [(3319, 3622), (3203, 3606), (3097, 3504), (30...
43.jpg [(3168, 3552), (3152, 3542), (3408, 3351), (25...
43.jpg [(3268, 3585), (3279, 3572), (3029, 3513), (29...
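The construction of coordinates_df itself isn't shown above; here is a minimal sketch of one way to set it up, assuming pandas, with the file name as the index, one column per class, and each cell holding a list of (x, y) coordinates:

```python
import pandas as pd

classes = ["adult_males", "subadult_males", "adult_females", "juveniles", "pups"]
file_names = ["43.jpg"]

# one row per image file, one column per class
coordinates_df = pd.DataFrame(index=file_names, columns=classes)

# each cell stores a list of (x, y) dot coordinates for that image and class
coordinates_df.at["43.jpg", "adult_males"] = [(3365, 3405), (3295, 3267)]
coordinates_df.at["43.jpg", "adult_males"]  # [(3365, 3405), (3295, 3267)]
```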
Extract the Sea Lions
Now that we know the coordinates of where the sea lions are located, we extract images of the individual sea lions that will be used to train our classifier. In the following code block, we extract 64-by-64 sub-images from the image:
import numpy as np

x = []
y = []

for filename in file_names:
    image = cv2.imread("data-sets/TrainSmall2/Train/" + filename)
    for lion_class in classes:
        for coordinates in coordinates_df[lion_class][filename]:
            # coordinates is an (x, y) pair; numpy slices rows (y) first, then columns (x)
            thumb = image[coordinates[1]-32:coordinates[1]+32, coordinates[0]-32:coordinates[0]+32, :]
            # keep only full 64-by-64 crops, i.e. those not clipped at an image border
            if np.shape(thumb) == (64, 64, 3):
                x.append(thumb)
                y.append(lion_class)

x = np.array(x)
y = np.array(y)
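The cropping logic is easy to sanity-check on a toy array: take the 64-by-64 window centered on (x, y) and discard it if it gets clipped at a border. A numpy-only sketch (`crop64` is a name I'm introducing for illustration):

```python
import numpy as np

image = np.zeros((200, 200, 3), dtype=np.uint8)  # toy 200x200 "image"

def crop64(image, x, y):
    """Return the 64x64 patch centered on (x, y), or None if clipped at a border."""
    thumb = image[y - 32:y + 32, x - 32:x + 32, :]
    return thumb if thumb.shape == (64, 64, 3) else None

crop64(image, 100, 100).shape  # (64, 64, 3)
crop64(image, 10, 100)         # None -- too close to the left edge
```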
Let’s look at the first 10 of each type of sea lion:
We note that there are only four subadult males in the image.
Challenges and Next Steps
Notice in the above that some thumbnails contain more than one sea lion. I need to refine my extraction method.
The biggest challenge I’ve faced is how to deal with the data set, which weighs in at 96 GB. I can’t download it to my local machine (it would take too long, and I don’t have room for it on my crappy laptop). So I have devoted a lot of time to learning how to work in the cloud. Nolan (my mentor) pointed me toward the website for Stanford University’s CS231n course. In particular, it has a detailed tutorial on how to set up and work in a Google Compute Engine (GCE) environment. I’m still trying to figure out how to pull Kaggle data sets onto my GCE instance.
Once I have these issues sorted out, I’ll be ready to build a neural network that classifies the image files.
Until next time …