Day 69 of 100DaysofML

Charan Soneji
Published in 100DaysofMLcode
3 min read · Sep 8, 2020

Image Processing part 2. I thought of working a bit more on the image processing syntax and concepts, so let's get right to it. I would highly recommend reading the last blog before you read this one; otherwise it will get very confusing.

One of the issues we faced in the last blog was the size of the objects identified in the image, and the lack of a proper analysis of those objects.

Continuing the code from the last blog:

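import numpy as np
from scipy import ndimage

# im_gray, mask and labels carry over from the last blog.
for label_ind, label_coords in enumerate(ndimage.find_objects(labels)):
    cell = im_gray[label_coords]

    # Check if the label's bounding box is too small (fewer than 10 pixels).
    if np.prod(cell.shape) < 10:
        print('Label {} is too small! Setting to 0.'.format(label_ind))
        mask = np.where(labels == label_ind + 1, 0, mask)

# Regenerate the labels from the cleaned mask.
labels, nlabels = ndimage.label(mask)
print('There are now {} separate components / objects detected.'.format(nlabels))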

In the lines above, we look at the bounding box of each labeled object. If it covers fewer than 10 pixels, we print a message and set that object to 0 in the mask. We then relabel the cleaned mask and print how many objects remain.
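One thing worth noting is that np.prod(cell.shape) measures the area of the bounding box, not the number of pixels that actually belong to the object. As a rough sketch of an alternative (not what the original code does), the snippet below counts the real pixels per label with np.bincount. The helper name drop_small_objects, the min_pixels parameter and the demo array are all made up for illustration.

```python
import numpy as np
from scipy import ndimage


def drop_small_objects(mask, min_pixels=10):
    """Hypothetical helper: keep only objects with at least min_pixels pixels."""
    labels, _ = ndimage.label(mask)
    # counts[k] is the number of pixels carrying label k (k = 0 is background).
    counts = np.bincount(labels.ravel())
    keep = counts >= min_pixels
    keep[0] = False  # never keep the background
    # Look up, for every pixel, whether its label is kept.
    return keep[labels]


# Made-up test mask: one 25-pixel square and one 4-pixel speck.
demo = np.zeros((12, 12), dtype=bool)
demo[1:6, 1:6] = True
demo[9:11, 9:11] = True

cleaned = drop_small_objects(demo, min_pixels=10)
_, n_remaining = ndimage.label(cleaned)
print('Objects remaining: {}'.format(n_remaining))
```

This version also avoids the Python loop, which may matter on images with many objects.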

Let us now plot the first six identified objects along with their sizes.

import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 6, figsize=(10, 6))

# Show the first six objects, each cropped to its bounding box.
for ii, obj_indices in enumerate(ndimage.find_objects(labels)[0:6]):
    cell = im_gray[obj_indices]
    axes[ii].imshow(cell, cmap='gray')
    axes[ii].axis('off')
    axes[ii].set_title('Label #{}\nSize: {}'.format(ii + 1, cell.shape))

plt.tight_layout()
plt.show()
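If it is not obvious what ndimage.find_objects hands back, here is a tiny sketch on a made-up array (demo is purely illustrative and not part of the original data). Each entry is a tuple of slices describing the bounding box of one label, which is why it can be used directly to index im_gray.

```python
import numpy as np
from scipy import ndimage

# Made-up array with two separate rectangular blobs.
demo = np.zeros((6, 8), dtype=int)
demo[1:3, 1:4] = 1
demo[4:6, 5:8] = 1

labels, nlabels = ndimage.label(demo)
slices = ndimage.find_objects(labels)

for label_num, bbox in enumerate(slices, start=1):
    # bbox is a (row_slice, col_slice) tuple; indexing with it crops the object.
    print(label_num, bbox, labels[bbox].shape)
```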

Label #2 has the “adjacent cell” problem: the two cells are being considered part of the same object. One thing we can do here is to see whether we can shrink the mask to “open up” the differences between the cells. This is called mask erosion. We can then re-dilate it to recover the original proportions.
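To see the erode-then-dilate idea in isolation, here is a minimal sketch on a synthetic mask: two squares joined by a one-pixel-high bridge. The array and the choice of iterations=1 are assumptions made for this toy case; a real image may need more iterations.

```python
import numpy as np
from scipy import ndimage

# Synthetic mask: two 7x7 squares connected by a thin bridge.
toy_mask = np.zeros((9, 21), dtype=bool)
toy_mask[1:8, 1:8] = True     # left square
toy_mask[1:8, 13:20] = True   # right square
toy_mask[4, 8:13] = True      # thin bridge joining them

_, n_before = ndimage.label(toy_mask)

# Opening = erosion followed by dilation with the same structuring element.
opened = ndimage.binary_opening(toy_mask, iterations=1)
_, n_after = ndimage.label(opened)

print('Objects before opening: {}'.format(n_before))
print('Objects after opening: {}'.format(n_after))
```

The bridge is too thin to survive the erosion step, so the dilation step only grows back the two squares (minus their corners), and they end up as separate objects.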

We shall now apply a binary opening (an erosion followed by a dilation) to the mask of that object, to see whether it separates the two cells.

# Get the object indices (label #2) and perform a binary opening procedure.
two_cell_indices = ndimage.find_objects(labels)[1]
cell_mask = mask[two_cell_indices]
cell_mask_opened = ndimage.binary_opening(cell_mask, iterations=8)

fig, axes = plt.subplots(1, 4, figsize=(12, 4))

axes[0].imshow(im_gray[two_cell_indices], cmap='gray')
axes[0].set_title('Original object')
axes[1].imshow(mask[two_cell_indices], cmap='gray')
axes[1].set_title('Original mask')
axes[2].imshow(cell_mask_opened, cmap='gray')
axes[2].set_title('Opened mask')
axes[3].imshow(im_gray[two_cell_indices]*cell_mask_opened, cmap='gray')
axes[3].set_title('Opened object')
for ax in axes:
    ax.axis('off')
plt.tight_layout()
plt.show()

Finally, we need to encode each label_mask into a "run-length encoded" (RLE) string. Basically, we walk through the array, and when we find a pixel that is part of the mask, we record its index and count how many subsequent pixels are also part of the mask. We repeat this each time we see a new starting pixel.

I found a nice function to do RLE from Kaggle user Rakhlin’s kernel, which I’ve copied here.

def rle_encoding(x):
    '''
    x: numpy array of shape (height, width), 1 - mask, 0 - background
    Returns run lengths as a space-separated string
    '''
    dots = np.where(x.T.flatten() == 1)[0]  # .T sets Fortran order down-then-right
    run_lengths = []
    prev = -2
    for b in dots:
        # A gap since the previous mask pixel starts a new run (1-based start index).
        if b > prev + 1:
            run_lengths.extend((b + 1, 0))
        run_lengths[-1] += 1
        prev = b
    return " ".join([str(i) for i in run_lengths])

print('RLE Encoding for the current mask is: {}'.format(rle_encoding(label_mask)))

[Figure: RLE Encoding]

Run-length encoding is a form of lossless data compression in which runs of data are stored as a single data value and count, rather than as the original run. This is most useful on data that contains many such runs.
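As a quick sanity check of the "lossless" part, here is a small sketch that encodes a tiny made-up mask and decodes it back. The rle_encoding function is repeated from above so the snippet runs on its own; rle_decoding and the tiny array are my own additions for illustration and are not from the Kaggle kernel.

```python
import numpy as np


def rle_encoding(x):
    """Same encoder as above: column-major, 1-based 'start length' pairs."""
    dots = np.where(x.T.flatten() == 1)[0]
    run_lengths = []
    prev = -2
    for b in dots:
        if b > prev + 1:
            run_lengths.extend((b + 1, 0))
        run_lengths[-1] += 1
        prev = b
    return " ".join([str(i) for i in run_lengths])


def rle_decoding(rle, shape):
    """Hypothetical inverse: rebuild a (height, width) mask from the string."""
    height, width = shape
    numbers = [int(token) for token in rle.split()]
    flat = np.zeros(height * width, dtype=np.uint8)
    # Pairs are (1-based start, run length) in column-major order.
    for start, length in zip(numbers[0::2], numbers[1::2]):
        flat[start - 1:start - 1 + length] = 1
    return flat.reshape(width, height).T


tiny = np.array([[0, 1, 0],
                 [1, 1, 0],
                 [0, 0, 1]], dtype=np.uint8)

encoded = rle_encoding(tiny)
decoded = rle_decoding(encoded, tiny.shape)
print(encoded)  # -> 2 1 4 2 9 1
print(np.array_equal(decoded, tiny))  # -> True
```

Reading the string in pairs: a run of 1 pixel starting at position 2, a run of 2 starting at position 4, and a run of 1 starting at position 9, all counted down the columns first.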

That covers most of the basics of Image Processing. Thanks for reading. Keep Learning.

Cheers.
