Making a braille translator

Maryna Longnickel
5 min read · Dec 26, 2018


I had just finished working through an OpenCV scantron grader tutorial by the super awesome Adrian Rosebrock from https://www.pyimagesearch.com, and decided to try and apply the techniques used there to a different project.

That’s when braille came to mind. It is a tactile writing system used by blind and visually impaired people, in which characters are represented by patterns of raised dots and read by feeling them with one’s fingertips. Now why would I try to translate something that’s never actually represented as an image?

For fun :) (and practice). Though I guess it could be extended into some sort of braille learning tool, where one could scan braille text with an app and have it read out the words.

Braille alphabet. Source: https://www.pharmabraille.com/pharmaceutical-braille/the-braille-alphabet/

To begin, I picked an image (an easy one; let’s just try the actual alphabet first). The image was slightly blurred using cv2.GaussianBlur to remove some of the noise, then erode and dilate were applied to further isolate the most prominent features.
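
Roughly, that preprocessing looks something like the sketch below. The file name, kernel sizes, and thresholding choice here are just for illustration, not necessarily the exact values I used:

import cv2

# Grayscale, blur, and threshold so the dots stand out from the background.
image = cv2.imread("braille_alphabet.png")  # illustrative file name
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
thresh = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]

# Erode to knock out small specks of noise, then dilate to restore the dots to size.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
cleaned = cv2.erode(thresh, kernel, iterations=1)
cleaned = cv2.dilate(cleaned, kernel, iterations=1)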

Next, the contours can be found like so:

contours = cv2.findContours(image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]  # OpenCV 4 returns 2 values, OpenCV 3 returns 3

Here the RETR_EXTERNAL flag means that only external contours will be kept (no circles inside other circles), and CHAIN_APPROX_SIMPLE means that each contour is simplified by storing fewer points, saving memory.

After finding all the contours, it was reasonable to assume that since the image will consist mostly of black circles (dots) of the same size, the diameter of one could be determined by finding the most frequently occurring width (or height) among the bounding boxes.
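
In code, that boils down to something like this (building on the contours from the previous step):

from collections import Counter

# Each bounding box is (x, y, w, h); the most common width is taken as the dot diameter.
boxes = [cv2.boundingRect(c) for c in contours]
diameter = Counter(w for (x, y, w, h) in boxes).most_common(1)[0][0]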

Next, the contours of all circles were found by keeping only those whose width fell within a certain range of the determined diameter (a generous ±20% here, since the dots in small images can be extremely blurred, causing large variations in the size of the bounding boxes) and whose height-to-width ratio was close to 1. It could also be verified that the majority of the pixels inside the contour are black, to avoid selecting “empty” dots, but I omitted this step for now since eroding and dilating usually gets rid of the thin “empty” dot outlines if there are any.
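
A sketch of that filter, with the ±20% tolerance (the exact bounds are adjustable):

# Keep only boxes that are dot-sized and roughly square.
dots = [(x, y, w, h) for (x, y, w, h) in boxes
        if 0.8 * diameter <= w <= 1.2 * diameter and 0.8 <= h / w <= 1.2]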

To translate the image into text, I decided to first convert it into a grid of 1’s and 0’s, 1 being a black dot, and 0 being the absence of one. Knowing where the black dots are is the easy part. But to figure out the locations of the missing dots I had to calculate some distances. Here is the official US standard for braille:

Source: https://www.researchgate.net/figure/Braille-Cell-Dimensions_fig2_260845048

Naturally, unless you adhere to these ratios, the distances between dots and between sets of dots (letters) will be different, so it was best to calculate them for each image individually. To make creating the grid easier, I went through and aligned all the dots in one line to have the same y-coordinate, and all the dots in one column to have the same x-coordinate. This is so the dots could be sorted in “reading order”: first from left to right, then from top to bottom. Because the top pixels of one dot do not necessarily align with the top pixels of another, a dot sitting slightly lower or higher could be mistakenly placed out of order. Here, a tolerance variable is used to group together similar x- and y-coordinates. For example, bounding boxes with top-left corners at (1, 8), (7, 9), (20, 4), (21, 11) might become (1, 8), (7, 8), (20, 4), (20, 11). Since 8 and 9 were close together, it was assumed those dots were on the same line but misaligned due to noise; same with 20 and 21.
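
One way to do that grouping is a simple clustering pass. The snap helper and the tolerance value below are illustrative, not the exact code I used:

def snap(values, tolerance):
    # Map each coordinate to the first member of its cluster of nearby values.
    anchors, mapping = [], {}
    for v in sorted(set(values)):
        match = next((a for a in anchors if abs(v - a) <= tolerance), None)
        if match is None:
            anchors.append(v)
            match = v
        mapping[v] = match
    return mapping

tolerance = diameter // 2  # assumption: half a dot is "close enough"
x_map = snap([x for (x, y, w, h) in dots], tolerance)
y_map = snap([y for (x, y, w, h) in dots], tolerance)

# Snapped top-left corners, sorted into reading order: top to bottom, then left to right.
aligned = sorted(((x_map[x], y_map[y]) for (x, y, w, h) in dots),
                 key=lambda p: (p[1], p[0]))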

Next, we can find the set of differences between consecutive x-coordinates of the bounding boxes. Looking at the picture above, the smallest distance will be the one between two neighboring dots in the same letter. The next smallest is the white space between two letters, and the next after that is the distance between the same dot position in adjacent letters. With this information, we can break the picture up into a grid where each cell corresponds to an empty or filled-in dot.
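
For example (in practice the differences won’t be exactly equal, so they need the same kind of tolerance-based grouping as above):

# Distinct gaps between consecutive dot columns, smallest first.
xs = sorted({x for (x, y) in aligned})
gaps = sorted({b - a for a, b in zip(xs, xs[1:])})
# gaps[0]: two dots within one letter; gaps[1]: white space between letters;
# gaps[2]: distance between the same dot position in adjacent letters.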

Now, this is an extremely hacky way of going about it! But good enough for government work. I went through and drew one or more vertical lines between consecutive x-coordinates of the bounding boxes. Here prev indicates whether the last position was inside or outside of a letter, since the distances between dots differ slightly in the two cases. Note the tolerances; again, the distances will probably vary a little.

Having created an array of coordinates for the vertical lines that roughly separate columns of dots, the grid can now be filled in. One thing to watch out for: if an entire line of letters is missing its bottom row (none of the letters in that row have black dots in positions 5 or 6), we need to add a row of 0’s so the dots from the line below don’t accidentally become part of the letters above them. Similarly, if the first letter on every line is missing dots in its first column (positions 1, 3, 5), we need to account for the fact that our dot array starts in the middle of a letter and combine pairs of columns accordingly.

At this point we have ourselves a matrix that looks something like this:
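
For instance, a line containing just the letters “abc” would come out as the grid below (each letter occupies a 3x2 block):

1 0 1 0 1 1
0 0 1 0 0 0
0 0 0 0 0 0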

What remains is to go through the matrix with a window of size 3x2 and a stride of 2, collect the locations of 1’s in each window into a list, and convert those lists into letters using a dictionary. For example, [1, 2, 4] corresponds to the letter “d”, since those are the dots that are colored in (numbering the window’s cells left to right, top to bottom; the standard braille convention numbers them down the columns instead, but any consistent scheme works).
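
A sketch of that decoding step, with just the first few dictionary entries shown. The keys use the same left-to-right, top-to-bottom numbering, which is why “b” (standard braille dots 1 and 2, the left column) appears here as {1, 3}:

import numpy as np

# Tiny excerpt of the lookup table; a full version would cover the whole alphabet.
BRAILLE = {
    frozenset([1]): "a",
    frozenset([1, 3]): "b",
    frozenset([1, 2]): "c",
    frozenset([1, 2, 4]): "d",  # the [1, 2, 4] example from above
}

def decode(grid):
    # Slide a 3x2 window over the grid and map each cell's dot pattern to a letter.
    grid = np.asarray(grid)
    text = []
    for top in range(0, grid.shape[0], 3):
        for left in range(0, grid.shape[1], 2):
            cell = grid[top:top + 3, left:left + 2].flatten()
            dots = frozenset(i + 1 for i, v in enumerate(cell) if v)
            text.append(BRAILLE.get(dots, "?"))  # unknown patterns come out as "?"
    return "".join(text)

Running decode on the little “abc” grid above returns “abc”.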

For this particular picture the output is

abcdefghijklm nopqrstuvwxyz 1bcdefghij

Notice that only the first digit was converted. In braille, a number is written as a number sign followed by the letters a through j (so 1 is represented as #a), and I haven’t yet gotten around to making that number mode carry past the first character.
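
The fix should be straightforward, something along these lines (assuming the decoder emits “#” for the number sign):

DIGITS = dict(zip("abcdefghij", "1234567890"))

def apply_number_sign(text):
    # After a "#", the letters a-j read as digits 1-9 and 0 until the next space.
    out, number_mode = [], False
    for ch in text:
        if ch == "#":
            number_mode = True
        elif ch == " ":
            number_mode = False
            out.append(ch)
        else:
            out.append(DIGITS.get(ch, ch) if number_mode else ch)
    return "".join(out)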

Overall, this was kind of a pain, but I learned some neat tricks. The entire code can be found on my GitHub. It would be interesting to see whether letter-recognition techniques could be applied here, since a pattern of dots doesn’t form a single solid contour that can be easily interpreted. Another thought was to pixelate the image so that each black dot corresponds to one black pixel, maybe something like this. Should I do Morse code next? Everyone still uses that, right?
