In this section we cover all the image processing needed to extract the digits from a photo of a sudoku taken with your camera.
I took this photo with my camera:
Step 1: Read the image using OpenCV. Here original.jpg refers to my camera photo.
import cv2
frame = cv2.imread('original.jpg')
Step 2: Convert the image to grayscale.
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
cv2.imwrite('gray.jpg', gray)
Step 3: Use Adaptive Thresholding
- Thresholding is a very simple concept: if a pixel value is greater than a threshold value, it is assigned one value (say white); otherwise it is assigned another value (say black).
- Adaptive thresholding avoids the problem of a single global threshold when lighting conditions vary across the image. It computes a different threshold for each region of the same image, which gives better results for images with varying illumination.
Three arguments of the adaptive threshold matter here:
- Adaptive Method: decides how the threshold value is calculated.
- Block Size: decides the size of the neighbourhood area (it must be an odd number).
- C: a constant that is subtracted from the mean or weighted mean.
Here we have used cv2.ADAPTIVE_THRESH_GAUSSIAN_C as the Adaptive Method, where the threshold value is the Gaussian-weighted sum of the neighbourhood values minus C.
We have taken Block Size = 11 and C = 3.
After this operation the result looks like this:
Step 4: Opening Operation
As we can see, there is a lot of noise in this image. We can remove it using morphological operations.
Looking carefully at the adaptive threshold image, we can see lots of unwanted white dots. An erosion operation eliminates these dots, but it also leaves the white lines of the sudoku thin and disconnected. A dilation operation makes them clear and connected again. So we apply erosion first and then dilation; this combination is known as opening. After the opening operation we get the image shown below:
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
gray = cv2.morphologyEx(gray, cv2.MORPH_OPEN, kernel)
cv2.imwrite('morph.jpg', gray)
Step 5: Getting Sudoku
From this image I only need the sudoku part. To get it we use the concept of contours.
Contours: a contour is a curve joining all the continuous points along a boundary that have the same property (color or intensity). When working with contours, make sure the object is white and the background is black.
The findContours function takes 3 arguments:
- gray: the thresholded image, where the object is white and the background is black.
- Contour Retrieval Method: cv2.RETR_TREE is used here, which retrieves all the contours and builds the full family hierarchy.
- Contour Approximation Method: cv2.CHAIN_APPROX_SIMPLE is used here, which compresses horizontal, vertical, and diagonal segments and keeps only their end points, so a straight-edged upright rectangle is stored as just 4 points.
From all the contours we take the one with the biggest area, on the assumption that it corresponds to the sudoku part.
contour = max(contours, key=cv2.contourArea)
After finding the largest contour we have to crop that part. To do that we first need the coordinates of a rectangle that encloses the contour. cv2.boundingRect gives us an upright rectangle; it doesn't consider the rotation of the object, so the area of the bounding rectangle won't be minimal.
x, y, w, h = cv2.boundingRect(contour)
sudoku = gray[y:y + h, x:x + w]
side_length = min(sudoku.shape)
sudoku = cv2.resize(sudoku, (side_length, side_length))
cv2.imwrite('sudoku.jpg', sudoku)
Once boundingRect gives us x, y, w, h, we crop that part from the image and resize the crop to a square whose side is the shorter of its two sides. After all these operations we get the image shown below:
Step 6: Making Sudoku Straight
Now we observe that the sudoku is not straight. Let's make it straight.
contours, h = cv2.findContours(sudoku, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
Again we find all the contours using the same arguments as above, but this time on the cropped sudoku image.
contours = sorted(contours, key=cv2.contourArea, reverse=True)
Here we have sorted the contours in descending order of area.
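largest = None
# Check only the five biggest contours; slicing is safe on shorter lists.
for cnt in contours[:5]:
    print("Length of approx(cnt) : " + str(len(approx(cnt))))
    if len(approx(cnt)) == 4:
        print("Condition becomes True")
        largest = cnt
        break  # contours are sorted by area, so the first match is the biggest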
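In the above code we have used the approx function, which is defined below.
def approx(cnt):
    # Approximate the contour by a polygon; epsilon is 1% of the perimeter.
    peri = cv2.arcLength(cnt, True)
    app = cv2.approxPolyDP(cnt, 0.01 * peri, True)
    return app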
This function uses the concept of contour approximation: it approximates a contour shape by another shape with fewer vertices, depending on the precision we specify.
Notice that I have used approxPolyDP, which takes 3 arguments. The first is cnt, the contour points. The second, 0.01 * peri, is epsilon, the accuracy parameter: the maximum distance allowed between the original contour and its approximation. A wise selection of epsilon is needed to get the correct output. The third argument is True to indicate that the curve is closed. So the function returns an approximated curve that deviates from the contour by at most 1% of its arc length.
Now back to the previous code where we used this approx function. We loop over the sorted contours, pass each one to approx, and check whether the approximation has exactly 4 points. If it does, the largest variable is set to that contour.
So at this point we have the largest four-cornered contour in the sudoku image.
Next we will use 2 functions, order_points and four_points_transform. They are explained below, and after that we proceed using them.
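import numpy as np

def order_points(pts):
    # Flatten the (4, 1, 2) output of approxPolyDP into four (x, y) rows.
    pts = pts.reshape(4, 2)
    rect = np.zeros((4, 2), dtype = "float32")
    s = pts.sum(axis = 1)
    rect[0] = pts[np.argmin(s)]     # top-left: smallest x + y
    rect[2] = pts[np.argmax(s)]     # bottom-right: largest x + y
    diff = np.diff(pts, axis = 1)   # y - x for each point
    rect[1] = pts[np.argmin(diff)]  # top-right: smallest y - x
    rect[3] = pts[np.argmax(diff)]  # bottom-left: largest y - x
    return rect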
This function returns the four corner points of the largest sudoku contour in a well-defined order.
Why this is useful will become clear later, but first let us understand what happens inside the function.
First we initialize an array of coordinates that will be ordered such that
- the first entry is top-left
- the second entry is top-right
- the third is bottom-right
- the fourth is bottom-left.
In the ordered coordinates:
- the top-left point has the smallest sum x + y
- the bottom-right point has the largest sum x + y
- the top-right point has the smallest difference y - x
- the bottom-left point has the largest difference y - x
After computing all of these we return the ordered points.
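To make the sum and difference rules concrete, here is a small worked example on four made-up corner coordinates (illustrative numbers, not taken from the photo). The function is repeated so that the snippet runs on its own.

```python
import numpy as np

def order_points(pts):
    # Same logic as described above: sums pick top-left and bottom-right,
    # differences (y - x) pick top-right and bottom-left.
    pts = pts.reshape(4, 2)
    rect = np.zeros((4, 2), dtype="float32")
    s = pts.sum(axis=1)
    rect[0] = pts[np.argmin(s)]
    rect[2] = pts[np.argmax(s)]
    diff = np.diff(pts, axis=1)
    rect[1] = pts[np.argmin(diff)]
    rect[3] = pts[np.argmax(diff)]
    return rect

# Corners in arbitrary order, shaped (4, 1, 2) like approxPolyDP output.
#   point       x + y    y - x
#   (210,198)    408      -12    -> largest sum: bottom-right
#   (10, 12)      22        2    -> smallest sum: top-left
#   (6, 202)     208      196    -> largest difference: bottom-left
#   (205, 8)     213     -197    -> smallest difference: top-right
shuffled = np.array([[[210, 198]], [[10, 12]], [[6, 202]], [[205, 8]]])

ordered = order_points(shuffled)
print(ordered.tolist())
```

The rule assumes the grid is only mildly rotated. For a grid turned by close to 45 degrees, two corners can have nearly equal sums or differences and the ordering becomes ambiguous.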
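Now let's talk about the 2nd function, four_points_transform:
def four_points_transform(image, rect):
    (tl, tr, br, bl) = rect
    # Euclidean lengths of the bottom and top edges
    widthBottom = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
    widthTop = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
    maxWidth = max(int(widthBottom), int(widthTop))
    # Euclidean lengths of the right and left edges
    heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
    heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
    maxHeight = max(int(heightA), int(heightB))
    dst = np.array([
        [0, 0],
        [maxWidth - 1, 0],
        [maxWidth - 1, maxHeight - 1],
        [0, maxHeight - 1]], dtype = 'float32')
    M = cv2.getPerspectiveTransform(rect, dst)
    warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))
    return warped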
As you can see, this function takes 2 arguments: image (the skewed sudoku image from the previous step) and rect (the ordered points).
- From rect we get 4 points, corresponding to (Top-Left, Top-Right, Bottom-Right, Bottom-Left).
- We then find the bottom width and top width from these points using the Euclidean distance, as shown in the code, and take the maximum of the two. The same procedure is applied to the height.
- From maxWidth and maxHeight we build an array of 4 destination points: [0,0], [maxWidth-1, 0], [maxWidth-1, maxHeight-1], [0, maxHeight-1].
- Then we compute the transformation matrix that maps the ordered points to the destination points (cv2.getPerspectiveTransform).
- We apply this transformation to the image (cv2.warpPerspective).
- Finally, we return the transformed image.
Now we are at the final steps of image processing:
if largest is not None:
    app = approx(largest)
    print("App: " + str(len(app)))
    corners = order_points(app)
    sudoku = four_points_transform(sudoku, corners)
    print("Done Straighten !!")
In the above code we get the 4 approximated points using the approx function explained above. We then pass these points to the order_points function to get the ordered points, which I named corners. Finally we pass them to four_points_transform to get the straightened sudoku shown below:
In this part of the blog we have extracted the sudoku from the camera image. In the next part we will extract the individual cells from the image and get the number corresponding to each cell.