Let's look into iris segmentation. Well, it is not really segmentation, to be honest: you only get four iris landmarks from MediaPipe, but we can turn those landmarks into a segmentation mask as well.
Codebase:
You will find all the source code in the GitHub repository; here I am going to explain a few code snippets.
Requirements:
You need Python installed on your machine. The other requirements are OpenCV and NumPy, but they come packaged as dependencies when you install MediaPipe through pip (the Python package manager), so there is no need to install them manually.
Installation
In case you have already installed MediaPipe and its version is older than 0.8.9.1, upgrade to the latest version:
pip install --upgrade mediapipe
Face Landmarks
MediaPipe provides 478 landmarks of the face; you can find more details about Face Mesh here. We are going to focus on the iris landmarks only. Since we are going to store all the landmarks in a NumPy array, you can access them by passing a list of indices. Here are the lists of iris landmarks extracted using the Face Mesh point map:
LEFT_IRIS = [474, 475, 476, 477]
RIGHT_IRIS = [469, 470, 471, 472]
For Eyes (indices)
# Left eye indices list
LEFT_EYE = [362, 382, 381, 380, 374, 373, 390, 249, 263, 466, 388, 387, 386, 385, 384, 398]
# Right eye indices list
RIGHT_EYE = [33, 7, 163, 144, 145, 153, 154, 155, 133, 173, 157, 158, 159, 160, 161, 246]
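A quick sanity check on the lists above: as I understand the refined Face Mesh topology, the model outputs 478 points and the 10 extra iris points occupy indices 468 to 477 (468 and 473 are the iris centres, which is why they are absent from the four-point boundary lists):

```python
LEFT_IRIS = [474, 475, 476, 477]
RIGHT_IRIS = [469, 470, 471, 472]
LEFT_EYE = [362, 382, 381, 380, 374, 373, 390, 249, 263, 466, 388, 387, 386, 385, 384, 398]
RIGHT_EYE = [33, 7, 163, 144, 145, 153, 154, 155, 133, 173, 157, 158, 159, 160, 161, 246]

# Every iris index must fall in the 468-477 range added by refine_landmarks,
# and each eye contour list has 16 points.
assert all(468 <= i <= 477 for i in LEFT_IRIS + RIGHT_IRIS)
assert len(LEFT_EYE) == len(RIGHT_EYE) == 16
```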
Coding Part
Module imports
import mediapipe as mp
import cv2 as cv
import numpy as np

mp_face_mesh = mp.solutions.face_mesh
Model configuration
max_num_faces: maximum number of faces to detect.
refine_landmarks: refines the landmarks for the eyes and lips, and adds the additional iris landmarks that are not available in the previous model.
min_detection_confidence: [0.0, 1.0] minimum confidence for the face detection model.
min_tracking_confidence: [0.0, 1.0] minimum confidence for the landmark tracker model.
Loading the Face Mesh model:
with mp_face_mesh.FaceMesh(
max_num_faces=1,
refine_landmarks=True,
min_detection_confidence=0.6,
min_tracking_confidence=0.6
) as face_mesh:
Because we are going to run this in real time, I am going to call the image a frame, which makes more sense here. First we need to flip the camera frame to a mirror image using a function from OpenCV. Since MediaPipe needs the RGB colour format but OpenCV uses BGR, we also need to convert the colours, here with the cvtColor function.
frame = cv.flip(frame, 1)
rgb_frame = cv.cvtColor(frame, cv.COLOR_BGR2RGB)
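As a side note, the same two operations can be reproduced with NumPy slicing alone, which is handy for testing the preprocessing without a camera or OpenCV installed. This is just a sketch assuming a standard H×W×3 BGR frame; the helper name is mine:

```python
import numpy as np

def preprocess(frame):
    """Mirror the frame horizontally (like cv.flip(frame, 1)) and reverse
    the channel order (like cv.cvtColor(frame, cv.COLOR_BGR2RGB)).
    Assumes an H x W x 3 BGR uint8 array."""
    mirrored = frame[:, ::-1]    # flip left-right
    return mirrored[:, :, ::-1]  # BGR -> RGB

# Tiny 1x2 BGR frame with two made-up pixels
frame = np.array([[[1, 2, 3], [4, 5, 6]]], dtype=np.uint8)
rgb = preprocess(frame)
```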
When the RGB frame is processed by the Face Mesh model, it returns 478 landmarks for each detected face. Every landmark has x, y and z values, each between 0 and 1 (in other words, normalized values), so we need to multiply them by the corresponding scale to get pixel coordinates in the frame:
for x the scale is the width of the image, for y it is the height, and for z it is the same as x, the width.
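The scaling rule can be wrapped in a tiny helper; the function name is my own, not part of MediaPipe:

```python
def normalized_to_pixel(x, y, z, img_w, img_h):
    """Scale MediaPipe's normalized landmark values to pixel units:
    x and z scale with the image width, y with the height. OpenCV wants
    integer pixel coordinates, hence the int() casts on x and y (z is a
    relative depth, not a drawable coordinate, so it stays a float)."""
    return int(x * img_w), int(y * img_h), z * img_w
```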
results = face_mesh.process(rgb_frame)
# getting width and height of frame
img_h, img_w = frame.shape[:2]
Iterating through landmarks
When we process the RGB frame, we get the landmarks of each detected face; they are stored in the results variable, so we access them as results.multi_face_landmarks. This holds the landmarks of all the faces, and you can loop through them; since I am detecting a single face, I am going to provide an index here, results.multi_face_landmarks[0], which is formatted like this:
landmark {
x: 0.6233813166618347
y: 0.7154796719551086
z: -0.0638529509305954
}
Now you can get the landmarks of the face just by calling results.multi_face_landmarks[0].landmark, which holds the normalized [x, y, z] values. If you print its type you will get:
<class 'google.protobuf.pyext._message.RepeatedCompositeContainer'>
When you loop through results.multi_face_landmarks[0].landmark you get x, y and z for each landmark.
[print(p.x, p.y, p.z) for p in results.multi_face_landmarks[0].landmark]
But these are still normalized values, so you need to multiply each one by the proper scale to get pixel coordinates: [x*img_w, y*img_h, z*img_w]. Here we just need x and y, and I am going to use NumPy's multiply function to achieve that. Don't forget to convert the results to integers, since OpenCV accepts pixel coordinates as int. Here is a simple one-liner which does the job for us; in the end I have stored all the landmarks in the NumPy array (mesh_points) so it is easier to access them by passing a list of indices.
mesh_points=np.array([np.multiply([p.x, p.y], [img_w, img_h]).astype(int) for p in results.multi_face_landmarks[0].landmark])
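Because mesh_points is a plain (478, 2) integer array, a list of indices such as LEFT_IRIS selects all four iris rows at once via NumPy fancy indexing. A small demonstration with a synthetic array (the coordinate values are made up for illustration):

```python
import numpy as np

# Synthetic stand-in for mesh_points: 478 (x, y) points with dummy values.
mesh_points = np.arange(478 * 2).reshape(478, 2)

LEFT_IRIS = [474, 475, 476, 477]

# Indexing with a list of row indices returns a (4, 2) sub-array,
# the shape that cv.polylines / cv.minEnclosingCircle expect.
iris_pts = mesh_points[LEFT_IRIS]
```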
Now we can draw the iris landmarks using the OpenCV function polylines; we already have the lists of iris indices, so we use them to get the iris coordinates.
cv.polylines(frame, [mesh_points[LEFT_IRIS]], True, (0, 255, 0), 1, cv.LINE_AA)
cv.polylines(frame, [mesh_points[RIGHT_IRIS]], True, (0, 255, 0), 1, cv.LINE_AA)
It will look something like this.
But we can turn these square shapes into circles, since OpenCV provides a function that computes the minimum enclosing circle of the provided points, named minEnclosingCircle, which returns the centre (x, y) and the radius of the circle. ⚠ The return values are floating-point, so we have to turn them into ints.
(l_cx, l_cy), l_radius = cv.minEnclosingCircle(mesh_points[LEFT_IRIS])
(r_cx, r_cy), r_radius = cv.minEnclosingCircle(mesh_points[RIGHT_IRIS])
# turn center points into np array
center_left = np.array([l_cx, l_cy], dtype=np.int32)
center_right = np.array([r_cx, r_cy], dtype=np.int32)
Finally, we draw the circles based on the values returned from the minEnclosingCircle function, using the circle function, which draws a circle on the image given a centre (x, y) and a radius.
cv.circle(frame, center_left, int(l_radius), (255,0,255), 2, cv.LINE_AA)
cv.circle(frame, center_right, int(r_radius), (255,0,255), 2, cv.LINE_AA)
Finally, getting the segmentation mask is quite simple: you just create an empty mask (image) using NumPy's zeros function, with the same dimensions as the frame, and once you draw white circles on that mask, you have the segmentation mask.
Creating a mask using the image dimensions, the width and height of the frame:
mask = np.zeros((img_h, img_w), dtype=np.uint8)
Drawing white circles on the mask:
cv.circle(mask, center_left, int(l_radius), (255, 255, 255), -1, cv.LINE_AA)
cv.circle(mask, center_right, int(r_radius), (255, 255, 255), -1, cv.LINE_AA)
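Once you have the mask you can already use it with plain NumPy: nonzero mask pixels are the iris region, so you can black out everything else or recolour just that region. A hedged sketch with a toy frame (the array shapes and colour values are made up for illustration):

```python
import numpy as np

# Toy 4x4 "frame" and a mask with a small white (255) region.
frame = np.full((4, 4, 3), 100, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 255  # pretend this square is the iris circle

# Keep only the masked pixels; everything outside the mask goes black.
iris_only = frame * (mask[:, :, None] > 0)

# Or recolour the masked region, e.g. magenta irises for a filter effect.
recoloured = frame.copy()
recoloured[mask > 0] = (255, 0, 255)
```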
Since you have got an iris mask, you can replace the irises with any coloured iris image to create different Instagram-style filters, or build an eye-controlled cursor (pointer) 🖱.
I have a complete video tutorial on YouTube; you can check that out as well if you want, link in the references.
Since I am a newbie at writing, you may well find mistakes; please let me know if you do, and I will be happy to fix them. Thank you so much.
Here is the video Tutorial: