IRIS Segmentation Mediapipe Python

AiPhile
5 min read · Jan 23, 2022


Demo Video

Let's look into iris segmentation. Well, to be honest, it is not really segmentation: you only get four iris landmarks from Mediapipe, but we can turn those landmarks into a segmentation mask as well.

CodeBase:

You will find all the source code in the GitHub repository; here I am going to explain a few code snippets.

Requirement:

You need Python installed on your machine. The other requirements are OpenCV and NumPy, but they come bundled with Mediapipe as dependencies when you install it through pip (the Python package manager), so there is no need to install them manually.

Installation

In case you have already installed Mediapipe and its version is lower than 0.8.9.1, upgrade to the latest version:

pip install --upgrade mediapipe 
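If you are installing Mediapipe from scratch, a plain pip install works as well, and pip can confirm which version you ended up with (these are standard pip commands, not specific to this post):

pip install mediapipe
pip show mediapipe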

Face Landmarks

Mediapipe provides 478 landmarks of the face; you can find more details about Face Mesh here. We are going to focus on the iris landmarks only. Since we are going to store all the landmarks in a NumPy array, you can access any of them by passing a list of indices. (Note that the ten iris landmarks, indices 468 to 477, are only produced when refine_landmarks is enabled.) Here are the lists of iris landmarks, extracted using the Face Mesh points map:

LEFT_IRIS = [474, 475, 476, 477]
RIGHT_IRIS = [469, 470, 471, 472]

For the eyes (indices)

# Left eye indices list
LEFT_EYE = [362, 382, 381, 380, 374, 373, 390, 249, 263, 466, 388, 387, 386, 385, 384, 398]
# Right eye indices list
RIGHT_EYE = [33, 7, 163, 144, 145, 153, 154, 155, 133, 173, 157, 158, 159, 160, 161, 246]
Eye landmarks

Coding Part

Module imports

import mediapipe as mp
import cv2 as cv
import numpy as np
mp_face_mesh = mp.solutions.face_mesh

Model configuration

max_num_faces: the maximum number of faces to detect.

refine_landmarks: refines the landmarks around the eyes and lips, and adds the additional landmarks for the irises, which are not available in earlier models.

min_detection_confidence: (0.0, 1.0) minimum detection confidence for the face detection model.

min_tracking_confidence: (0.0, 1.0) minimum confidence for the landmark tracker model.

Loading the Face Mesh model:

with mp_face_mesh.FaceMesh(
    max_num_faces=1,
    refine_landmarks=True,
    min_detection_confidence=0.6,
    min_tracking_confidence=0.6
) as face_mesh:
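All the per-frame snippets below live inside a capture loop that the post does not show explicitly; here is a minimal sketch of the surrounding webcam loop (the camera index 0, the window name, and the Esc key to quit are my assumptions):

cap = cv.VideoCapture(0)  # default webcam; change the index for another camera
with mp_face_mesh.FaceMesh(
    max_num_faces=1,
    refine_landmarks=True,
    min_detection_confidence=0.6,
    min_tracking_confidence=0.6
) as face_mesh:
    while True:
        ret, frame = cap.read()  # grab one frame from the camera
        if not ret:
            break
        # ... per-frame processing from the snippets below goes here ...
        cv.imshow('frame', frame)
        if cv.waitKey(1) & 0xFF == 27:  # Esc key exits the loop
            break
cap.release()
cv.destroyAllWindows()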

Because we are going to run this in real time, I am going to call the image a frame, which makes more sense here. First we need to flip the camera frame to get a mirror image, using a function from OpenCV. And since Mediapipe needs the RGB colour format but OpenCV uses BGR, we also need to convert the colour with the cvtColor function:

frame = cv.flip(frame, 1)
rgb_frame = cv.cvtColor(frame, cv.COLOR_BGR2RGB)

When the RGB frame is processed by the Face Mesh model, it returns 478 landmarks for each detected face. Every landmark has x, y and z values, each normalized between 0 and 1, so we need to multiply them by the corresponding scale to get pixel coordinates in the frame.

For x, the scale is the width of the image; for y, the height; for z, the same as x, the width. For example, x = 0.5 on a 640-pixel-wide frame maps to pixel column 320.

results = face_mesh.process(rgb_frame)
# getting width and height of frame
img_h, img_w = frame.shape[:2]

Iterating through landmarks

When we process the RGB frame, we get the landmarks of each detected face in the results variable, under results.multi_face_landmarks. This holds the landmarks of all detected faces, and you can loop through them, but since I am detecting a single face I am going to index it directly, with results.multi_face_landmarks[0], where each landmark is formatted like this:

landmark {
x: 0.6233813166618347
y: 0.7154796719551086
z: -0.0638529509305954
}
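One practical note (my addition): when no face is visible in the frame, results.multi_face_landmarks is None, so it is worth guarding the access before indexing into it:

if results.multi_face_landmarks:
    face_landmarks = results.multi_face_landmarks[0]
    # ... landmark processing goes here ...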

Now you get the landmarks of the face just by calling results.multi_face_landmarks[0].landmark; it holds the normalized values for [x, y, z]. If you print its type you will get:

<class 'google.protobuf.pyext._message.RepeatedCompositeContainer'>

When you loop through results.multi_face_landmarks[0].landmark you get the x, y and z of each landmark:

[print(p.x, p.y, p.z) for p in results.multi_face_landmarks[0].landmark]

But these are still normalized values, so you need to multiply each one by the proper scale to get pixel coordinates: [x*img_w, y*img_h, z*img_w]. Here we just need x and y, and I am going to use NumPy's multiply function to achieve that. Don't forget to convert the results to integers, since OpenCV accepts pixel coordinates as int. Here is a simple one-liner which does the job for us; in the end all the landmarks are stored in a NumPy array (mesh_points), so it is easier to access them by passing a list of indices:

mesh_points = np.array([np.multiply([p.x, p.y], [img_w, img_h]).astype(int) for p in results.multi_face_landmarks[0].landmark])
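As a quick sanity check (my addition), the resulting array should contain one (x, y) pair per landmark; with refine_landmarks=True that is 478 rows:

print(mesh_points.shape)  # expected: (478, 2)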

Now we can draw the iris landmarks using OpenCV's polylines function; we already have the lists of iris indices, so we use them to pull out the iris coordinates:

cv.polylines(frame, [mesh_points[LEFT_IRIS]], True, (0,255,0), 1, cv.LINE_AA)
cv.polylines(frame, [mesh_points[RIGHT_IRIS]], True, (0,255,0), 1, cv.LINE_AA)

It will look something like this.

Iris landmarks
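As a side note, the LEFT_EYE and RIGHT_EYE index lists defined earlier can be drawn in exactly the same way if you also want the eye contours (this snippet is my addition, following the same polylines pattern):

cv.polylines(frame, [mesh_points[LEFT_EYE]], True, (0,255,0), 1, cv.LINE_AA)
cv.polylines(frame, [mesh_points[RIGHT_EYE]], True, (0,255,0), 1, cv.LINE_AA)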

But we can turn these square shapes into circles, since OpenCV provides a function that finds the enclosing circle of a set of points, named minEnclosingCircle, which returns the centre (x, y) and the radius of the circle. ⚠ The return values are floating-point, so we have to turn them into int.

(l_cx, l_cy), l_radius = cv.minEnclosingCircle(mesh_points[LEFT_IRIS])
(r_cx, r_cy), r_radius = cv.minEnclosingCircle(mesh_points[RIGHT_IRIS])
# turn center points into np array
center_left = np.array([l_cx, l_cy], dtype=np.int32)
center_right = np.array([r_cx, r_cy], dtype=np.int32)

Finally, draw the circles based on the values returned by the minEnclosingCircle function, using the circle function, which draws a circle on the image given the centre (x, y) and the radius:

cv.circle(frame, center_left, int(l_radius), (255,0,255), 2, cv.LINE_AA)
cv.circle(frame, center_right, int(r_radius), (255,0,255), 2, cv.LINE_AA)
Circles drawn on the irises

Finally, getting the segmentation mask is quite simple: you just create an empty mask (image) using NumPy's zeros function, with the same dimensions as the frame, and once you draw white circles on it, you have the segmentation mask.

Creating the mask, using the width and height of the frame:

mask = np.zeros((img_h, img_w), dtype=np.uint8)

Drawing white circles on the mask:

cv.circle(mask, center_left, int(l_radius), (255,255,255), -1, cv.LINE_AA)
cv.circle(mask, center_right, int(r_radius), (255,255,255), -1, cv.LINE_AA)
Results with the segmented mask

Since you now have an iris mask, you can replace the irises with any coloured iris image to create different Instagram-style filters, or build an eye-controlled cursor (pointer) 🖱.
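For example, to pull just the iris pixels out of the frame using the mask, a minimal sketch with OpenCV's bitwise_and (the variable name iris_pixels is my own):

# keep only the pixels of the frame where the mask is white
iris_pixels = cv.bitwise_and(frame, frame, mask=mask)
cv.imshow('iris pixels', iris_pixels)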

I have a complete video tutorial on YouTube; you can check that out as well if you want, the link is in the references.

Since I am a newbie at writing, you will probably find mistakes; please let me know if you do, and I will be happy to fix them. Thank you so much.

Here is the video tutorial:

Reference
