Skin Segmentation and Dominant Tone/Color Extraction
Hello World!!! Have you ever looked at your skin and wondered why there are different shades in different parts of the body? Well go ponder and come back to this article.
Welcome back!!! Our goal today is to find a way to extract “skin” from an image and find its color/tone. So, what we are going to do is image segmentation and color extraction.
The direct inspiration for this project comes from me reading up on color segmentation with OpenCV while biting my nails. “I am a nail biter and I am proud” (Spongebob tone). So I thought, why not apply some machine learning and avenge the skin I bit off?
So these are the findings of my research:
- OpenCV is an awesome library for image processing tasks
- Color Segmentation can be done using thresholding in different color spaces
- Clustering is an awesome way of grouping unlabeled data
Today we will be learning to use OpenCV to segment the skin and to use scikit-learn to perform K-Means clustering to find the dominant skin color.
I’m writing this article under the assumption that you know basic Python and understand OpenCV. Even so, we will cover a high-level understanding of K-Means and a few methods of OpenCV. We will also discuss different color spaces.
I believe that code for everything exists on the internet. ‘Why?’ you ask? Well, due to the wonderful open source community and basic humanity. It’s all a matter of finding it. Having said that, I am not fond of copy-pasting code unless I understand it. This article is focused on explaining the concept behind the code.
You can find the link to the Google Colab notebook containing the code for this project at the end of the article.
Interested in reading further? Let’s get into it then!!!
“OpenCV (Open Source Computer Vision) is a library of programming functions mainly aimed at real-time computer vision” — https://en.wikipedia.org/wiki/OpenCV
Under the assumption that you know what a “library” means in the field of computer science, let’s break that definition. The key focus is on the word “Computer Vision”. So what is Computer Vision?
“Computer vision is concerned with the automatic extraction, analysis, and understanding of useful information from a single image or a sequence of images. It involves the development of a theoretical and algorithmic basis to achieve automatic visual understanding.” — http://www.bmva.org/visionoverview
Well, thus far that is the best explanation I have come across for Computer Vision: short, but with depth. On a high level, we understand that some processing is done on the image by an algorithm to extract and analyze information.
Computer Vision, being a sub-field of Artificial Intelligence, encompasses applications from simple object recognition to production-level robotics. One example use case that everyone can relate to is Facebook suggesting to tag yourself or your friends when you upload a photo.
So let’s use the high-level explanations and make a plain-English definition for OpenCV:
“OpenCV is a Computer Vision library that helps us extract/analyze data from images or video using image processing techniques”.
I will let your curious mind find more about image processing and these so-called image processing techniques. Spoiler Alert! If you have a photo editing app on your laptop or mobile, you are already an image processor ;)
What’s Sci-Kit Learn?
“Scikit-learn (formerly scikits.learn) is a free software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN” — https://en.wikipedia.org/wiki/Scikit-learn
Actually, if you are in the machine learning space, scikit-learn may not be alien to you. You have probably come across “sklearn” countless times since you entered the space.
I am under the assumption that you have a basic understanding of machine learning. For the sake of a wholesome post, let’s recap its definition:
Machine learning is a sub-field of Artificial Intelligence whose focus is to provide a computer/system with algorithms to learn automatically from data without being explicitly programmed. It basically means allowing the computer to make a prediction or decision without hard-coded “if else” statements.
Machine learning algorithms include:
- Supervised learning
- Unsupervised learning
- Reinforcement learning
I’ll be covering machine learning in a different article; for now, let’s get back to scikit-learn.
So let’s define scikit-learn as a library that offers a rich suite of simple, clean APIs for the whole machine learning pipeline, from loading and manipulating data to making predictions.
What is K-Means clustering?
K-Means clustering is a type of unsupervised learning algorithm. The fundamental idea of a clustering algorithm is that it finds groups of data points with similar features in a given dataset. K-Means is one such clustering algorithm, and it groups the data into “K” groups/clusters.
How does K-Means clustering work, you ask?
Well, the process is actually simple:
- Choose the number of clusters (K)
- Randomly place K data points (the initial centroids)
- Assign each point in the dataset to the closest centroid (this is normally done by finding the Euclidean distance of a point from each centroid)
- Recompute each centroid by taking the mean of the data points in its cluster.
- Repeat steps 3 and 4 until the centroids stop moving.
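The steps above can be sketched directly in NumPy (a toy implementation just to make the loop concrete — for the actual project we will let scikit-learn do this for us):

```python
import numpy as np

def kmeans(points, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Steps 1 & 2: choose K and place the initial centroids at random data points
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # Step 3: assign each point to the closest centroid (Euclidean distance)
        dists = np.linalg.norm(points[:, None] - centroids[None, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 4: recompute each centroid as the mean of the points in its cluster
        new_centroids = np.array(
            [points[labels == i].mean(axis=0) for i in range(k)]
        )
        # Step 5: stop once the centroids no longer move
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
```

Feed it two well-separated blobs of 2D points with k=2 and the two centroids land on the blob centers.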
Got the gist? No? See if the two GIFs below help you understand, and if you still don’t get it, watch this video.
Do you feel like slapping your screen on your face? Well, take a deep breath and calm down. Recollect everything, because we are going to get into the code.
Well, now that we have understood the concepts relevant to the task, let’s get coding!!
We will be using Google Colab. If you haven’t checked out Google Colab before, head over here and check it out. After you have done that, read “Google Colab Free GPU Tutorial” by fuat; you will be glad you did (+1 brownie/unicorn points for me).
People with a Python background will have noticed that Google Colab looks awfully similar to Jupyter Notebook. Well, it does :P
Getting back to the topic: as always, when I start a project I like to break down the process. So let’s break this one down too:
1. Read the image — this can be done using OpenCV.
2. Segment out the skin from the image — this can also be done using OpenCV.
3. Find the dominant colors — this is the main goal! We will be using the K-Means clustering algorithm with the help of the scikit-learn Python package.
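Step 3 is the heart of the project, so here is a minimal sketch of what it will look like (the function name and the choice of k are my own; the notebook has the full version, including the color bar):

```python
import numpy as np
from sklearn.cluster import KMeans

def dominant_colors(pixels, k=3):
    """Cluster an (N, 3) array of color pixels into k colors and
    return the cluster centers sorted by cluster size (biggest first)."""
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(pixels)
    counts = np.bincount(km.labels_, minlength=k)
    order = counts.argsort()[::-1]   # most populous cluster first
    return km.cluster_centers_[order], counts[order]
```

In the real pipeline we would pass in only the pixels kept by the skin mask, so the biggest cluster center is the dominant skin tone.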
So we’ve got a three-step process. (I vaguely remember reading that an action should ideally be completed in three steps for a rich, full UX. Well, that is off-topic….. Sigh, got distracted again…)
A few things to know before you actually understand the code.
1. OpenCV reads color images in the “BGR” color space.
Well yes, that’s the reverse of the “RGB” color space. If you are a web designer, or a designer in general, RGB isn’t new to you. RGB is known as an “additive color space”. Why, you ask? Let’s rewind back to when we were kids. Your mother or father, or even your teacher, might have taught you that if you mix red and blue you get purple/magenta. Does that trigger your memory of the “three primary colors”?
So why does OpenCV not use the popular color space? Well, the answer is “history”. I’m not kidding: the developers chose the BGR color space because, apparently, at the time of development BGR was the color space widely used by camera manufacturers and software vendors. You can read more about it in a post by LearnOpenCV found here.
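In practice this just means the channel order is reversed, so a BGR-to-RGB conversion is simply a flip of the last axis (cv2.cvtColor(img, cv2.COLOR_BGR2RGB) achieves the same thing):

```python
import numpy as np

# A 1x1 image holding a single "pure red" pixel as OpenCV stores it: (Blue, Green, Red)
bgr = np.array([[[0, 0, 255]]], dtype=np.uint8)

# Reversing the channel axis gives the familiar (Red, Green, Blue) order
rgb = bgr[:, :, ::-1]
print(rgb[0, 0])   # the red channel now comes first
```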
2. Skin segmentation is done using thresholding in the HSV color space.
HSV (Hue, Saturation, Value) is a model used to represent RGB colors in alignment with human perception. The Hue denotes the dominant wavelength of the color, the Saturation denotes the shade of the color, and the Value indicates the intensity (brightness) of the color.
Thresholding is the process of creating a binary image by filtering out pixels based on a defined threshold. In simple terms: take each pixel of the image; if that pixel is within the range of the “threshold values”, make it white; if not, make it black. In our context, the threshold values will be HSV values denoting the range of “skin color”. The HSV ranges can be obtained with a bit of trial and error. You can learn more about this from this article, also by LearnOpenCV.
We will be using the cv2.inRange() method to do the thresholding and cv2.bitwise_and() to apply the resulting binary mask back to the original image.
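Conceptually, these two calls are quite simple; here is a NumPy sketch of what they do (the notebook uses the actual OpenCV functions, and any lower/upper bounds you pass in would be the skin-range HSV values you found by trial and error):

```python
import numpy as np

def in_range(img, lower, upper):
    """NumPy equivalent of cv2.inRange(): a pixel becomes 255 (white) when
    every one of its channels lies inside [lower, upper], else 0 (black)."""
    mask = np.all((img >= lower) & (img <= upper), axis=-1)
    return (mask * 255).astype(np.uint8)

def apply_mask(img, mask):
    """NumPy equivalent of cv2.bitwise_and(img, img, mask=mask):
    keep the pixels where the mask is white, black out the rest."""
    return np.where(mask[..., None] == 255, img, 0).astype(img.dtype)
```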
And just a few other points to know about the libraries to be used:
- OpenCV uses Numpy arrays as data types.
- OpenCV cannot handle transparent images by default. It treats transparent pixels as black.
- Since cv2.imshow() does not work in Google Colab, we will be using matplotlib’s pyplot “imshow” method to display images.
- We will be using the awesome “imutils” library by Adrian Rosebrock, author of PyImageSearch.com.
Alright, now that that’s out of the way, let’s read through the code below before you check out the Google Colab notebook.
The final output of the code will be the color information and a color bar like the one below.
Got something out of it? Well, why don’t you try it out? Here is the link to my Google Colab notebook with a detailed description of the code. Make a copy of it and play around with it yourself. You can also clone my repo and try it out locally.
Well, that brings us to the end of the article. If you have any questions or comments, please leave them below. Please do share, and give a clap here on Medium if you found it useful.
Before I fist-bump and leave, I would like to share a few resources you can read up on to explore the world of OpenCV.
- PyImageSearch — Be awesome at OpenCV, Python, deep learning, and computer vision
- Learn OpenCV ( C++ / Python )
- OpenCV 101: A Practical Guide to the Open Computer Vision Library by Matt Rever
- OpenCV with Python for Image and Video Analysis by sentdex
Alright then, “fist bump”, see you in my next post :D