iOS Image Face Centering Using Apple’s Vision Framework
In this tutorial, you’ll learn how to utilize Apple’s Vision Framework in your iOS app to achieve a better user experience by detecting and centering faces in your images.
Problem
During the development of multiple Ancestry app features, our team repeatedly ran into the same issue: we had to display images of varying aspect ratios within an ImageView whose aspect ratio was fixed.
The image below illustrates the problem this creates. On the left, you can see the initial uncropped image we get from our server. In the middle, you can see what happens if we try to display that image within an ImageView that has a fixed aspect ratio. The boundaries of the ImageView are marked in red.
This ImageView uses the aspectFill content mode, so it crops the top and bottom of our vertical image. The user only sees the content inside the red border. As a result, our users don’t see the faces of the people we’re trying to show them. This significantly hurts the experience of our product. Not good.
So how can we fix this? Wouldn’t it be nice to somehow identify the face position in our image and then crop the image accordingly? The desired result is illustrated on the right in green.
Okay, great… But how can we achieve this desired behavior? The good news is that with iOS 11 Apple released a new Vision Framework that has face detection capabilities built-in!
Solution
Let’s utilize Apple’s Vision Framework to detect and center faces in our images. First of all, we should understand what the Vision API has to offer in our case:
VNImageRequestHandler allows you to process image analysis requests by calling its perform(_:) function and passing an array of VNRequest objects.
VNDetectFaceRectanglesRequest is one of those requests. This specific request finds faces and their coordinates within an image. Exactly what we need in our case.
VNFaceObservation is the type of observation that results from a VNDetectFaceRectanglesRequest. This object contains the necessary facial-feature information.
Step One. Detecting the facial-feature information.
Let’s start by getting a VNFaceObservation for each of the faces in our image, without worrying about the image cropping logic yet. We are going to add the following extension to CGImage:
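The original snippet isn’t reproduced here, so below is a minimal sketch of what such an extension might look like. The FaceCropResult enum name and the exact callback shape are assumptions; the .success, .notFound, and .failure cases mirror the walkthrough that follows.

```swift
import Vision
import CoreGraphics

// Hypothetical result type; the case names follow the walkthrough below.
enum FaceCropResult {
    case success(CGImage)
    case notFound
    case failure(Error)
}

@available(iOS 11.0, *)
extension CGImage {
    func faceCrop(completion: @escaping (FaceCropResult) -> Void) {
        // 1. Create the face-detection request.
        let request = VNDetectFaceRectanglesRequest { request, error in
            // 2. Propagate any detection error.
            if let error = error {
                completion(.failure(error))
                return
            }
            // 3. Bail out if no faces were found.
            guard let results = request.results as? [VNFaceObservation],
                  !results.isEmpty else {
                completion(.notFound)
                return
            }
            // 4. Collect the observations; we’ll need them in Step Two.
            var faces: [VNFaceObservation] = []
            for face in results {
                faces.append(face)
            }
            // 5. For now, just report the count and return the original image.
            print("Found \(faces.count) face(s)")
            completion(.success(self))
        }
        // 6. Run the request with an image request handler.
        do {
            try VNImageRequestHandler(cgImage: self, options: [:]).perform([request])
        } catch {
            completion(.failure(error))
        }
    }
}
```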
Note: The Vision API is only available on iOS 11 and later. As you can see in the code snippet, we have to mark our functions with @available(iOS 11.0, *).
Here’s what’s going on in the above code:
- First of all, we create a VNDetectFaceRectanglesRequest.
- If for some reason our request returns an error, we return .failure.
- If no faces were found, we return a .notFound result.
- Now we can finally iterate through all of the results and add them to our faces array. We’ll need this array in Step Two.
- For now, let’s just print the total number of faces we found and return the original uncropped image in our .success.
- Finally, we call our VNDetectFaceRectanglesRequest using the perform(_:) function of VNImageRequestHandler.
Step Two. Cropping the initial image.
Now let’s add some more logic to crop the initial image based on the facial-feature information we got from Step One:
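The cropping snippet itself isn’t reproduced here; the sketch below follows the walkthrough that comes next. The helper name cropToFaces and the total…/avg… variable names are illustrative, not necessarily the original ones.

```swift
import Vision
import CoreGraphics

@available(iOS 11.0, *)
extension CGImage {
    // Computes a crop rect that centers the detected face(s),
    // then crops the image. Returns nil if cropping fails.
    func cropToFaces(_ faces: [VNFaceObservation], margin: CGFloat = 200) -> CGImage? {
        // Running totals; needed when more than one face was detected.
        var totalX: CGFloat = 0
        var totalY: CGFloat = 0
        var totalW: CGFloat = 0
        var totalH: CGFloat = 0
        // Track the farthest bottom-left face.
        var minX = CGFloat.greatestFiniteMagnitude
        var minY = CGFloat.greatestFiniteMagnitude

        for face in faces {
            // boundingBox values are normalized (0...1); scale to pixels.
            let w = face.boundingBox.width * CGFloat(width)
            let h = face.boundingBox.height * CGFloat(height)
            let x = face.boundingBox.origin.x * CGFloat(width)
            // Convert Vision’s bottom-left origin to the flipped
            // (top-left origin) space that CGRect cropping expects.
            let y = (1 - face.boundingBox.origin.y) * CGFloat(height) - h

            totalX += x
            totalY += y
            totalW += w
            totalH += h
            minX = min(minX, x)
            minY = min(minY, y)
        }

        // Average position and size across all detected faces.
        let count = CGFloat(faces.count)
        let avgX = totalX / count
        let avgY = totalY / count
        let avgW = totalW / count
        let avgH = totalH / count

        // Extra space around the face(s): distance from the average face
        // to the left-most face, plus the caller-supplied margin.
        let offset = avgX - minX + margin

        let rect = CGRect(x: avgX - offset,
                          y: avgY - offset,
                          width: avgW + offset * 2,
                          height: avgH + offset * 2)
        return cropping(to: rect)
    }
}
```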
Let’s walk through the above code:
- The responsibility of this function is to return a CGRect that tells us how to crop our initial image.
- Let’s look into how this function works. First of all, we initialize our total... variables. These are important for cases when more than one face was detected in our image. The minX and minY variables keep track of the farthest bottom-left face in our image. We’ll need those coordinates later.
- We then iterate through all of the faces we were able to identify.
- Why do we multiply face.boundingBox.width by CGFloat(width)? face.boundingBox.width returns a number between 0 and 1 that represents the face width as a proportion of the full width of the image. We multiply that number by CGFloat(width) to get the absolute width of the face. The same applies to face.boundingBox.height and face.boundingBox.origin.x.
- The coordinate space transformation might also seem a bit confusing. First, we compute 1 - face.boundingBox.origin.y to get the relative y position from the top instead of the bottom of our image. We then multiply that number by CGFloat(height) and subtract h to correctly represent the absolute y coordinate in the flipped coordinate space for our CGRect initialization later on. In the flipped coordinate space, the origin is in the upper-left corner and the rectangle extends toward the lower-right corner.
- We calculate the average width, height, x, and y coordinates of the faces by dividing our total... variables by the number of faces in the image.
- This line might seem a bit tricky. Here we calculate how much extra space we want to add around the face(s) that were found. For that, we calculate the distance between our avgX (which represents the average x coordinate of all faces in the photo) and minX (the left-most face coordinate). We then add a custom margin parameter that we’ll talk about later in this article.
- We can now use our offset and averages to create a CGRect that represents how we need to crop our initial image.
- Now let’s use that CGRect to crop the initial image.
- Return .notFound if cropping(to rect: CGRect) returned nil.
- Return .success with our new face-centered image!
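To make the coordinate conversion concrete, here is an isolated sketch with hypothetical numbers. The helper name absoluteRect is not from the original code; it simply applies the normalization and flip described above.

```swift
import CoreGraphics

// Vision reports boundingBox in a normalized (0...1), bottom-left-origin space.
// Convert it to absolute pixels in a flipped (top-left-origin) space.
func absoluteRect(for boundingBox: CGRect, imageWidth: Int, imageHeight: Int) -> CGRect {
    let w = boundingBox.width * CGFloat(imageWidth)
    let h = boundingBox.height * CGFloat(imageHeight)
    let x = boundingBox.origin.x * CGFloat(imageWidth)
    let y = (1 - boundingBox.origin.y) * CGFloat(imageHeight) - h
    return CGRect(x: x, y: y, width: w, height: h)
}

// A face occupying the top-right quarter of a 1000x800 image:
let box = CGRect(x: 0.5, y: 0.5, width: 0.5, height: 0.5)
let rect = absoluteRect(for: box, imageWidth: 1000, imageHeight: 800)
// rect is (500, 0, 500, 400) in the flipped space: the top-right quarter.
```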
Let’s talk more about the mysterious margin parameter in our faceCrop(margin: CGFloat = 200, ...) method. This parameter specifies the amount of extra space to keep on each side of the face before we return the final cropped image. As you can see in the table below, this extra margin prevents us from returning an image containing just the face of the person:
Implementation
Now we can go to our ViewController and take advantage of this new extension:
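The ViewController snippet isn’t reproduced here, so below is a minimal sketch of how it might look, assuming the faceCrop() extension and the result cases mentioned above; the display(_:) method and outlet names are illustrative.

```swift
import UIKit

class ViewController: UIViewController {
    @IBOutlet private weak var imageView: UIImageView!

    func display(_ image: UIImage) {
        // 1. Get the CGImage; fall back to the uncropped image if that fails.
        guard let cgImage = image.cgImage else {
            imageView.image = image
            return
        }
        // 2. Run face cropping off the main thread at the highest QoS.
        DispatchQueue.global(qos: .userInteractive).async {
            cgImage.faceCrop { result in
                DispatchQueue.main.async {
                    switch result {
                    case .success(let cropped):
                        // 3. Display the new face-centered image.
                        self.imageView.image = UIImage(cgImage: cropped)
                    case .notFound, .failure:
                        // 4. Fall back to the original uncropped image.
                        self.imageView.image = image
                    }
                }
            }
        }
    }
}
```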
There are a few things to note here:
- First of all, we get the CGImage from our UIImage. If that doesn’t work, we just display our original uncropped image.
- We then switch to a global queue with the highest .userInteractive priority to call our faceCrop() method.
- If the face cropping worked successfully, we switch back to the main queue and display the new cropped image.
- If our faceCrop() method didn’t find any faces or failed to crop the image for some other reason, we just display the original uncropped image.
That’s it. We have successfully detected and centered faces in our image!
Conclusion
Our team uses face centering in both the Ancestry and AncestryDNA iOS apps. This approach has helped us significantly improve the quality of the images we display to our users. The best part is, we were able to achieve it in fewer than 100 lines of code!
We hope this article will inspire you to utilize this approach and improve the user experience of your app as well. Feel free to try this code in your project and please let us know what you think!
Where to Go From Here?
Consider visiting our FaceCrop repo on GitHub.
You can also check out this WWDC 2019 video for a deeper dive into VNImageRequestHandler and VNFaceObservation functionality.
Finally, you can always take advantage of Apple’s documentation to learn more about other Vision API features.
Big thanks to Anastasios Grigoriou for his contribution to this project.
If you’re interested in joining Ancestry, we’re hiring! Feel free to check out our careers page for more info.