Computer Vision
Building a simple lane detection iOS app using OpenCV
Have you ever wanted to build an app that adds stickers to a face? Or maybe an app that can read text on boards for visually impaired users?
Apps with features such as those mentioned above use some form of computer vision algorithm: a piece of code that tries to make sense of what the iOS device is able to see.
There are frameworks and libraries out there that can achieve face detection or text extraction in a few lines of code, without needing to go into the details of how they achieve it. However, in some cases the features offered by those frameworks and libraries might not satisfy your needs.
In cases where you need to implement your own computer vision algorithm, the most popular tool to help you achieve your goal is OpenCV.
OpenCV is an open source library that contains functions aimed at real-time computer vision.
In this post I will show you how to use OpenCV in an iOS app. We will create an iOS app that detects the road lane in which the user is driving. The computer vision techniques themselves are out of scope for this post; instead, we will learn how to consume OpenCV, which is a C++ library, from within our Swift code inside an iOS app.
The computer vision algorithm we will use is based on Kemal Ficici’s Hackster.io project. I have ported the Python computer vision algorithm from Kemal’s post to C++ and will be providing it to you in this post.
Getting started
In this section we will cover the steps to build an iOS app that contains a view controller which will display the back camera feed of the iOS device and overlay any road lane on top of the camera feed on the screen.
To achieve that we will:
- Create SimpleLaneDetection app project
- Process frames from the back camera
- Import OpenCV
- Insert lane detector algorithm into the project
- Consume lane detector algorithm from Swift
- Display lane detection results
Create SimpleLaneDetection project
Let’s start by creating a new Xcode project. Open Xcode and then from the menu select File > New > Project… Next, select the Single View App template and then click on Next.
Name the project SimpleLaneDetection and then click Next. Finally store the project wherever convenient for you and then click Finish.
The Single View App template creates an app with a single blank screen ready to run.
Process frames from the back camera
In this section we will show the feed from the back camera of our iOS devices on the screen.
In the previous step, when we created the project from the template, the template included a single blank screen named ViewController. Inside ViewController we will process the camera feed.
Let’s open ViewController.swift. We first need access to the code that will allow us to use the camera. We will make use of the AVFoundation framework to do so. Add the following line in ViewController, after import UIKit:
import AVFoundation
AVFoundation is a framework by Apple, already included within iOS, that allows us to communicate with the device’s camera. The steps below will leverage classes included within the AVFoundation framework; these classes are usually prefixed with AV.
Next we will need to create an instance of AVCaptureSession, which will coordinate inputs, such as the camera and/or microphone, into outputs such as video, frames or still image capture. Let’s create a property to hold an instance of AVCaptureSession in our ViewController:
import UIKit
import AVFoundation

class ViewController: UIViewController {

    private var captureSession: AVCaptureSession = AVCaptureSession()
    ...
Next let’s add the back camera of our iOS device as an input to our capture session. Add the following function to our ViewController:
private func addCameraInput() {
    guard let device = AVCaptureDevice.DiscoverySession(
        deviceTypes: [.builtInWideAngleCamera, .builtInDualCamera, .builtInTrueDepthCamera],
        mediaType: .video,
        position: .back).devices.first else {
            fatalError("No back camera device found, please make sure to run SimpleLaneDetection in an iOS device and not a simulator")
    }
    let cameraInput = try! AVCaptureDeviceInput(device: device)
    self.captureSession.addInput(cameraInput)
}
Note: we won’t be able to run our app on iOS simulators; they don’t have access to cameras.
Let’s call our addCameraInput() function from the viewDidLoad() function:
override func viewDidLoad() {
super.viewDidLoad()
self.addCameraInput() // add this line
}
Access to the camera requires user permission. I won’t delve into managing permissions; in this tutorial we assume that access to the camera will always be granted by the user. However, we still need to let the operating system know that we need access to the camera. Open Info.plist and add a new key, NSCameraUsageDescription, with the String value Required for detecting road lanes. As you finish entering the key, Xcode will automatically replace NSCameraUsageDescription with Privacy - Camera Usage Description. Your Info.plist should now contain this entry.
We now have access to the camera. Next let’s access each image frame from the camera stream.
To access frames in real time we have to create an instance of the AVCaptureVideoDataOutput class. Furthermore, we have to tell it to delegate the camera frames to our ViewController, where we will process them. But before we can do that, our ViewController must be able to receive those frames. Let’s make our ViewController conform to the AVCaptureVideoDataOutputSampleBufferDelegate protocol:
class ViewController: UIViewController, AVCaptureVideoDataOutputSampleBufferDelegate {
...
}
Next let’s add the function that will receive the frames to our ViewController:
func captureOutput(
    _ output: AVCaptureOutput,
    didOutput sampleBuffer: CMSampleBuffer,
    from connection: AVCaptureConnection) {
    // here we can process the frame
    print("did receive frame")
}
Now our ViewController is ready to receive and process frames. Let’s create an instance of AVCaptureVideoDataOutput, which will output the video frames from the capture session to wherever we want to process them. At the top of the ViewController declare the following property:
private let videoDataOutput = AVCaptureVideoDataOutput()
Let’s create a function where we will configure the videoDataOutput. We will tell it where to send the frames from the camera and where to get them from: the capture session. Add the following function to the ViewController.
private func getFrames() {
    videoDataOutput.videoSettings = [(kCVPixelBufferPixelFormatTypeKey as NSString): NSNumber(value: kCVPixelFormatType_32BGRA)] as [String: Any]
    videoDataOutput.alwaysDiscardsLateVideoFrames = true
    videoDataOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: "camera.frame.processing.queue"))
    self.captureSession.addOutput(videoDataOutput)
    guard let connection = self.videoDataOutput.connection(with: AVMediaType.video),
        connection.isVideoOrientationSupported else { return }
    connection.videoOrientation = .portrait
}
On the third line we tell the video output who to deliver the frames to by setting the sampleBufferDelegate. In this case self is the instance of the ViewController. Additionally, we tell the videoDataOutput that we want to process the frames on a new queue. If you aren’t familiar with DispatchQueues, just think of them as workers. The main worker is responsible for managing the user interface; adding any intensive task to the main queue can lead to a slow app or, worse, your app could crash. So it’s a good idea to process frames on another queue.
Let’s now register the output with the capture session by calling our getFrames() function at the end of the viewDidLoad() function.
Lastly, let’s verify that we do receive frames. At the end of viewDidLoad, insert self.captureSession.startRunning() to start coordinating the inputs and outputs that we previously configured. viewDidLoad should look like the code below:
override func viewDidLoad() {
super.viewDidLoad()
self.addCameraInput()
self.getFrames()
self.captureSession.startRunning()
}
Run the app on a device. Watch the console (View > Debug Area > Show Debug Area); you should see “did receive frame” printed out continuously whilst the app is running.
Now we are able to receive and process frames from the camera feed.
Import OpenCV to the project
In the previous section we enabled our app to receive and process frames from the back camera of an iOS device. Next we need to detect the road lane on the frame. However the computer vision algorithm to do lane detection requires OpenCV. Therefore in this section we will first fetch and install OpenCV in our iOS app.
Let’s download OpenCV 3.4.5. Once downloaded, let’s import it into our SimpleLaneDetection app target. Drag and drop opencv2.framework into the project.
Once opencv2.framework is dropped into the project, Xcode will prompt a window with the options for adding opencv2.framework. For Destination, check Copy items if needed. For the Added folders option, select Create groups. For the Add to targets option, check the SimpleLaneDetection target. Click on Finish.
These options will copy opencv2.framework into our project and link the framework to our app.
You should find opencv2.framework in Linked Frameworks and Libraries under the General tab of the SimpleLaneDetection app target configuration.
Insert lane detection algorithm
Let’s add the code to detect where the lane is in the image frame.
Let’s add C++ header and implementation files to our app. Don’t worry, you don’t need to have C++ knowledge; the C++ computer vision algorithm will be provided.
From the menu click on File > New > File… Next, search for and select the C++ File template.
Click Next, name it LaneDetector and check Also create header file.
Finally click Next and then Create. Xcode will then prompt you with some options to configure the app to use multiple languages. Click on the Create Bridging Header option.
The bridging header file is important as it will allow us to consume our lane detector algorithm by allowing different languages to talk to each other. For now, know that it will be needed later on; we will revisit the bridging header later in this post.
Let’s open LaneDetector.hpp and copy and paste the code below:
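For reference, here is a minimal sketch of what LaneDetector.hpp could declare. The class name LaneDetector and the method name detect_lane are assumptions inferred from how the bridge consumes the detector later on; the original gist may differ:

```cpp
// LaneDetector.hpp
// Sketch of the lane detector interface; detect_lane is an assumed name.
#ifndef LaneDetector_hpp
#define LaneDetector_hpp

#include <opencv2/opencv.hpp>

class LaneDetector {
public:
    // Takes a camera frame and returns the same frame with the
    // detected lane overlayed on top of it.
    cv::Mat detect_lane(cv::Mat image);
};

#endif /* LaneDetector_hpp */
```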
Next open LaneDetector.cpp and copy and paste the code below:
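For reference, here is a hedged sketch of what LaneDetector.cpp could contain, following the straight-lane pipeline from Kemal Ficici’s project: edge detection, a region-of-interest mask, a Hough transform, and blending the detected segments back onto the frame. The thresholds and the region-of-interest polygon below are illustrative assumptions, not the original gist’s values:

```cpp
// LaneDetector.cpp
// Sketch of a straight-lane detection pipeline in the spirit of
// Kemal Ficici's algorithm. Thresholds and the region-of-interest
// polygon are illustrative assumptions, not the original values.
#include "LaneDetector.hpp"

cv::Mat LaneDetector::detect_lane(cv::Mat image) {
    // 1. Edge detection on a grayscale, blurred copy of the frame.
    // UIImageToMat (used later in the bridge) yields RGBA data.
    cv::Mat gray, edges;
    cv::cvtColor(image, gray, cv::COLOR_RGBA2GRAY);
    cv::GaussianBlur(gray, gray, cv::Size(5, 5), 0);
    cv::Canny(gray, edges, 50, 150);

    // 2. Keep only a triangular region of interest in front of the car
    cv::Mat mask = cv::Mat::zeros(edges.size(), edges.type());
    std::vector<std::vector<cv::Point>> roi {{
        cv::Point(0, image.rows),
        cv::Point(image.cols / 2, image.rows / 2),
        cv::Point(image.cols, image.rows)
    }};
    cv::fillPoly(mask, roi, cv::Scalar(255));
    cv::bitwise_and(edges, mask, edges);

    // 3. Find line segments with the probabilistic Hough transform
    std::vector<cv::Vec4i> lines;
    cv::HoughLinesP(edges, lines, 2, CV_PI / 180, 100, 40, 5);

    // 4. Draw the segments on a blank overlay and blend with the frame
    cv::Mat overlay = cv::Mat::zeros(image.size(), image.type());
    for (const cv::Vec4i &l : lines) {
        cv::line(overlay, cv::Point(l[0], l[1]), cv::Point(l[2], l[3]),
                 cv::Scalar(0, 255, 0, 255), 5);
    }
    cv::Mat result;
    cv::addWeighted(image, 0.8, overlay, 1.0, 0.0, result);
    return result;
}
```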
Consume lane detection algorithm using Swift
In the previous section we added the lane detector algorithm. The lane detector algorithm overlays the road lane on top of the camera feed and then returns the combined image. However, we haven’t yet consumed that code, so let’s do just that in this section.
Our Swift code is not able to consume C++ code (at least not at the time of writing). However Objective-C is. Furthermore we can consume Objective-C code through Swift. So let’s create Objective-C code to bridge between Swift and C++.
Start by adding a new header file to the project. Select File > New > File… and then select Header file from the iOS template.
Next name it LaneDetectorBridge. Copy and paste the code below into LaneDetectorBridge.h:
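For reference, a sketch of what LaneDetectorBridge.h could look like; the method name detectLaneIn: is inferred from the Swift call site detectLane(in:) used later in the post:

```objectivec
// LaneDetectorBridge.h
// Sketch of the bridging interface between Swift and the C++ detector.
#import <UIKit/UIKit.h>

@interface LaneDetectorBridge : NSObject

// Takes a camera frame and returns it with the detected lane overlayed.
- (UIImage *)detectLaneIn:(UIImage *)image;

@end
```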
Here we are basically declaring a single method in our LaneDetectorBridge class which will take a UIImage instance and return a UIImage instance with the lane overlayed.
Next create an Objective-C file that will implement the LaneDetectorBridge interface. Select File > New > File… and then select Objective-C File from the iOS template. Name it LaneDetectorBridge.
Once created, edit the file name of the recently created LaneDetectorBridge.m and add an extra m. Your file should be named LaneDetectorBridge.mm.
The extra m tells Xcode that this is an Objective-C++ file. LaneDetectorBridge is now allowed to use C++ from within.
Next let’s add the code to bridge Swift to our C++ algorithm and back. Copy and paste the code below into LaneDetectorBridge.mm:
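For reference, a hedged sketch of what LaneDetectorBridge.mm could contain. It relies on OpenCV’s UIImageToMat and MatToUIImage helpers from opencv2/imgcodecs/ios.h; the LaneDetector class and its detect_lane method are the assumed C++ interface from earlier:

```objectivec
// LaneDetectorBridge.mm (Objective-C++)
// OpenCV headers must come before any Apple headers to avoid
// macro clashes, hence the import order below.
#import <opencv2/opencv.hpp>
#import <opencv2/imgcodecs/ios.h>
#import "LaneDetectorBridge.h"
#include "LaneDetector.hpp"

@implementation LaneDetectorBridge

- (UIImage *)detectLaneIn:(UIImage *)image {
    // Convert the UIImage into OpenCV's matrix representation
    cv::Mat mat;
    UIImageToMat(image, mat);

    // Run the C++ lane detection algorithm
    LaneDetector laneDetector;
    cv::Mat overlayed = laneDetector.detect_lane(mat);

    // Convert the result back into a UIImage for display
    return MatToUIImage(overlayed);
}

@end
```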
LaneDetectorBridge converts UIImages into OpenCV’s image representation. Then it runs lane detection, which returns an image with the lane overlayed on top of it. Finally, it converts the OpenCV image representation back to a UIImage.
One more step before we can consume LaneDetectorBridge from our Swift code: we have to tell Xcode to make that class accessible to Swift. We do so by declaring the header files to be accessible in our bridging file. Open SimpleLaneDetection-Bridging-Header.h and add the following line:
#import "LaneDetectorBridge.h"
And lastly, we have to convert the frames coming from the camera stream into UIImages and then call our LaneDetectorBridge. Replace the contents of the captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) function in ViewController with the following code:
guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
CVPixelBufferLockBaseAddress(imageBuffer, CVPixelBufferLockFlags.readOnly)
let baseAddress = CVPixelBufferGetBaseAddress(imageBuffer)
let bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer)
let width = CVPixelBufferGetWidth(imageBuffer)
let height = CVPixelBufferGetHeight(imageBuffer)
let colorSpace = CGColorSpaceCreateDeviceRGB()
var bitmapInfo: UInt32 = CGBitmapInfo.byteOrder32Little.rawValue
bitmapInfo |= CGImageAlphaInfo.premultipliedFirst.rawValue & CGBitmapInfo.alphaInfoMask.rawValue
let context = CGContext(data: baseAddress, width: width, height: height, bitsPerComponent: 8, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo)
guard let quartzImage = context?.makeImage() else { return }
CVPixelBufferUnlockBaseAddress(imageBuffer, CVPixelBufferLockFlags.readOnly)
let image = UIImage(cgImage: quartzImage)
The code above converts the camera frame bitmap into a UIImage.
We are finally ready to call our LaneDetectorBridge. Add the following line at the end of the captureOutput function:
let imageWithLaneOverlay = LaneDetectorBridge().detectLane(in: image)
Display lane detection results
In the previous section we started processing the images coming from the back camera of an iOS device. The next step is to display those processed images with the lanes overlayed. For that, let’s add a UIImageView to our ViewController where we will display such images on the screen for the user to view.
Open Main.storyboard
. Click on the library button located on the toolbar.
Once the object library is open, search for UIImageView.
Next drag and drop an Image View into the blank canvas in Main.storyboard.
Once the UIImageView is placed on the canvas, hold the control ⌃ key and then drag from the UIImageView to a blank area of the canvas.
Notice the UIImageView itself will not move. However, once you let go of the mouse, a layout pop up menu will appear.
In the layout pop up menu we are able to set up layout constraints on the UIImageView relative to the canvas holding this view. Holding the command ⌘ key, select Center Horizontally in Safe Area, Center Vertically in Safe Area and Equal Heights. This will make the UIImageView cover the height of the screen whilst being centred in it. As for the width, we will make the UIImageView automatically resize, respecting the aspect ratio of the image contained within it.
Select the UIImageView and then open the attributes inspector (View > Inspectors > Show Attributes Inspector).
In the attributes inspector, set Aspect Fit for the Content Mode option.
Let’s create a reference to the UIImageView so we can set its image from our ViewController programmatically. Open the assistant editor (View > Assistant Editor > Show Assistant Editor). Next, holding the control ⌃ key, drag and drop the UIImageView from Main.storyboard to the inside of the ViewController class.
Once you let go, a new pop up will appear with options to configure the reference you are creating in your ViewController class for the UIImageView.
Name the reference imageView and then click on Connect.
The last step is to set the image of the UIImageView to the one outputted by the lane detector algorithm. At the end of the captureOutput method in ViewController, add:
DispatchQueue.main.async {
self.imageView.image = imageWithLaneOverlay
}
If you recall, in the Process frames from the back camera section we told the video output that we wanted to process the frames on a queue other than the main queue, the one in charge of handling the user interface. By setting the image on the image view displayed to the user we are updating the user interface; therefore we have to tell the main worker, the main queue, to do so.
And that’s all 🎉! Run the app, point the camera at a road lane and see it in action!
Summary
In this post we have learnt how to use OpenCV to process images and then display the results back.
We learnt that consuming C++ code from Swift is not so straightforward. Swift can’t talk to C++ directly. However, Swift can talk to Objective-C, and Objective-C can talk to C++ using a special bridging language between them called Objective-C++. On the outside, Objective-C++ looks like regular Objective-C; on the inside, however, Objective-C++ is able to call C++ code.
Final notes
The chosen computer vision algorithm for this post is untested. Furthermore Kemal Ficici also offers a curved lane detection algorithm which I will attempt to convert to C++ in a future post.
You can find the full source code for this post here.
If you liked this post please don’t forget to clap. Stay tuned for more posts on iOS development! Follow me on Twitter or Medium!