Computer Vision
Building a simple lane detection iOS app using OpenCV
Have you ever wanted to build an app that adds stickers to a face? Or maybe an app that can read text on boards for visually impaired users?
Apps with features such as those mentioned above use some form of computer vision algorithm: a piece of code that tries to make sense of what the iOS device is able to see.
There are frameworks and libraries out there that can achieve face detection or text extraction in a few lines of code, without needing to go into the details of how they achieve it. However, in some cases the features offered by those frameworks and libraries might not satisfy your needs.
In cases where you need to implement your own computer vision algorithm, the most popular tool to help you achieve your goal is OpenCV.
OpenCV is an open source library that contains functions aimed at real-time computer vision.
In this post I will show you how to use OpenCV in an iOS app. We will create an iOS app that detects the road lane in which the user is driving. The computer vision techniques themselves are out of scope for this post; instead, we will learn how to consume OpenCV, which is a C++ library, from within our Swift code inside an iOS app.
The computer vision algorithm we will use is based on Kemal Ficici’s Hackster.io project. I have ported the Python computer vision algorithm from Kemal’s post to C++ and will be providing it to you in this post.
Getting started
In this section we will cover the steps to build an iOS app that contains a view controller which will display the back camera feed of the iOS device and overlay any road lane on top of the camera feed on the screen.
To achieve that we will:
- Create SimpleLaneDetection app project
- Process frames from the back camera
- Import OpenCV
- Insert lane detector algorithm into the project
- Consume lane detector algorithm from Swift
- Display lane detection results
Create SimpleLaneDetection project
Let’s start by creating a new Xcode project. Open Xcode and then from the menu select File > New > Project… Next, select the Single View App template and then click on Next.
Name the project SimpleLaneDetection and then click Next. Finally store the project wherever convenient for you and then click Finish.
The Single View App template creates an app with a single blank screen ready to run.
Process frames from the back camera
In this section we will show the feed from the back camera of our iOS devices on the screen.
In the previous step, when we created the project from the template, the template included a single blank screen named ViewController. Inside ViewController we will process the camera feed.
Let’s open ViewController.swift. We first need access to the code that will allow us to use the camera. We will make use of the AVFoundation framework to do so. Add the following line in ViewController, after import UIKit:
import AVFoundation
AVFoundation is a framework by Apple, already included within iOS, that allows us to communicate with the device’s camera. The steps below will leverage classes included within the AVFoundation framework; these classes are usually prefixed with AV.
Next we will need to create an instance of AVCaptureSession, which will coordinate inputs, such as the camera and/or microphone, into outputs such as video, frames or still image capture. Let’s create a property to hold an instance of AVCaptureSession in our ViewController:
import UIKit
import AVFoundation

class ViewController: UIViewController {

    private var captureSession: AVCaptureSession = AVCaptureSession()
    ...
Next let’s add the back camera of our iOS device as an input to our capture session. Add the following function to our ViewController:
private func addCameraInput() {
    guard let device = AVCaptureDevice.DiscoverySession(
        deviceTypes: [.builtInWideAngleCamera, .builtInDualCamera, .builtInTrueDepthCamera],
        mediaType: .video,
        position: .back).devices.first else {
            fatalError("No back camera device found, please make sure to run SimpleLaneDetection in an iOS device and not a simulator")
    }
    let cameraInput = try! AVCaptureDeviceInput(device: device)
    self.captureSession.addInput(cameraInput)
}
Note: we won’t be able to run our app on iOS simulators; they don’t have access to cameras.
Let’s call our addCameraInput() function from the viewDidLoad() function:
override func viewDidLoad() {
super.viewDidLoad()
self.addCameraInput() // add this line
}
Access to the camera requires user permission. I won’t delve into managing permissions; in this tutorial we assume that access to the camera will always be granted by the user. However, we still need to let the operating system know that we need access to the camera. Open Info.plist and add a new key, NSCameraUsageDescription, with the String value Required for detecting road lanes. As you finish entering the key, Xcode will automatically replace NSCameraUsageDescription with Privacy - Camera Usage Description. Your Info.plist should now contain this entry.
We now have access to the camera. Next let’s access each image frame from the camera stream.
To access frames in real time we have to create an instance of the AVCaptureVideoDataOutput class. Furthermore, we have to tell it to delegate the camera frames to our ViewController, where we will process them. But before we can do that, our ViewController must be able to receive those frames. Let’s make our ViewController conform to the AVCaptureVideoDataOutputSampleBufferDelegate protocol:
class ViewController: UIViewController, AVCaptureVideoDataOutputSampleBufferDelegate {
...
}
Next let’s add the function that will receive the frames to our ViewController:
func captureOutput(
    _ output: AVCaptureOutput,
    didOutput sampleBuffer: CMSampleBuffer,
    from connection: AVCaptureConnection) {
    // here we can process the frame
    print("did receive frame")
}
Now our ViewController is ready to receive and process frames. Let’s create an instance of AVCaptureVideoDataOutput, which will output the video frames from the capture session to wherever we want to process them. At the top of the ViewController declare the following property:
private let videoDataOutput = AVCaptureVideoDataOutput()
Let’s create a function where we will configure the videoDataOutput. We will tell it where to send the frames from the camera and where to get them from: the capture session. Add the following function to the ViewController.
private func getFrames() {
    videoDataOutput.videoSettings = [(kCVPixelBufferPixelFormatTypeKey as NSString): NSNumber(value: kCVPixelFormatType_32BGRA)] as [String: Any]
    videoDataOutput.alwaysDiscardsLateVideoFrames = true
    videoDataOutput.setSampleBufferDelegate(self, queue: DispatchQueue(label: "camera.frame.processing.queue"))
    self.captureSession.addOutput(videoDataOutput)
    guard let connection = self.videoDataOutput.connection(with: AVMediaType.video),
        connection.isVideoOrientationSupported else { return }
    connection.videoOrientation = .portrait
}
On the third line we tell the video output who to deliver the frames to by setting the sampleBufferDelegate. In this case self is the instance of the ViewController. Additionally, we tell the videoDataOutput that we want to process the frames on a new queue. If you aren’t familiar with DispatchQueues, just think of them as workers. The main worker is responsible for managing the user interface; adding any intensive task to the main queue can lead to a slow app or, worse, your app could crash. So it’s a good idea to process frames on another queue.
Let’s now register the output with the capture session by calling our getFrames() function at the end of the viewDidLoad() function.
Lastly, let’s verify that we do receive frames. At the end of viewDidLoad, insert self.captureSession.startRunning() to start coordinating the inputs and outputs that we previously configured. viewDidLoad should look like the code below:
override func viewDidLoad() {
super.viewDidLoad()
self.addCameraInput()
self.getFrames()
self.captureSession.startRunning()
}
Run the app on a device. Watch the console (View > Debug Area > Show Debug Area); you should see “did receive frame” printed out continuously whilst the app is running.
Now we are able to receive and process frames from the camera feed.
Import OpenCV to the project
In the previous section we enabled our app to receive and process frames from the back camera of an iOS device. Next we need to detect the road lane on the frame. However the computer vision algorithm to do lane detection requires OpenCV. Therefore in this section we will first fetch and install OpenCV in our iOS app.
Let’s download OpenCV 3.4.5. Once downloaded, let’s import it into our SimpleLaneDetection app target. Drag and drop opencv2.framework into the project.
Once opencv2.framework is dropped into the project, Xcode will prompt a window with the options for adding opencv2.framework. For Destination, check Copy items if needed. For the Added folders option, select Create groups. For the Add to targets option, check the SimpleLaneDetection target. Click on Finish.
These options will copy opencv2.framework into our project and link the framework to our app.
You should find opencv2.framework in Linked Frameworks and Libraries under the General tab of the SimpleLaneDetection app target configuration.
Insert lane detection algorithm
Let’s add the code to detect where the lane is in the image frame.
Let’s add C++ header and implementation files to our app. Don’t worry, you don’t need to have C++ knowledge; the C++ computer vision algorithm will be provided.
From the menu click on File > New > File… Next, search for and select the C++ File template.
Click Next, name it LaneDetector and check Also create header file.
Finally click Next and then Create. Xcode will then prompt you with some options to configure the app to use multiple languages. Click on the Create Bridging Header option.
The bridging header file is important as it will allow us to consume our lane detector algorithm by allowing different languages to talk to each other. For now, know that it will be needed later on; we will revisit the bridging header later in this post.
Let’s open LaneDetector.hpp and copy and paste the code below:
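For reference, here is a minimal sketch of what LaneDetector.hpp could declare. The class name LaneDetector and the method name detect_lane are assumptions inferred from how the bridge consumes the detector later on; the original gist may differ:

```cpp
// LaneDetector.hpp
// Sketch of the lane detector interface; detect_lane is an assumed name.
#ifndef LaneDetector_hpp
#define LaneDetector_hpp

#include <opencv2/opencv.hpp>

class LaneDetector {
public:
    // Takes a camera frame and returns the same frame with the
    // detected lane overlayed on top of it.
    cv::Mat detect_lane(cv::Mat image);
};

#endif /* LaneDetector_hpp */
```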
Next open LaneDetector.cpp and copy and paste the code below:
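For reference, here is a hedged sketch of what LaneDetector.cpp could contain, following the straight-lane pipeline from Kemal Ficici’s project: edge detection, a region-of-interest mask, a Hough transform, and blending the detected segments back onto the frame. The thresholds and the region-of-interest polygon below are illustrative assumptions, not the original gist’s values:

```cpp
// LaneDetector.cpp
// Sketch of a straight-lane detection pipeline in the spirit of
// Kemal Ficici's algorithm. Thresholds and the region-of-interest
// polygon are illustrative assumptions, not the original values.
#include "LaneDetector.hpp"

cv::Mat LaneDetector::detect_lane(cv::Mat image) {
    // 1. Edge detection on a grayscale, blurred copy of the frame.
    // UIImageToMat (used later in the bridge) yields RGBA data.
    cv::Mat gray, edges;
    cv::cvtColor(image, gray, cv::COLOR_RGBA2GRAY);
    cv::GaussianBlur(gray, gray, cv::Size(5, 5), 0);
    cv::Canny(gray, edges, 50, 150);

    // 2. Keep only a triangular region of interest in front of the car
    cv::Mat mask = cv::Mat::zeros(edges.size(), edges.type());
    std::vector<std::vector<cv::Point>> roi {{
        cv::Point(0, image.rows),
        cv::Point(image.cols / 2, image.rows / 2),
        cv::Point(image.cols, image.rows)
    }};
    cv::fillPoly(mask, roi, cv::Scalar(255));
    cv::bitwise_and(edges, mask, edges);

    // 3. Find line segments with the probabilistic Hough transform
    std::vector<cv::Vec4i> lines;
    cv::HoughLinesP(edges, lines, 2, CV_PI / 180, 100, 40, 5);

    // 4. Draw the segments on a blank overlay and blend with the frame
    cv::Mat overlay = cv::Mat::zeros(image.size(), image.type());
    for (const cv::Vec4i &l : lines) {
        cv::line(overlay, cv::Point(l[0], l[1]), cv::Point(l[2], l[3]),
                 cv::Scalar(0, 255, 0, 255), 5);
    }
    cv::Mat result;
    cv::addWeighted(image, 0.8, overlay, 1.0, 0.0, result);
    return result;
}
```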
Consume lane detection algorithm using Swift
In the previous section we added the lane detector algorithm. The lane detector algorithm overlays the road lane on top of the camera feed and then returns the combined image. However, we haven’t yet consumed that code, so let’s do just that in this section.
Our Swift code is not able to consume C++ code (at least not at the time of writing). However Objective-C is. Furthermore we can consume Objective-C code through Swift. So let’s create Objective-C code to bridge between Swift and C++.
Start by adding a new header file to the project. Select File > New > File… and then select Header file from the iOS template.
Next name it LaneDetectorBridge. Copy and paste the code below into LaneDetectorBridge.h:
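For reference, a sketch of what LaneDetectorBridge.h could look like; the method name detectLaneIn: is inferred from the Swift call site detectLane(in:) used later in the post:

```objectivec
// LaneDetectorBridge.h
// Sketch of the bridging interface between Swift and the C++ detector.
#import <UIKit/UIKit.h>

@interface LaneDetectorBridge : NSObject

// Takes a camera frame and returns it with the detected lane overlayed.
- (UIImage *)detectLaneIn:(UIImage *)image;

@end
```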
Here we are basically declaring a single method in our LaneDetectorBridge class which will take a UIImage instance and return a UIImage instance with the lane overlayed.
Next create an Objective-C file that will implement the LaneDetectorBridge interface. Select File > New > File… and then select Objective-C File from the iOS template. Name it LaneDetectorBridge.
Once created, edit the file name of the recently created LaneDetectorBridge.m and add an extra m. Your file should be named LaneDetectorBridge.mm.
The extra m tells Xcode that this is an Objective-C++ file. LaneDetectorBridge is now allowed to use C++ from within.
Next let’s add the code to bridge Swift to our C++ algorithm and back. Copy and paste the code below into LaneDetectorBridge.mm:
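For reference, a hedged sketch of what LaneDetectorBridge.mm could contain. It relies on OpenCV’s UIImageToMat and MatToUIImage helpers from opencv2/imgcodecs/ios.h; the LaneDetector class and its detect_lane method are the assumed C++ interface from earlier:

```objectivec
// LaneDetectorBridge.mm (Objective-C++)
// OpenCV headers must come before any Apple headers to avoid
// macro clashes, hence the import order below.
#import <opencv2/opencv.hpp>
#import <opencv2/imgcodecs/ios.h>
#import "LaneDetectorBridge.h"
#include "LaneDetector.hpp"

@implementation LaneDetectorBridge

- (UIImage *)detectLaneIn:(UIImage *)image {
    // Convert the UIImage into OpenCV's matrix representation
    cv::Mat mat;
    UIImageToMat(image, mat);

    // Run the C++ lane detection algorithm
    LaneDetector laneDetector;
    cv::Mat overlayed = laneDetector.detect_lane(mat);

    // Convert the result back into a UIImage for display
    return MatToUIImage(overlayed);
}

@end
```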
LaneDetectorBridge converts UIImages into OpenCV’s image representation. Then it runs lane detection, which returns an image with the lane overlayed on top of it. Finally, it converts the OpenCV image representation back to a UIImage.
One more step before we can consume LaneDetectorBridge from our Swift code: we have to tell Xcode to make that class accessible to Swift. We do so by declaring the header files to be accessible in our bridging file. Open SimpleLaneDetection-Bridging-Header.h and add the following line:
#import "LaneDetectorBridge.h"
And lastly, we have to convert the frames coming from the camera stream into UIImages and then call our LaneDetectorBridge. Replace the contents of the captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) function in ViewController with the following code:
guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
CVPixelBufferLockBaseAddress(imageBuffer, CVPixelBufferLockFlags.readOnly)
let baseAddress = CVPixelBufferGetBaseAddress(imageBuffer)
let bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer)
let width = CVPixelBufferGetWidth(imageBuffer)
let height = CVPixelBufferGetHeight(imageBuffer)
let colorSpace = CGColorSpaceCreateDeviceRGB()
var bitmapInfo: UInt32 = CGBitmapInfo.byteOrder32Little.rawValue
bitmapInfo |= CGImageAlphaInfo.premultipliedFirst.rawValue & CGBitmapInfo.alphaInfoMask.rawValue
let context = CGContext(data: baseAddress, width: width, height: height, bitsPerComponent: 8, bytesPerRow: bytesPerRow, space: colorSpace, bitmapInfo: bitmapInfo)
guard let quartzImage = context?.makeImage() else { return }
CVPixelBufferUnlockBaseAddress(imageBuffer, CVPixelBufferLockFlags.readOnly)
let image = UIImage(cgImage: quartzImage)
The code above converts the camera frame bitmap into a UIImage.
We are finally ready to call our LaneDetectorBridge. Add the following line at the end of the captureOutput function:
let imageWithLaneOverlay = LaneDetectorBridge().detectLane(in: image)
Display lane detection results
In the previous section we started processing the images coming from the back camera of an iOS device. The next step is to display those processed images with the lanes overlayed. For that, let’s add a UIImageView to our ViewController where we will display such images on the screen for the user to view.
Open Main.storyboard
. Click on the library button located on the toolbar.
Once the object library is open, search for UIImageView.
Next drag and drop an Image View into the blank canvas in Main.storyboard.
Once the UIImageView is placed on the canvas, hold the control ⌃ key and then drag from the UIImageView to a blank area of the canvas.
Notice the UIImageView itself will not move. However, once you let go of the mouse, a layout pop up menu will appear.
In the layout pop up menu we are able to set up layout constraints on the UIImageView relative to the canvas holding this view. Holding the command ⌘ key, select Center Horizontally in Safe Area, Center Vertically in Safe Area and Equal Heights. This will make the UIImageView cover the height of the screen whilst being centred in it. As for the width, we will make the UIImageView automatically resize, respecting the aspect ratio of the image contained within it.
Select the UIImageView and then open the attributes inspector (View > Inspectors > Show Attributes Inspector).
In the attributes inspector, set Aspect Fit for the Content Mode option.
Let’s create a reference to the UIImageView so we can set its image from our ViewController programmatically. Open the assistant editor (View > Assistant Editor > Show Assistant Editor). Next, holding the control ⌃ key, drag and drop the UIImageView from Main.storyboard to the inside of the ViewController class.
Once you let go, a new pop up will appear with options to configure the reference you are creating in your ViewController class for the UIImageView.
Name the reference imageView and then click on Connect.
The last step is to set the image of the UIImageView to the one outputted by the lane detector algorithm. At the end of the captureOutput method in ViewController, add:
DispatchQueue.main.async {
self.imageView.image = imageWithLaneOverlay
}
If you recall, in the Process frames from the back camera section we told the video output that we wanted to process the frames on a queue other than the main queue, the one in charge of handling the user interface. By setting the image on the image view displayed to the user we are updating the user interface; therefore we have to tell the main worker, the main queue, to do so.
And that’s all 🎉! Run the app, point the camera at a road lane and see it in action!
Summary
In this post we have learnt how to use OpenCV to process images and then display the results back.
We learnt that consuming C++ code from Swift is not so straightforward. Swift can’t talk to C++ directly. However, Swift can talk to Objective-C, and Objective-C can talk to C++ using a special bridging language between them called Objective-C++. On the outside, Objective-C++ looks like regular Objective-C; on the inside, however, Objective-C++ is able to call C++ code.
Final notes
The chosen computer vision algorithm for this post is untested. Furthermore Kemal Ficici also offers a curved lane detection algorithm which I will attempt to convert to C++ in a future post.
You can find the full source code for this post here.
If you liked this post please don’t forget to clap. Stay tuned for more posts on iOS development! Follow me on Twitter or Medium!