Most of the example project for this post, CoreCamera, is already done. You can check it out on GitHub.
This post marks the beginning of a series of posts that I am going to write on Core Image. So why Core Image? Working with graphics at a low level can be messy. The lower down you go, the further you’re straying from UIKit and Cocoa. Luckily for developers, Apple provides an abstraction/interface for interacting with lower level graphics processes. Core Image is one of these abstractions, and it is a library you will run across frequently when coding in Apple’s ecosystem. From editing static images to using filters on live video content, Core Image is versatile enough to handle a variety of use cases.
Before we get started, I just want to warn you that Core Image relies heavily on key-value coding (KVC), which can make exploring it somewhat confusing. This barrier pales in comparison to the awesomeness of the world you’ll be able to work with and manipulate, so I hope you stick with it.
Core Image Basics
Core Image is an image processing and analysis technology designed to provide near real-time processing for still and video images. It operates on image data types from the Core Graphics, Core Video, and Image I/O frameworks, using either a GPU or CPU rendering path. — Apple Docs
CIImage: The data format for working with images in Core Image. It’s the recipe book, the set of instructions for how to treat the image. CIFilters take CIImages as input and pass new CIImages back out. Since CIImages are evaluated lazily, the recipe isn’t followed until the image is put into a renderable format.
CIContext: The evaluation context for graphics processing and analysis in Core Image, rendering with Quartz 2D, Metal, or OpenGL. A CIContext is an immutable, thread-safe object, but the CIFilters used with it are not thread-safe.
CIDetector: The image processor that notices and catalogs distinguishing features in an image. For a face, that might be the eyes, ears, nose, etc. Or it could be a shape, like a box.
Core Image Kernels: A Core Image kernel is a small algorithm that is run on every single pixel of the destination image. The kernel takes the filter’s parameters and uses them to compute the output image, and every filter wraps at least one kernel. This high volume of executions is the reason GPUs are designed the way they are. Each execution might not seem like a lot, but with all the pixels being rendered at once, it becomes clear we need something specialized to handle them.
Warp Kernel: Warp kernels are designed for moving, deforming, and translating images. If you’ve read the earlier series of posts that I wrote on ARKit, we used matrix transformations to place the nodes in 3D space. Warp kernels perform these sorts of operations on the pixels.
Color Kernel: The color kernel is responsible for working with color and only color. It receives an argument that is a four-component vector. The components are red, green, blue, and alpha.
If that sounds familiar, it’s because these values are often used when making custom colors with UIColor.
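To make the CIImage, CIFilter, and CIContext roles described above a bit more concrete, here is a minimal sketch of the three working together. The solid red 64×64 input and the choice of the built-in `CISepiaTone` filter are my own assumptions for illustration, not something taken from the example project.

```swift
import CoreImage

// Assumption for this sketch: a solid red 64x64 image stands in for a real photo.
let input = CIImage(color: CIColor(red: 1, green: 0, blue: 0))
    .cropped(to: CGRect(x: 0, y: 0, width: 64, height: 64))

// Built-in filters are looked up by name and configured with string keys.
guard let sepia = CIFilter(name: "CISepiaTone") else {
    fatalError("CISepiaTone is not available")
}
sepia.setValue(input, forKey: kCIInputImageKey)
sepia.setValue(0.8, forKey: kCIInputIntensityKey)

// outputImage is still only a recipe; no pixels have been processed yet.
guard let output = sepia.outputImage else {
    fatalError("The filter produced no output image")
}

// Rendering through a CIContext is the step that actually follows the recipe.
let context = CIContext()
let rendered = context.createCGImage(output, from: output.extent)
```

Notice that nothing expensive happens until `createCGImage(_:from:)` is called. Up to that point we are only describing the work to be done.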
The Rendering Process
Low Level: Low-level processing tasks are computed by the kernel. A kernel routine must return a vector (vec4) that contains the result of mapping the source pixel to a destination pixel.
High Level: High-level processing tasks are executed with Objective-C. The further the code is from having a direct effect on the hardware, the higher its level of abstraction. For example, kernels perform operations per pixel, which is pretty close to the hardware, while a CIContext has a broad responsibility for operations and data, which is higher up the chain.
CIFilter - A custom filter in Core Image is a CIFilter for which you write a routine, called a kernel, that specifies the calculations to perform on each source image pixel.
OpenGL Shader - If you’re not familiar with shaders, they are small programs that are executed on a per-vertex or per-pixel basis during drawing operations (there are also geometry shaders, which operate on geometric primitives, but they will not be covered in this tutorial).
If the two definitions above look very similar, that’s because Apple uses a modified version of the OpenGL Shading Language (GLSL) to write filter kernels. In OpenGL, you’ll often find that textures are used to give a 3D object some substance. For instance, you can have a sphere and then cover it with a metallic texture to make it seem like a metal ball. In Core Image, the image plays a role similar to the texture: the kernel samples it, and you apply the kernel through the CIFilter data type.
In image processing, a filter can be thought of as an algorithm that takes a number of inputs, including an image, and produces an output image. A filter is a wrapper around a kernel, which is the algorithm applied in turn to every single pixel in the image.
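The filter-wraps-a-kernel idea can be sketched in plain Swift without touching Core Image at all. Everything here (`Pixel`, `invertKernel`, `applyFilter`) is a hypothetical name I made up for illustration. Real kernels are written in Apple's kernel language and run on the GPU, while this toy version just loops over an array on the CPU.

```swift
// Hypothetical RGBA pixel with components in 0...1, standing in for the
// vec4 that a real kernel routine returns.
struct Pixel: Equatable {
    var r, g, b, a: Double
}

// The "kernel": maps one source pixel to one destination pixel.
// This one inverts the color channels and leaves alpha alone.
func invertKernel(_ p: Pixel) -> Pixel {
    return Pixel(r: 1 - p.r, g: 1 - p.g, b: 1 - p.b, a: p.a)
}

// The "filter": wraps a kernel and applies it to every pixel in the image.
func applyFilter(_ image: [Pixel], kernel: (Pixel) -> Pixel) -> [Pixel] {
    return image.map(kernel)
}

// A two-pixel "image" is enough to see the mapping.
let source = [
    Pixel(r: 1, g: 0, b: 0, a: 1),
    Pixel(r: 0, g: 0.5, b: 1, a: 1),
]
let result = applyFilter(source, kernel: invertKernel)
```

Because each output pixel depends only on its own input pixel, the calls are independent of one another, which is what makes this kind of work such a good fit for a GPU.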
Hardware — GPU vs CPU Graphics Processing
GPUs are in nowadays. Cryptocurrencies and machine learning use them, Apple is building them, but CPUs still have their place. Let’s go over two scenarios which I think will shed light on the differences in core competencies between the CPU and GPU.
Scenario One: Think of the difference between a Tesla and an 18-wheeler truck. If you wanted a vehicle with the latest electronics that could almost drive itself at 120 mph, the 18-wheeler wouldn’t be of much use. It could go, but it’s got an old diesel engine and it doesn’t go much faster than highway speed.
Scenario Two: Now imagine you had to take all the boxcars off of a train and drive them to a warehouse downtown. This is where the truck would beat the sports car hands down.
GPU: GPUs were designed to do one thing very well: perform parallel processes on large sets of data. While a CPU has a few highly complex cores, a GPU has many simple cores that perform many operations at the same time. Each task is simple enough, but it’s the size of the data set being computed that calls for the GPU.
CPU: While CPUs are more than capable of holding their own when it comes to graphics processing (to some degree), they excel at executing complex logic, not at processing large data sets.
Core Image hides the details of low-level graphics processing by providing an easy-to-use application programming interface (API). You don’t need to know the details of OpenGL, OpenGL ES, or Metal to leverage the power of the GPU, nor do you need to know anything about Grand Central Dispatch (GCD) to get the benefit of multicore processing. Core Image handles the details for you. — Apple Docs
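Core Image picks a rendering path for you by default, but you can nudge it if you need to. Here is a minimal sketch of choosing between the two. The helper name `makeContext(preferGPU:)` is hypothetical, and falling back to the software renderer when no Metal device is found is my own assumption about what a sensible fallback looks like.

```swift
import CoreImage
import Metal

// Hypothetical helper: returns a GPU-backed context when a Metal device is
// available and requested, otherwise a context forced onto the CPU.
func makeContext(preferGPU: Bool) -> CIContext {
    if preferGPU, let device = MTLCreateSystemDefaultDevice() {
        // Render on the GPU through Metal.
        return CIContext(mtlDevice: device)
    }
    // Ask Core Image to use its software (CPU) renderer.
    return CIContext(options: [.useSoftwareRenderer: true])
}

let gpuContext = makeContext(preferGPU: true)
let cpuContext = makeContext(preferGPU: false)
```

In most apps a plain `CIContext()` is enough. Creating a context is relatively expensive, so whichever one you choose, it is worth creating it once and reusing it.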
Metal: Apple has other graphics libraries as well. One of the coolest ones is Metal. Metal is a graphics rendering and parallel data computation engine. In practical terms, that means that it is fast. It provides you with near-direct access to the GPU, and with parallel computation, this allows Metal to go beyond the traditional realm of screen graphics and into the new frontiers of mobile machine learning.
GLKit: Apple provides GLKit as a wrapper around the OpenGL library. GLKit makes it easy to access the functionality offered in OpenGL without having to stray too far out of the iOS programming paradigm.
OpenGL: That brings us to OpenGL. OpenGL is a cross-language, cross-platform API for rendering 2D and 3D vector graphics. It is an open standard, and the API itself is defined in C.
Quartz 2D: Sometimes it’s called Quartz 2D and at other times Core Graphics, but at the end of the day it’s the same two-dimensional drawing engine used across Apple’s devices and platforms. Core Image sits between Core Graphics and lower-level GPU/CPU processing.
If GPUs and CPUs are trucks and sports cars, then Metal is Apple’s attempt to make a hybrid sports car-truck. Metal is the highest-performance library of the ones listed above and, as I mentioned, its applications go way beyond pure graphics operations.
Now that we’ve gotten most of the background information out of the way, I think this is a good place to wrap things up. In the next post we’ll dive into the code!