Understanding Android camera capture sessions and requests

Published in Android Developers · 6 min read · Sep 5, 2018

Following up on the previous blog post about camera enumeration, let’s take a look at two major components of the Android camera framework: capture sessions and capture requests.

One CameraDevice, multiple streams

A single Android device can have multiple cameras. Each camera is a CameraDevice, and a CameraDevice can output multiple streams simultaneously. Why would we do that? Well, one stream might be optimized for a specific use-case, such as displaying a viewfinder, while others might be used to take a photo or make a video recording. We can think of the streams as parallel pipelines that process raw frames coming out of the camera, one frame at a time:

Illustration from Building a Universal Camera App (Google I/O ’18)

Parallel processing implies that there could be performance limits depending on the available processing power of the CPU, GPU, or another processor. If a pipeline can’t keep up with the incoming frames, it starts dropping them.

Note that each pipeline has its own output format. The raw data coming in is automatically transformed into the appropriate output format by implicit logic associated with each pipeline. More about this later.

The CameraDevice can be used to create a CameraCaptureSession, which will be specific to that CameraDevice. A CameraDevice must receive a frame configuration for each output raw frame via the CameraCaptureSession. One configuration in, one raw frame out. The configuration specifies camera attributes such as autofocus, aperture, effects, and exposure. Due to hardware constraints, only a single configuration can be active in the camera sensor at any given time; this is called the active configuration.

A CameraCaptureSession describes all the possible pipelines available to the CameraDevice. Once a session is created, you cannot add or remove pipelines. The CameraCaptureSession maintains a queue of CaptureRequests, each of which becomes the active configuration in turn.

A CaptureRequest adds a configuration to the queue and selects one or more (or all) of the available pipelines to receive a frame from the CameraDevice. You can send many capture requests over the life of a capture session. Each request can change the active configuration and set of output pipelines that will receive the raw image.

Creating a CameraCaptureSession

To create a camera session, we need to provide it with one or more buffers where output frames can be written. Each buffer represents a pipeline. This must be done before you start using the camera so the framework can configure the device’s internal pipelines and allocate memory buffers for sending frames to the desired output targets.

Here is how we could prepare a camera session with two output buffers, one belonging to a SurfaceView and another to an ImageReader:
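The snippet below is a minimal sketch of that setup, not a definitive implementation. The function name `createSession`, the `onReady` callback, and the 1920x1080 JPEG reader are hypothetical choices made for illustration. It assumes the SurfaceView's surface has already been created and that the camera supports this combination of outputs.

```kotlin
import android.graphics.ImageFormat
import android.hardware.camera2.CameraCaptureSession
import android.hardware.camera2.CameraDevice
import android.media.ImageReader
import android.os.Handler
import android.view.SurfaceView

// Hypothetical helper: configures a session with two output targets.
fun createSession(
    cameraDevice: CameraDevice,
    surfaceView: SurfaceView,
    handler: Handler?,
    onReady: (CameraCaptureSession) -> Unit
): ImageReader {
    // Size, format, and buffer count are placeholders; a real app should
    // pick values the device reports as supported.
    val imageReader = ImageReader.newInstance(1920, 1080, ImageFormat.JPEG, 2)

    // Each Surface is one output pipeline of the session.
    val previewSurface = surfaceView.holder.surface
    val imReaderSurface = imageReader.surface
    val targets = listOf(previewSurface, imReaderSurface)

    // This overload is deprecated as of API level 30, but it is the
    // simplest way to show the idea.
    cameraDevice.createCaptureSession(
        targets,
        object : CameraCaptureSession.StateCallback() {
            override fun onConfigured(session: CameraCaptureSession) {
                // The pipelines are ready; requests can be dispatched now.
                onReady(session)
            }

            override fun onConfigureFailed(session: CameraCaptureSession) {
                // The combination of targets is not usable on this device.
            }
        },
        handler
    )
    return imageReader
}
```

The ImageReader is returned so that the caller can keep it alive and read images from it later.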

Note that at this point we have not defined the camera’s active configuration. Once the session has been configured, we can create and dispatch capture requests to do that.

Remember that we said above that each pipeline knows how to transform the input written to its buffer into the appropriate format? That transformation is determined by the type of each target, which must be a Surface. The Android framework knows how to convert a raw image in the active configuration into a format appropriate for each target. The conversion is controlled by the pixel format and size of the particular Surface. The framework tries to do its best, but a Surface may have a configuration that won’t work, in which case some bad things might happen: the session can’t be created, you’ll get a runtime error when you dispatch a request, or performance might degrade. The framework provides guarantees for specific combinations of device, surface, and request parameters. We’ll discuss all this in a future post. (In the meantime, if you’re curious, read the documentation for createCaptureSession.)

One-shot CaptureRequests

The configuration used for each frame is encoded in a CaptureRequest, which is sent to the camera. To create a capture request, we can use one of the predefined templates (or we can use TEMPLATE_MANUAL for full control). Once we have chosen a template, we need to provide one or more target output buffers to be used with the request. We can only use buffers that were already defined on the capture session we intend to use.

Capture requests use a builder pattern and give developers the opportunity to set many different options, including auto-exposure, autofocus, and lens aperture. Before setting a field, make sure that the specific option is available for the device by calling CameraCharacteristics.getAvailableCaptureRequestKeys() and that the desired value is supported by checking the appropriate camera characteristic, for example the available auto-exposure modes.
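As a rough illustration of that check, the sketch below sets an auto-exposure mode only when both the request key and the desired value are reported by the device. The helper name `setAeModeIfSupported` is hypothetical, and auto-exposure is used here only as an example option.

```kotlin
import android.hardware.camera2.CameraCharacteristics
import android.hardware.camera2.CaptureRequest

// Hypothetical helper: returns true if the mode was applied to the builder.
fun setAeModeIfSupported(
    characteristics: CameraCharacteristics,
    builder: CaptureRequest.Builder,
    desiredMode: Int
): Boolean {
    // 1. Is the request key itself available on this device?
    val keyAvailable = characteristics.availableCaptureRequestKeys
        .contains(CaptureRequest.CONTROL_AE_MODE)

    // 2. Is the desired value among the modes the device supports?
    val supportedModes =
        characteristics.get(CameraCharacteristics.CONTROL_AE_AVAILABLE_MODES)
            ?: intArrayOf()

    if (!keyAvailable || desiredMode !in supportedModes) return false

    builder.set(CaptureRequest.CONTROL_AE_MODE, desiredMode)
    return true
}
```

A caller might pass `CameraMetadata.CONTROL_AE_MODE_ON_AUTO_FLASH` as `desiredMode` and keep the template's default when the function returns false.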

To create a simple capture request for our SurfaceView using the template designed for video preview without any modifications, use CameraDevice.TEMPLATE_PREVIEW:
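A minimal sketch of such a request follows. The function name `buildPreviewRequest` is hypothetical, and `previewSurface` is assumed to be the SurfaceView's surface that was registered when the session was created.

```kotlin
import android.hardware.camera2.CameraCaptureSession
import android.hardware.camera2.CameraDevice
import android.hardware.camera2.CaptureRequest
import android.view.Surface

// Hypothetical helper: builds an unmodified preview request.
fun buildPreviewRequest(
    session: CameraCaptureSession,
    previewSurface: Surface
): CaptureRequest {
    // Start from the template designed for video preview.
    val captureRequest =
        session.device.createCaptureRequest(CameraDevice.TEMPLATE_PREVIEW)

    // The target must be one of the surfaces the session was created with.
    captureRequest.addTarget(previewSurface)
    return captureRequest.build()
}
```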

Armed with a capture request, we can finally dispatch it to the camera session:
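One way to do this, sketched under the assumption that we only want to log the result, is shown below. The function name `dispatchOnce` and the log tag are hypothetical.

```kotlin
import android.hardware.camera2.CameraCaptureSession
import android.hardware.camera2.CaptureRequest
import android.hardware.camera2.TotalCaptureResult
import android.os.Handler
import android.util.Log

// Hypothetical helper: sends a single (one-shot) request to the session.
fun dispatchOnce(
    session: CameraCaptureSession,
    request: CaptureRequest,
    handler: Handler?
) {
    session.capture(
        request,
        object : CameraCaptureSession.CaptureCallback() {
            override fun onCaptureCompleted(
                session: CameraCaptureSession,
                request: CaptureRequest,
                result: TotalCaptureResult
            ) {
                // Called once the capture has fully completed.
                Log.d("CameraSketch", "Frame ${result.frameNumber} completed")
            }
        },
        handler
    )
}
```

The callback is optional; passing `null` instead is valid when the app does not need the capture metadata.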

When an output frame is put into the target buffer(s), a capture callback is triggered. In many cases, additional callbacks, such as ImageReader.OnImageAvailableListener, may be triggered as well once the frame has been processed. It is at this point that we can retrieve image data from the target buffer.
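For an ImageReader target, retrieving the data could look roughly like the sketch below. The function name `listenForImages` and the `onBytes` callback are hypothetical, and the code assumes a JPEG reader, where the encoded image sits in a single plane.

```kotlin
import android.media.ImageReader
import android.os.Handler

// Hypothetical helper: copies each finished JPEG frame into a byte array.
fun listenForImages(
    imageReader: ImageReader,
    handler: Handler?,
    onBytes: (ByteArray) -> Unit
) {
    imageReader.setOnImageAvailableListener({ reader ->
        val image = reader.acquireNextImage()
        if (image != null) {
            try {
                // Assumption: JPEG output, so plane 0 holds the whole image.
                val buffer = image.planes[0].buffer
                val bytes = ByteArray(buffer.remaining())
                buffer.get(bytes)
                onBytes(bytes)
            } finally {
                // Always release the image so the reader can reuse its buffer.
                image.close()
            }
        }
    }, handler)
}
```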

Repeating CaptureRequests

One-shot camera requests are easy to do, but they aren’t very useful for displaying a live preview. In that case, we would like to receive a continuous stream of frames, not just a single one. Luckily, there is a way to set a repeating request on the session:
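A minimal sketch, assuming `request` was built from a preview template as described earlier (the function name `startPreview` is hypothetical):

```kotlin
import android.hardware.camera2.CameraCaptureSession
import android.hardware.camera2.CaptureRequest
import android.os.Handler

// Hypothetical helper: keeps capturing frames with the same settings.
fun startPreview(
    session: CameraCaptureSession,
    request: CaptureRequest,
    handler: Handler?
) {
    // Replaces any repeating request previously set on this session.
    // No capture callback is registered in this sketch.
    session.setRepeatingRequest(request, null, handler)
}
```

Calling `session.stopRepeating()` later cancels the repeating request.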

A repeating capture request will make the camera device continually capture images using the settings in the provided CaptureRequest, at the maximum rate possible.

Interleaving CaptureRequests

To make things a bit more complicated and closer to a real-world scenario: what if we wanted to send a second capture request while the repeating capture request is active? That’s what an app needs to do to display a viewfinder and let users capture a photo. In that case, we don’t need to stop the ongoing repeating request; we can simply issue a non-repeating capture request. Just remember that any output target buffer being used needs to be configured as part of the camera session when the session is first created. It is important to note that repeating requests have lower priority than one-shot or burst requests, which lets us do something like this:
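The sketch below shows the idea under a few assumptions: both surfaces were registered when the session was created, `previewSurface` feeds the viewfinder, and `imReaderSurface` belongs to an ImageReader used for photos. The function names are hypothetical.

```kotlin
import android.hardware.camera2.CameraCaptureSession
import android.hardware.camera2.CameraDevice
import android.view.Surface

// Hypothetical helper: starts the viewfinder with a repeating request.
fun startViewfinder(session: CameraCaptureSession, previewSurface: Surface) {
    val repeatingRequest =
        session.device.createCaptureRequest(CameraDevice.TEMPLATE_PREVIEW)
    repeatingRequest.addTarget(previewSurface)
    session.setRepeatingRequest(repeatingRequest.build(), null, null)
}

// Hypothetical helper: called some time later, e.g. on a shutter tap.
fun takePhoto(session: CameraCaptureSession, imReaderSurface: Surface) {
    val singleRequest =
        session.device.createCaptureRequest(CameraDevice.TEMPLATE_STILL_CAPTURE)
    singleRequest.addTarget(imReaderSurface)

    // The repeating request keeps running; this one-shot request is
    // simply interleaved with it.
    session.capture(singleRequest.build(), null, null)
}
```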

However, there is a big gotcha with this approach: we don’t know exactly when the single request will occur. In the figure below, if A is the repeating capture request and B is the one-shot capture request, this is how the session would process the request queue:

Illustration of a request queue for the ongoing camera session

There are no guarantees about the latency between the last repeating request A before request B kicks in and the next time A is used again, so we may experience some skipped frames. There are a few things we can do to mitigate this problem:

Add the output targets from request A also to request B. That way, when B’s frame is ready, it will be copied into A’s output targets. This is essential, for example, when doing video snapshots to maintain a steady frame rate. In the code above, we would simply add singleRequest.addTarget(previewSurface) before building the request.
Use a combination of templates designed to work for this particular scenario, such as zero-shutter-lag (often abbreviated to ZSL).
If the configuration of the single capture request is the same as that of the repeating request, there is no need for multiple requests at all. Set a single repeating request that outputs to both the target of the repeating request and the target of the single request. Even though both targets then share the same capture request settings, they can still have different sizes and formats.
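The first and third mitigations can be sketched as follows. This is an illustration under stated assumptions rather than a recommended recipe: the function names are hypothetical, and both surfaces are assumed to have been registered with the session at creation time.

```kotlin
import android.hardware.camera2.CameraCaptureSession
import android.hardware.camera2.CameraDevice
import android.view.Surface

// Mitigation 1: the one-shot request also writes to the preview target,
// so the viewfinder still receives a frame while the photo is taken.
fun takePhotoKeepingPreview(
    session: CameraCaptureSession,
    previewSurface: Surface,
    imReaderSurface: Surface
) {
    val singleRequest =
        session.device.createCaptureRequest(CameraDevice.TEMPLATE_STILL_CAPTURE)
    singleRequest.addTarget(imReaderSurface)
    singleRequest.addTarget(previewSurface)
    session.capture(singleRequest.build(), null, null)
}

// Mitigation 3: when both outputs can share the same settings, a single
// repeating request feeds both targets and no one-shot request is needed.
fun startCombinedStream(
    session: CameraCaptureSession,
    previewSurface: Surface,
    imReaderSurface: Surface
) {
    val combinedRequest =
        session.device.createCaptureRequest(CameraDevice.TEMPLATE_PREVIEW)
    combinedRequest.addTarget(previewSurface)
    combinedRequest.addTarget(imReaderSurface)
    session.setRepeatingRequest(combinedRequest.build(), null, null)
}
```

With the combined stream, the ImageReader receives every frame, so the app has to acquire and close images promptly to avoid stalling the pipeline.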

Lesson learned

It’s important to understand camera sessions and capture requests when creating applications that use the Android camera API. In this blog post, we have covered:

How camera sessions work
How capture requests work, including one-shot and repeating capture requests
How capture requests are added to a queue, and different techniques for dispatching them without disrupting ongoing streams

In a future post, we will learn how to handle the complexities of requests with more than one target as part of one session and how to configure the camera pipelines to get the most out of a device’s capabilities.