Delays as low as 100µs can be reached with a stock Raspberry Pi and no custom hardware.
Example use cases for synchronized video streams are:
- image stitching for panoramic photo/video, or image blending, for example for HDR or low-light imagery
- stereo vision for 3D reconstruction and depth-sensing applications
- multi-camera tracking, for example eye tracking with one camera dedicated to each eye
Why is synchronization so important?
Let’s see an early stitching test with free running cameras.
Each camera is capturing at 90fps, so there are about 11ms between consecutive frames in each stream.
In the example above, if the right camera is used as reference time, the center camera is almost 1ms late and the left camera is more than 5ms late.
During that delay, the car had time to move further from left to right.
As you can see, a single millisecond is already enough for the car to look “chunky” in the stitched image.
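The numbers above can be checked with a short sketch: the frame interval of a free-running camera bounds the skew between streams, and an object in motion covers a measurable distance during that skew. The car speed below is an illustrative assumption, not a figure from the experiment.

```python
# Sketch: how far a moving object travels during the skew between
# free-running cameras. The 50 km/h speed is an illustrative guess.

def frame_interval_ms(fps: float) -> float:
    """Time between consecutive frames of a free-running camera."""
    return 1000.0 / fps

def travel_mm(skew_ms: float, speed_kmh: float) -> float:
    """Distance an object moving at speed_kmh covers during skew_ms."""
    speed_mm_per_ms = speed_kmh * 1_000_000 / 3_600_000  # km/h -> mm/ms
    return skew_ms * speed_mm_per_ms

print(f"frame interval at 90fps: {frame_interval_ms(90):.1f} ms")  # ~11.1 ms
print(f"travel during a 5 ms skew at 50 km/h: {travel_mm(5, 50):.0f} mm")
```

Even the "almost 1ms late" center camera corresponds to roughly a centimeter of motion at urban driving speeds, which is visible at stitching seams.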
How to synchronize cameras?
Many smartphone SoCs nowadays have 2–3 camera interfaces and can capture from multiple cameras simultaneously. Dedicated hardware also exists to merge (i.e., multiplex) several cameras into a single stream (like the ArduCam “CamArray”).
It is also achievable with plain Raspberry Pi and camera hardware, using synchronization over an Ethernet network. This solution is not as precise as direct camera connections, but it avoids custom hardware and has the advantage of scaling to many more cameras (in theory there is no limit).
For instance, see a demo of 360° video with 8 Raspberry Pi boards and cameras using network-synchronized capture.
The last and maybe main benefit of network synchronization is the possibility of much larger distances between the cameras, for instance in a car where cameras can be located in the left and right rear-view mirrors.
Building a test setup
Three cameras are controlled by three independent Raspberry Pi 3 boards, connected together over Ethernet via a mini switch.
The network is used to exchange clock synchronization messages.
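The text does not detail the clock synchronization protocol, but NTP/PTP-style schemes share the same core idea: a two-way message exchange from which the clock offset and network delay are estimated. A minimal sketch of that computation, assuming a symmetric network path:

```python
# Generic NTP/PTP-style offset estimation from a two-way exchange.
# This is the textbook formula, not the actual protocol code used
# in the experiment.

def offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """t1: request sent (client clock), t2: request received (server clock),
    t3: reply sent (server clock), t4: reply received (client clock).
    Returns (estimated server-vs-client clock offset, round-trip delay),
    assuming symmetric one-way network delays."""
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay

# Example: server clock runs 3 ms ahead, one-way network delay 1 ms.
off, rtt = offset_and_delay(t1=100.0, t2=104.0, t3=104.5, t4=102.5)
print(off, rtt)  # 3.0 2.0
```

Each camera board repeatedly runs such exchanges against a reference clock and slews its own clock by the estimated offset, which is what makes synchronized frame timestamps possible.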
Does it really work well?
The first question should rather be: “How to test it?”
An idea is to record a common running stopwatch with all cameras and display the stitched, synchronized stream. If the cameras do not all capture the same stopwatch time, they are not capturing in sync.
Note: sorry for the mirrored image on the left, it’s a configuration mistake.
The image was captured at 17 seconds and 86 milliseconds. All three cameras show the same time, but keep in mind some limitations of this experiment:
- The computer is rendering the stopwatch at 60fps, therefore the video output only refreshes every ~16ms. In other words, it cannot update every millisecond.
- The screen pixels themselves need a few milliseconds to change from black to white.
According to the stitcher statistics (the white numbers at the bottom), the maximum delay between cameras barely exceeds 1ms during the video.
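The kind of statistic the stitcher prints can be computed directly from per-camera capture timestamps on the shared clock. The function and the timestamp values below are illustrative, not taken from the actual tool:

```python
# Sketch of an inter-camera delay statistic: the spread between the
# earliest and latest capture timestamp for one output frame.
# Values are illustrative, not from the actual stitcher.

def max_inter_camera_delay_ms(timestamps_us: list[int]) -> float:
    """Spread between earliest and latest capture timestamp, in ms."""
    return (max(timestamps_us) - min(timestamps_us)) / 1000.0

# Three cameras, timestamps in microseconds on the synchronized clock:
print(max_inter_camera_delay_ms([17_860_120, 17_860_900, 17_861_050]))  # 0.93
```

Tracking the maximum of this value over a whole recording gives the worst-case synchronization figure reported above.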
What is the output latency?
Latency is the time needed from an image captured at the camera to a stitched/synchronized output on the screen.
The little HDMI display attached over the larger display shows the live stitcher output (i.e., the synchronized output of the 3 cameras).
Latency is easy to read off this screenshot as the difference between the two displayed stopwatch times:
200ms − 13ms = 187ms.
Synchronized video streams with Raspberry Pi can serve many use-cases (e.g., stitching, 3D sensing, tracking, …) before investing in expensive video multiplexer hardware.
Find the open-source modification of the RPi camera tool at:
How to reduce delay?
For the experiment presented here, the cameras were recording at 30fps.
When capturing at a higher framerate, synchronization improves (at the cost of lower video quality).
The Raspberry Pi camera v2 can capture up to 200fps, but the weak video encoder of the Raspberry Pi GPU cannot keep up with this pace. At 100fps, the quality is still acceptable and the synchronization delay is around 100µs.
Since delay also depends on network clock synchronization, better PTP settings and hardware support for timestamping in the Ethernet interface would help.
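Once the clocks agree, the remaining problem is aligning the frame phase: each camera must start its frames on a common schedule. One generic way to do this, once clock offsets are known, is to nudge the next frame period to cancel the measured phase error. This is a hedged sketch of that idea, not the actual modification made to the camera tool:

```python
# Generic frame-phase alignment sketch: measure how far the last frame
# start drifted from the shared schedule (multiples of the nominal
# period on the synchronized clock) and partially correct it on the
# next frame. Names and the gain value are illustrative assumptions.

def next_frame_period_us(nominal_period_us: float,
                         frame_start_us: float,
                         gain: float = 0.3) -> float:
    """Return an adjusted period for the next frame that moves frame
    starts toward the shared schedule."""
    phase_error = frame_start_us % nominal_period_us
    if phase_error > nominal_period_us / 2:  # closer to the next slot
        phase_error -= nominal_period_us
    return nominal_period_us - gain * phase_error

# At 100fps the period is 10 000 µs; a frame starting 600 µs late
# gets a slightly shortened next period:
print(next_frame_period_us(10_000, 1_230_600))
```

Because the correction is applied once per frame, a higher framerate gives more correction opportunities per second, which matches the observation that synchronization improves at higher fps.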
How to reduce latency?
Latency depends on each step of the stitching pipeline:
- [camera board] capture frame
- [camera board] encode frame
- [Ethernet] network delivery
- [stitcher board] frame alignment
- [stitcher board] frame decoding
- [stitcher board] stitched frame rendering
- [display] display
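To reason about where the 187ms goes, it helps to write the pipeline as an explicit budget. The per-step numbers below are purely hypothetical guesses for illustration; only the total measured in the experiment is known.

```python
# Hypothetical latency budget over the pipeline steps above.
# Per-step values are illustrative guesses, NOT measurements from
# the experiment; only the ~187 ms total comes from the text.

budget_ms = {
    "capture (one 30fps frame interval)": 33,
    "encode": 20,
    "network delivery": 10,
    "frame alignment (wait for all cameras)": 35,
    "decode": 30,
    "stitched frame rendering (wait for render loop)": 25,
    "display refresh and response": 34,
}
total = sum(budget_ms.values())
print(f"total: {total} ms")
```

Laying the steps out this way makes clear that no single stage dominates, which is why the improvements below each target a different step.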
In this experiment, many things could be done to improve latency:
- Using gigabit Ethernet, a frame would be delivered faster.
- Using AVB Ethernet, a frame could be delivered just in time for rendering, with no waiting in a buffer for the next render loop and no frame alignment needed.
- Using hardware decoding and hardware copy, a frame would be available in the GPU memory without any CPU copy.
- Using a high framerate display (e.g., 144Hz and more), rendered output would be pushed with less delay to the next screen refresh.
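The last point is easy to quantify: a frame finished at a random moment waits, on average, half a refresh interval before being scanned out. A quick sketch of that arithmetic:

```python
# Average wait for the next screen refresh at a given refresh rate:
# a frame completed at a random moment waits half an interval on average.

def avg_refresh_wait_ms(hz: float) -> float:
    """Average scan-out wait for a frame finished at a random time."""
    return 1000.0 / hz / 2

print(f"60 Hz:  {avg_refresh_wait_ms(60):.1f} ms")   # ~8.3 ms
print(f"144 Hz: {avg_refresh_wait_ms(144):.1f} ms")  # ~3.5 ms
```

Moving from 60Hz to 144Hz thus shaves roughly 5ms off the average display wait alone.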