dav1d 0.2.0: Covering all PCs, including mobile

Ewout ter Hoeven
Mar 4, 2019

Three months after dav1d 0.1.0 “Gazelle” was released, version 0.2.0 has just been tagged. Under the code name “Antelope”, huge improvements were made to the AV1 decoder for older PCs and mobile devices on 8-bit content. Hand-written SSSE3 and NEON assembly code sped up most of the C functions by factors ranging anywhere from 2 to 20, resulting in much higher frame rates.

This post provides an overview of dav1d’s performance compared to both the 0.1.0 release and the AV1 reference decoder, aomdec.

PC: SSSE3 for older x86 CPUs

Since different videos use different functions of the AV1 codec in different proportions, some saw larger increases than others. The results cover three 1080p videos, comparing dav1d at the 0.1.0 release to the current head.

The improvements are huge for both single-threaded (ST) and multi-threaded (MT) decoding, averaging around 2.25x for ST and 2.5x for MT. Looking at raw frame rates, this means 1080p at 30fps is playable without a hitch on almost any device with SSSE3, while high-frequency quad-core processors should also be able to handle up to 1440p at 60fps and 2160p at 30fps.
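To put those resolution targets in perspective, here is a small sketch that compares the pixel throughput each one demands against 1080p at 30fps. It assumes the standard 16:9 frame sizes for each resolution name, and it treats pixel rate as only a rough proxy for decoding cost, since bitrate and coding tools matter too.

```python
# Pixel throughput (pixels per second) a decoder must sustain for real-time playback.
# Frame sizes are the usual 16:9 dimensions; this is an assumption, the post only
# names the resolutions.
RESOLUTIONS = {
    "1080p": (1920, 1080),
    "1440p": (2560, 1440),
    "2160p": (3840, 2160),
}


def pixel_rate(name, fps):
    """Return pixels per second for a resolution name and frame rate."""
    width, height = RESOLUTIONS[name]
    return width * height * fps


baseline = pixel_rate("1080p", 30)
targets = [("1080p", 30), ("1440p", 60), ("2160p", 30)]

# Load of each target relative to 1080p at 30fps.
relative_load = {
    f"{name}{fps}": pixel_rate(name, fps) / baseline for name, fps in targets
}

for label, load in relative_load.items():
    print(f"{label}: {load:.2f}x the pixel rate of 1080p30")
```

1440p at 60fps works out to roughly 3.56x the pixel rate of 1080p at 30fps, and 2160p at 30fps to exactly 4x, which is why the two are mentioned in the same breath.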

The results were measured on an Intel Core i5-4590 (Haswell, 4c/4t, 3.5 GHz) using only SSSE3 instructions.

If we normalize the values, we can examine the gains more closely; they average around 2.23x.

On average, dav1d 0.2.0 is 2.23x faster on 8-bit content than 0.1.0.
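For readers who want to do the same normalization on their own measurements, the sketch below divides each new frame rate by the old one and averages the ratios. The clip names and frame rates are made-up placeholders, not the numbers behind this post, and the post does not say which kind of mean was used, so both the arithmetic and the geometric mean are shown.

```python
from statistics import geometric_mean

# Placeholder frame rates (fps) for three hypothetical clips; NOT measured data.
fps_old = {"clip_a": 20.0, "clip_b": 30.0, "clip_c": 40.0}
fps_new = {"clip_a": 50.0, "clip_b": 60.0, "clip_c": 100.0}


def normalized_speedups(old, new):
    """Normalize each clip's new frame rate to its old frame rate."""
    return {clip: new[clip] / old[clip] for clip in old}


speedups = normalized_speedups(fps_old, fps_new)
ratios = list(speedups.values())

# Two common ways to summarize ratios; the geometric mean is less
# sensitive to a single outlier clip.
arithmetic = sum(ratios) / len(ratios)
geometric = geometric_mean(ratios)

print(speedups)
print(f"arithmetic mean: {arithmetic:.2f}x, geometric mean: {geometric:.2f}x")
```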

x86 performance compared to aomdec

All the numbers below are for 8-bit color depth with 4:2:0 chroma subsampling. For the multi-threaded runs, aomdec used 4 threads, while dav1d used 8 frame threads and 4 tile threads. Both settings give optimal performance on a quad-core CPU.

Comparing SSSE3 performance, dav1d and aomdec perform about the same with a single thread. Multi-threaded, dav1d is 2.5 to 3 times faster.
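As a rough illustration of that thread configuration, the sketch below builds command lines for both decoders. It assumes the `--framethreads`, `--tilethreads` and `--muxer null` options of the dav1d command-line tool from this era and the `--threads` option of aomdec; the input file name is a hypothetical placeholder, and the exact commands used for the benchmarks are not given in this post, so check each tool's `--help` before relying on them.

```python
import shlex


def dav1d_command(input_path, frame_threads=8, tile_threads=4):
    """Build a dav1d invocation that decodes and discards the output.

    Flag names are assumed from the dav1d CLI of the 0.2.0 era.
    """
    return [
        "dav1d",
        "-i", input_path,
        "--muxer", "null",
        "-o", "/dev/null",
        "--framethreads", str(frame_threads),
        "--tilethreads", str(tile_threads),
    ]


def aomdec_command(input_path, threads=4):
    """Build an aomdec invocation with a fixed thread count (assumed flags)."""
    return ["aomdec", f"--threads={threads}", "-o", "/dev/null", input_path]


# "sample_1080p.ivf" is a hypothetical file name used only for illustration.
dav1d_argv = dav1d_command("sample_1080p.ivf")
aomdec_argv = aomdec_command("sample_1080p.ivf")

print(shlex.join(dav1d_argv))
print(shlex.join(aomdec_argv))
```

The commands are only printed here; passing either list to `subprocess.run` and timing the call would be one way to get a frame rate out of it.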

Moving on to CPUs that can handle SSE4.1 instructions (95.82% of them according to Steam), aomdec claims a small lead in single-threaded performance. dav1d doesn’t have separate assembly code for SSE4.1, so its performance is (for now) identical to that on SSSE3 CPUs. Multi-threaded, dav1d is still about twice as fast.
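Why an SSE4.1-capable CPU gains nothing here comes down to how a decoder picks its code path at runtime. The sketch below is a simplified illustration of that idea and not dav1d's actual dispatch code: the tier list, function name and feature strings are all hypothetical. It selects the most advanced instruction-set tier that the CPU supports and that the decoder has assembly for, falling back to plain C.

```python
# Instruction-set tiers, ordered from most basic to most advanced.
# "c" stands for the portable C fallback, which is always available.
TIERS = ["c", "ssse3", "sse4.1", "avx2"]


def select_implementation(cpu_features, implemented):
    """Return the most advanced tier supported by both the CPU and the decoder."""
    for tier in reversed(TIERS):
        if tier in cpu_features and tier in implemented:
            return tier
    return "c"


# Per the post: hand-written assembly exists for SSSE3 and AVX2, not for SSE4.1.
implemented = {"ssse3", "avx2"}

sse41_cpu = select_implementation({"ssse3", "sse4.1"}, implemented)
avx2_cpu = select_implementation({"ssse3", "sse4.1", "avx2"}, implemented)
old_cpu = select_implementation(set(), implemented)

print(sse41_cpu, avx2_cpu, old_cpu)
```

A CPU with SSE4.1 but no AVX2 ends up on the SSSE3 path, which matches the identical performance described above.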

With AVX2, dav1d was already very fast, and its performance increased by only a slight 1% to 2%. Compared to aomdec, dav1d enjoys a comfortable 40% lead single-threaded, and with multiple threads it is anywhere from 2.5x to 5x faster.

Mobile: NEON

Starting with Arm64 (AArch64) performance, we see an average 38% improvement for single-threaded and a 53% improvement for multi-threaded decoding. On a Snapdragon 835, the improvement enables 1080p at 60fps for most videos.

32-bit Arm (Armv7) also improved a lot, since most assembly code can be ported fairly easily between the two. Single-threaded decoding saw a spectacular average speedup of 62%, while multi-threaded increased by 46%. 1080p at 30fps should play smoothly on most CPUs with at least two ‘big’ cores.
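The x86 gains in this post are quoted as factors (2.23x) and the Arm gains as percentages (38%, 62%), so a quick conversion helps when comparing them. The sketch below uses only figures quoted in this post; the helper names are hypothetical, and the last function simply works out what frame rate the previous release must have reached for a given improvement to hit a playback target.

```python
def percent_to_factor(percent):
    """A 38% improvement means the new speed is 1.38x the old speed."""
    return 1 + percent / 100


def factor_to_percent(factor):
    """A 2.23x speedup is a 123% improvement."""
    return (factor - 1) * 100


def required_baseline_fps(target_fps, improvement_percent):
    """Frame rate the old release needed for the new one to reach target_fps."""
    return target_fps / percent_to_factor(improvement_percent)


arm64_st = percent_to_factor(38)   # Arm64 single-threaded gain as a factor
x86_avg = factor_to_percent(2.23)  # x86 SSSE3 average gain as a percentage
baseline_for_60 = required_baseline_fps(60, 53)  # Arm64 multi-threaded case

print(f"38% is {arm64_st:.2f}x, 2.23x is {x86_avg:.0f}%")
print(f"60fps after a 53% gain needs about {baseline_for_60:.1f}fps before")
```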

Conclusion

There are still some functions left to write SSSE3 assembly for, and the same goes for NEON. Future releases will therefore make dav1d even faster on those platforms, but in the meantime it’s more than fast enough to provide a proper 1080p experience on most devices.

This is all on 8-bit content, however, which still makes up the vast majority of video on most platforms. dav1d doesn’t have assembly code for 10-bit, and later 12-bit, content yet; that will be something to look forward to.

VLC will very soon ship a new stable release with dav1d 0.2.0, and Firefox is also working on integrating support. FFmpeg already uses dav1d for decoding in its development branch, and HandBrake will support it soon.

Thanks

Materials
