Unlock transparency in videos on Android

Using Jetpack Compose, ExoPlayer, OpenGL and TextureView!

Tristan Ferré
Electra
6 min read · Dec 12, 2022

Introduction

Currently (early 2022), there exist video codecs that support an alpha channel to include transparency. This is the case of the VP8 and VP9 codecs, for instance, which are used in the WebM format. Unfortunately, even though Android now supports these codecs, the alpha channel is not decoded.

Fig.1 Original video on a white background in a web browser (WebM, VP9, RGB+A)
Fig.2 Render of the same video using Android’s VideoView

On Android, the R, G and B channels are decoded correctly, but the A channel is not taken into account. The same happens with any video library (like ExoPlayer), so it seems there is no easy solution to this problem.

Existing solutions

Displaying videos with transparent pixels

There is an existing (but rather old) library that tries to solve this issue using a technique called color keying: https://github.com/pavelsemak/alpha-movie

The technique consists of choosing a color in the video that will be translated into alpha. Usually green is the color keyed out to alpha, as in green-screen videos.

Fig.3 A green screen being replaced by a whole setup. And dinos.
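
To give an idea of the technique, here is a minimal color-keying fragment shader, written as GLSL embedded in a Kotlin string the way Android OpenGL code usually is. This is an illustrative sketch with a deliberately naive alpha curve, not the alpha-movie library's actual code:

```kotlin
// Illustrative chroma-key fragment shader (NOT alpha-movie's actual code).
// Pixels close to the key color become transparent.
private val CHROMA_KEY_FRAGMENT_SHADER = """
    precision mediump float;
    uniform sampler2D uTexture;                 // the decoded video frame
    varying vec2 vTexCoord;
    const vec3 keyColor = vec3(0.0, 1.0, 0.0);  // pure green

    void main() {
        vec4 color = texture2D(uTexture, vTexCoord);
        // The closer a pixel is to the key color, the more transparent it
        // becomes. Semi-transparent objects are handled poorly, as noted below.
        float alpha = clamp(distance(color.rgb, keyColor) * 2.0, 0.0, 1.0);
        gl_FragColor = vec4(color.rgb * alpha, alpha);
    }
""".trimIndent()
```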

The drawbacks of color keying:

  • the video cannot contain objects of the key color, or they will be keyed out as well
  • we depend on the algorithm that derives the alpha level from the color level
  • this algorithm handles semi-transparent objects poorly

Still, this library is very nice in the way it replaces pixels live using OpenGL shaders. It is the main source of inspiration for the solution I present here.

Usage of a mask

An iOS developer wrote an article about this problem in which he uses a video composed of two stacked parts:

  • The R+G+B channels on top
  • The A channel translated into gray levels on the bottom
Fig.4 Extract of a video of a Play-Doh bat with RGB on top and the alpha mask on the bottom

Our solution

We chose the following approach:

  1. Use a video composed of the RGB channels on top and the alpha channel, in gray levels, on the bottom
  2. Play this video using ExoPlayer
  3. Output ExoPlayer into a SurfaceTexture
  4. Use an OpenGL shader to merge the data of the SurfaceTexture into RGB+A pixels
  5. Render the shader output on a TextureView

The video input

This was the easy part: with absolutely no Adobe After Effects skills, I managed to produce a video in the required format in a few minutes.

Fig.5 Composed video of RGB channels and alpha mask in gray levels

Creating an OpenGL TextureView

It seems that the way to go to display OpenGL output would be a GLSurfaceView (cf. the Android docs). But such a view cannot have a transparent background: even if what is rendered in it has alpha, the view itself will have a solid background color. Fortunately, there exists a type of view whose documentation seems to match our expectations: TextureView.

It can “display a content stream, such as that coming from […] an OpenGL scene” and handles transparency like any other Android View.

The major drawback: there is no official implementation nor any clear example of how to plug it into OpenGL, and the few examples on the web show people copy-pasting some Google code that should do the trick. This sounds anything but production-proof and maintainable (see this frightening implementation for an example).

So we decided to implement it ourselves using Jetpack Compose and Coroutines.

Fig.6 Our GLTexture implementation, a compose wrapper that plugs OpenGL into a TextureView

We dedicated a separate article to this implementation; feel free to look it up.
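
To give an idea of its shape, here is a rough sketch of such a wrapper, assuming a hypothetical GLRenderer interface; the actual EGL and GL-thread plumbing lives in the dedicated article:

```kotlin
import android.graphics.SurfaceTexture
import android.view.TextureView
import androidx.compose.runtime.Composable
import androidx.compose.ui.Modifier
import androidx.compose.ui.viewinterop.AndroidView

// Hypothetical contract for the drawing code: called once the GL context
// is ready, then once per frame.
interface GLRenderer {
    fun onSurfaceCreated()
    fun onDrawFrame()
}

@Composable
fun GLTexture(renderer: GLRenderer, modifier: Modifier = Modifier) {
    AndroidView(
        modifier = modifier,
        factory = { context ->
            TextureView(context).apply {
                // Unlike GLSurfaceView, a TextureView can be translucent.
                isOpaque = false
                surfaceTextureListener = object : TextureView.SurfaceTextureListener {
                    override fun onSurfaceTextureAvailable(
                        surface: SurfaceTexture, width: Int, height: Int,
                    ) {
                        // The real implementation launches a coroutine-based GL
                        // loop here: it creates an EGL context whose window
                        // surface wraps `surface`, calls renderer.onSurfaceCreated()
                        // once, then renderer.onDrawFrame() for every frame.
                    }

                    override fun onSurfaceTextureSizeChanged(
                        surface: SurfaceTexture, width: Int, height: Int,
                    ) = Unit

                    override fun onSurfaceTextureDestroyed(surface: SurfaceTexture) = true

                    override fun onSurfaceTextureUpdated(surface: SurfaceTexture) = Unit
                }
            }
        },
    )
}
```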

Writing a renderer and shaders

OpenGL needs a renderer to know how to draw on its output.

The renderer is essentially identical to any renderer that displays a 2D textured square, as sketched below.
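
For instance, the geometry can boil down to a full-screen quad drawn as a triangle strip; the interleaved layout below is our own choice:

```kotlin
// Full-screen quad: clip-space position (x, y) interleaved with the texture
// coordinate (u, v), drawn with GLES20.glDrawArrays(GL_TRIANGLE_STRIP, 0, 4).
private val QUAD_VERTICES = floatArrayOf(
    // x,   y,    u,  v
    -1f, -1f,    0f, 1f,
     1f, -1f,    1f, 1f,
    -1f,  1f,    0f, 0f,
     1f,  1f,    1f, 0f,
)
```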

The nice part comes in the shaders. The vertex shader computes the positions of the RGB data and of the alpha data in the texture, given a position on the output surface:

Fig.7 The vertex shader’s code
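
Reconstructed as a sketch (the attribute, uniform and varying names are our own choices), this vertex shader can look like the following:

```kotlin
private val VERTEX_SHADER = """
    attribute vec4 aPosition;    // vertex position in clip space
    attribute vec2 aTexCoord;    // texture coordinate of the vertex
    uniform mat4 uTexMatrix;     // transform reported by the SurfaceTexture
    varying vec2 vRgbTexCoord;   // where to sample the RGB half
    varying vec2 vAlphaTexCoord; // where to sample the alpha half

    void main() {
        gl_Position = aPosition;
        // The source frame stacks RGB in one half and the gray alpha mask
        // in the other, so we squeeze y into each half (which half ends up
        // on top is settled by the surface transform applied below).
        vec4 rgb = vec4(aTexCoord.x, aTexCoord.y * 0.5, 0.0, 1.0);
        vec4 alpha = vec4(aTexCoord.x, aTexCoord.y * 0.5 + 0.5, 0.0, 1.0);
        vRgbTexCoord = (uTexMatrix * rgb).xy;
        vAlphaTexCoord = (uTexMatrix * alpha).xy;
    }
""".trimIndent()
```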

We also take the surface’s transformation matrix as an input, in case the TextureView is rotated, scaled, etc.

So, for a given point in the output, we define its corresponding position in the texture for RGB and for alpha:

Fig.8 Illustration of the position of each channel data given an output point

Then we apply the transformation matrix to the two positions. Those two positions are passed to the fragment shader, which computes the output color vector:

Fig.9 The fragment shader’s code
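
A matching sketch of the fragment shader, reusing the names introduced above (samplerExternalOES is required to sample a SurfaceTexture):

```kotlin
private val FRAGMENT_SHADER = """
    #extension GL_OES_EGL_image_external : require
    precision mediump float;
    uniform samplerExternalOES uTexture; // the decoded video frames
    varying vec2 vRgbTexCoord;
    varying vec2 vAlphaTexCoord;

    void main() {
        vec3 rgb = texture2D(uTexture, vRgbTexCoord).rgb;
        // The mask is gray, so any channel would do; we pick green.
        float alpha = texture2D(uTexture, vAlphaTexCoord).g;
        // Android expects premultiplied alpha, hence rgb * alpha.
        gl_FragColor = vec4(rgb * alpha, alpha);
    }
""".trimIndent()
```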

We take the color vector of the input texture at the RGB position and the color vector at the alpha position. The output is then a color vector whose RGB components come from the RGB sample and whose alpha comes from the green channel of the alpha sample. We used the green channel, but the red or blue one would work just as well. Note that the R, G and B channels are multiplied by the alpha value: Android expects premultiplied alpha (cf. the first comment of this article).

Implementing the video player

After running into frame-rate issues with Android’s MediaPlayer, we decided to use ExoPlayer as our video player. It maintains a constant frame rate and is quite powerful compared to its Android equivalent. MediaPlayer suffered hard frame-rate drops during rendering, even when used on its own (in a plain view).

Fig.10 Display time per frame for our 60 fps video

And the ExoPlayer setup is quite straightforward:

Fig.11 Implementation of ExoPlayer
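
A minimal sketch of that setup (createPlayer is our own helper name; the SurfaceTexture argument is the one the renderer wraps around its OpenGL ES input texture):

```kotlin
import android.content.Context
import android.graphics.SurfaceTexture
import android.view.Surface
import androidx.annotation.RawRes
import com.google.android.exoplayer2.ExoPlayer
import com.google.android.exoplayer2.MediaItem
import com.google.android.exoplayer2.Player
import com.google.android.exoplayer2.upstream.RawResourceDataSource

fun createPlayer(
    context: Context,
    surfaceTexture: SurfaceTexture,
    @RawRes video: Int,
): ExoPlayer =
    ExoPlayer.Builder(context).build().apply {
        // Decode frames straight into the SurfaceTexture the shader samples.
        setVideoSurface(Surface(surfaceTexture))
        setMediaItem(MediaItem.fromUri(RawResourceDataSource.buildRawResourceUri(video)))
        repeatMode = Player.REPEAT_MODE_ALL
        prepare()
        playWhenReady = true
    }
```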

Wiring everything together

We eventually wrap the wiring in a composable function called TransparentVideo, which passes our renderer implementation to the GLTexture composable and connects it to our media player in its ViewModel.

Fig.12 TransparentVideo’s implementation
Fig.13 And its ViewModel’s implementation
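
A sketch of this wiring, reusing the hypothetical names from the previous sections (TransparentVideoRenderer stands for our renderer implementation, assumed here to report the SurfaceTexture wrapping its input texture through a callback):

```kotlin
import android.app.Application
import androidx.annotation.RawRes
import androidx.compose.runtime.Composable
import androidx.compose.runtime.remember
import androidx.compose.ui.Modifier
import androidx.lifecycle.AndroidViewModel
import androidx.lifecycle.viewmodel.compose.viewModel
import com.google.android.exoplayer2.ExoPlayer

@Composable
fun TransparentVideo(@RawRes video: Int, modifier: Modifier = Modifier) {
    val viewModel: TransparentVideoViewModel = viewModel()
    GLTexture(
        renderer = remember { viewModel.createRenderer(video) },
        modifier = modifier,
    )
}

class TransparentVideoViewModel(application: Application) : AndroidViewModel(application) {
    private var player: ExoPlayer? = null

    fun createRenderer(@RawRes video: Int) =
        TransparentVideoRenderer { surfaceTexture ->
            // Called once the renderer's input texture exists. ExoPlayer must
            // be driven from the main thread, so the real implementation posts
            // here from the GL thread.
            player = createPlayer(getApplication<Application>(), surfaceTexture, video)
        }

    override fun onCleared() {
        player?.release()
    }
}
```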

Result

Aaaand voilà! Now we can use our TransparentVideo composable anywhere in our composable screens, like so:

Fig.14 Usage of TransparentVideo
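
For instance (R.raw.dino and MapBackground are placeholder names for this sketch):

```kotlin
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.runtime.Composable
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier

@Composable
fun MapScreen() {
    Box(modifier = Modifier.fillMaxSize()) {
        MapBackground() // whatever sits behind the video, here a map
        TransparentVideo(
            video = R.raw.dino,
            modifier = Modifier.align(Alignment.Center),
        )
    }
}
```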

Here we load a raw resource: a WebM file in the format defined in the first subsection, encoded using the VP9 codec.

And the result is exactly as intended:

Fig.15 Actual render on a map

Discussions / drawbacks

We may ask ourselves whether videos should be banned from such use cases for performance’s sake. In practice, the approach is quite lightweight and did not add significant power consumption during our tests. Moreover, we can rely on the excellent VP9 codec to encode our videos, which gives us very small asset files.

The only drawback I would point out is the maintainability of the OpenGL-related code, since C-interop calls and 3D framework notions such as fragments, vertices and shaders are not everyday Android material.

Example of implementation

In this repository we implemented this solution as a library that can display videos with transparency, along with a sample app that uses it. Feel free to take a look!

Bibliography

Library trying to solve the problem using color keying (inspiration for the usage of OpenGL): https://github.com/pavelsemak/alpha-movie

Article about a solution on iOS using Metal (inspiration for the alpha mask): https://medium.com/@quentinfasquel/ios-transparent-video-with-coreimage-52cfb2544d54

Android guide about OpenGL: https://developer.android.com/guide/topics/graphics/opengl

OpenGL guide about blending: https://learnopengl.com/Advanced-OpenGL/Blending

OpenGL documentation for glBlendFunc: https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glBlendFunc.xhtml
