Pre-caching adaptive video stream in a playlist with ExoPlayer API

Tai Than
5 min read · Aug 23, 2021


Recently, I have been working on improving video playlist performance for my company’s Android project. The playlist is a list of videos that can be scrolled back and forth to change the currently focused video for playback. One of the things I care about is the long wait after switching to a new video before it becomes playable. This preparation time can be reduced significantly by pre-caching the videos that are going to be played. However, my company’s project uses HLS for video streaming, which makes caching the video stream a little more complicated. After a while browsing the Internet, I was finally able to implement it, and now I want to share my implementation with you guys.

Analysis

The basic idea of pre-caching is to download the beginning of the video that is going to be played into a cache. When the user scrolls to that video, the player detects that the requested data already exists in the cache and uses it for playback rather than fetching it from the network. While playing, the player keeps loading subsequent video segments from the cache, and once it reaches the end of the cached part, it seamlessly switches to the network source.

In the case of an adaptive video stream, pre-caching is more complicated to implement, as there are many tracks (options) to choose from in each group (video resolution, audio, subtitle, etc.). Pre-caching the beginning of every track is overkill, as the cache would become unnecessarily large. Instead, we should pre-cache only the minimal number of tracks needed for the video to be playable, i.e. one track per group. However, this introduces a different issue: when the player starts preparing the video, it may not choose the exact tracks that have been pre-cached, and may start downloading the tracks it prefers from the network (check this ExoPlayer GitHub issue for more details). To address this, we can specify the tracks to pre-cache, then force the player to choose these same tracks when it starts preparing the video for playback. This approach gives up some of the benefit of adaptive streaming, because it prevents the player from dynamically switching between available tracks based on its conditions and constraints, so take that into consideration. In my case, changing the playback option on the fly is not that necessary, so the trade-off is fine.

Implementation

1. Construct a CacheDataSourceFactory instance

The metadata for the video to be prepared is passed to the player via a MediaSource instance (or MediaItem, if using newer ExoPlayer versions), which can be constructed using a DataSource.Factory. Normally, we would simply use a DefaultDataSourceFactory to construct a MediaSource, which makes the player download the video stream from the network and no caching is applied by default. To construct a data source that can utilize cache, we need to use a CacheDataSourceFactory instead.

To construct a CacheDataSourceFactory, we need to build several parameters including:

  • cache: a Cache instance, an in-memory representation of the storage cache. To construct a Cache, we need to provide the path to its location, a CacheEvictor and a DatabaseProvider. The File path should be a directory dedicated to ExoPlayer, as the cache may delete any unrecognized file in that directory. The CacheEvictor is responsible for evicting content from the cache according to some strategy. In our case we should use a LeastRecentlyUsedCacheEvictor, as the cached video stream doesn’t need to stay in the cache permanently. If you prefer otherwise, you can use a NoOpCacheEvictor instead. The third parameter, the DatabaseProvider, stores the metadata of the cached videos so that they can be managed by the Cache. An important note: there should be only one Cache instance operating on a particular directory at a time. The Cache locks the directory once constructed, and building another Cache instance on the same directory (before calling cache.release() on the previous one) will throw a RuntimeException.
  • upstreamDataSourceFactory: a DataSource.Factory for obtaining video content from the upstream (network) source, if not found in cache. For simplicity we can use a DefaultDataSourceFactory.
  • The next 3 parameters are cacheReadDataSourceFactory, cacheWriteDataSinkFactory and flags, which are used for reading from the cache, writing to the cache and customizing the cache usage strategy respectively. We don’t need much customization here, so I’m going to construct them with their defaults. The last parameter is the eventListener, which is useful for debugging the cache.
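
The parameters listed above could be wired together roughly like this. It is a minimal sketch that assumes ExoPlayer 2.11.x (where CacheDataSourceFactory is still the non-deprecated API) and an Android Context. The VideoCache holder class, the "video_cache" directory name, the 100 MB limit, the "playlist-demo" user agent and the chosen flags are all my own placeholders for illustration, not values taken from the project.

```kotlin
import android.content.Context
import android.util.Log
import com.google.android.exoplayer2.database.ExoDatabaseProvider
import com.google.android.exoplayer2.upstream.DataSource
import com.google.android.exoplayer2.upstream.DefaultDataSourceFactory
import com.google.android.exoplayer2.upstream.FileDataSource
import com.google.android.exoplayer2.upstream.cache.Cache
import com.google.android.exoplayer2.upstream.cache.CacheDataSink
import com.google.android.exoplayer2.upstream.cache.CacheDataSinkFactory
import com.google.android.exoplayer2.upstream.cache.CacheDataSource
import com.google.android.exoplayer2.upstream.cache.CacheDataSourceFactory
import com.google.android.exoplayer2.upstream.cache.LeastRecentlyUsedCacheEvictor
import com.google.android.exoplayer2.upstream.cache.SimpleCache
import com.google.android.exoplayer2.util.Util
import java.io.File

// Hypothetical holder; create it once (e.g. in Application) because only one
// Cache instance may operate on a given directory at a time.
class VideoCache(context: Context) {
    private val appContext = context.applicationContext

    // Dedicated directory, LRU eviction capped at an assumed 100 MB.
    val cache: Cache = SimpleCache(
        File(appContext.cacheDir, "video_cache"),
        LeastRecentlyUsedCacheEvictor(100L * 1024 * 1024),
        ExoDatabaseProvider(appContext)
    )

    // Network source used when the requested data is not in the cache.
    val upstreamDataSourceFactory: DataSource.Factory =
        DefaultDataSourceFactory(appContext, Util.getUserAgent(appContext, "playlist-demo"))

    val cacheDataSourceFactory: DataSource.Factory = CacheDataSourceFactory(
        cache,
        upstreamDataSourceFactory,
        FileDataSource.Factory(),                                         // cache reads
        CacheDataSinkFactory(cache, CacheDataSink.DEFAULT_FRAGMENT_SIZE), // cache writes
        // Assumed flags: wait for a locked cache span, fall back to network on cache errors.
        CacheDataSource.FLAG_BLOCK_ON_CACHE or CacheDataSource.FLAG_IGNORE_CACHE_ON_ERROR,
        object : CacheDataSource.EventListener {
            override fun onCachedBytesRead(cacheSizeBytes: Long, cachedBytesRead: Long) {
                Log.d("VideoCache", "read $cachedBytesRead bytes from cache of $cacheSizeBytes bytes")
            }

            override fun onCacheIgnored(reason: Int) {
                Log.d("VideoCache", "cache ignored, reason=$reason")
            }
        }
    )

    // Unlocks the cache directory so another Cache can be built on it later.
    fun release() = cache.release()
}
```

On newer ExoPlayer versions these factory classes are deprecated or replaced, so the constructor calls would need to be adapted.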

2. Build a MediaSource for playing

The constructed CacheDataSourceFactory instance, together with the video stream URL, is used to create a new MediaSource each time one is required (usually one MediaSource per video).
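
A minimal sketch of this step, again assuming ExoPlayer 2.11.x and an HLS stream. The buildMediaSource helper is a hypothetical name of mine, and the cacheStreamKeys values (track index 1 in groups 0 to 4) are only an illustration; the right indices depend on your own manifest.

```kotlin
import android.net.Uri
import com.google.android.exoplayer2.offline.StreamKey
import com.google.android.exoplayer2.source.MediaSource
import com.google.android.exoplayer2.source.hls.HlsMediaSource
import com.google.android.exoplayer2.upstream.DataSource

// One key per group: StreamKey(groupIndex, trackIndex).
// Assumption: the stream has at least 5 groups with at least 2 tracks each.
val cacheStreamKeys: List<StreamKey> =
    (0 until 5).map { groupIndex -> StreamKey(groupIndex, 1) }

// Hypothetical helper: builds a MediaSource restricted to the pre-cached tracks.
fun buildMediaSource(
    cacheDataSourceFactory: DataSource.Factory,
    videoUrl: String
): MediaSource =
    HlsMediaSource.Factory(cacheDataSourceFactory)
        .setStreamKeys(cacheStreamKeys)      // filter the manifest to these tracks
        .setAllowChunklessPreparation(true)  // skip chunk loading during preparation when possible
        .createMediaSource(Uri.parse(videoUrl))
```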

When creating a new MediaSource, it’s vitally important to set the stream keys on the MediaSource.Factory. The stream keys are used to filter the adaptive video stream’s manifest, so that the tracks to be prepared are fixed to the tracks we are going to pre-cache. This also prevents the player from switching, during playback, to tracks that were not selected by the stream keys, for as long as it uses this MediaSource instance. The first StreamKey constructor parameter is the group index (index of the resolution, audio, caption, etc. group), and the second one is the track index (the option within each group). For example, suppose the adaptive video stream has at least 5 groups and at least 2 tracks per group. Building cacheStreamKeys from StreamKey(groupIndex, 1) for group indices 0 to 4 means we cache the track at index 1 (the second track) of each group, for 5 groups in total. Calling setAllowChunklessPreparation() enables chunkless preparation where possible, i.e. when the master playlist (manifest) provides sufficient information, which reduces the preparation time.

3. Pre-cache the video stream

To pre-cache the video stream, we will be using the APIs provided by the ExoPlayer library itself. First we need to construct a Downloader used for downloading the video content into the cache. The cache, upstreamDataSourceFactory and cacheDataSourceFactory parameters should be the same instances used for creating the MediaSources.
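
One way to construct such a Downloader for an HLS stream is sketched below, assuming ExoPlayer 2.11.x. The buildDownloader helper is a hypothetical name. Note that this sketch uses the two-argument DownloaderConstructorHelper, which takes only the cache and the upstream factory and builds its own cache data sources internally, so it may differ from the exact constructor used in the project.

```kotlin
import android.net.Uri
import com.google.android.exoplayer2.offline.DownloaderConstructorHelper
import com.google.android.exoplayer2.offline.StreamKey
import com.google.android.exoplayer2.source.hls.offline.HlsDownloader
import com.google.android.exoplayer2.upstream.DataSource
import com.google.android.exoplayer2.upstream.cache.Cache

// Hypothetical helper: the cache, upstream factory and stream keys should be the
// same ones used to build the MediaSource, so the player finds what was downloaded.
fun buildDownloader(
    cache: Cache,
    upstreamDataSourceFactory: DataSource.Factory,
    videoUrl: String,
    streamKeys: List<StreamKey>
): HlsDownloader =
    HlsDownloader(
        Uri.parse(videoUrl),
        streamKeys,
        DownloaderConstructorHelper(cache, upstreamDataSourceFactory)
    )
```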

To pre-cache the video stream, simply call downloader.download() on a worker thread and cancel it once the pre-cached part is large enough. In my implementation, I use Kotlin coroutines to run the download asynchronously in the background.
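
A minimal sketch of that idea, assuming ExoPlayer 2.11.x (where download() takes a progress listener) and kotlinx.coroutines. The scope and maxBytesToCache parameters are my own illustrative names, and the error handling is deliberately bare.

```kotlin
import com.google.android.exoplayer2.offline.Downloader
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
import kotlinx.coroutines.launch
import java.io.IOException

// Downloads the start of the stream into the cache and stops after
// roughly maxBytesToCache bytes (an assumed, app-specific budget).
fun preCacheVideo(
    scope: CoroutineScope,
    downloader: Downloader,
    maxBytesToCache: Long
): Job = scope.launch(Dispatchers.IO) {
    try {
        // download() blocks the worker thread until it finishes or is cancelled.
        downloader.download { _, bytesDownloaded, _ ->
            if (bytesDownloaded >= maxBytesToCache) {
                downloader.cancel()
            }
        }
    } catch (e: InterruptedException) {
        // Expected: this is how download() reports that cancel() was called.
    } catch (e: IOException) {
        // Network or cache failure; playback simply falls back to the network.
    }
}
```

Because download() is a blocking call, cancelling the returned Job alone is not enough to stop it; call downloader.cancel() as well when the pre-cache must be aborted early.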

This preCacheVideo() method should be invoked to pre-cache the next video in the list whenever the user scrolls to a newly focused one. In addition, pre-caching should be cancelled when the user scrolls to the targeted video, because ExoPlayer takes care of downloading more segments while playing that video, and also to prevent potential conflicts when both agents write to the same cache simultaneously.
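
The scheduling rule described above can be sketched independently of ExoPlayer. Everything here is hypothetical scaffolding: PreCacheTask, PlaylistPreCacher and the startPreCache callback are names I made up, and in a real app the callback would build a downloader, start preCacheVideo() and return a handle that cancels it.

```kotlin
// Hypothetical handle for a running pre-cache; cancel() should be safe to call
// even after the download has already finished.
fun interface PreCacheTask {
    fun cancel()
}

class PlaylistPreCacher(
    private val videoUrls: List<String>,
    private val startPreCache: (url: String) -> PreCacheTask
) {
    // Pre-cache tasks keyed by playlist index.
    private val tasks = mutableMapOf<Int, PreCacheTask>()

    // Call whenever the focused (playing) video changes.
    fun onVideoFocused(index: Int) {
        // The player now downloads this video itself, so stop pre-caching it.
        tasks.remove(index)?.cancel()

        // Start pre-caching the next video unless a task for it already exists.
        val next = index + 1
        if (next < videoUrls.size && next !in tasks) {
            tasks[next] = startPreCache(videoUrls[next])
        }
    }

    // Call when the playlist screen goes away.
    fun release() {
        tasks.values.forEach { it.cancel() }
        tasks.clear()
    }
}
```

This version only looks one video ahead, matching the description above; pre-caching the previous video as well would be a straightforward extension if users often scroll backwards.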

For a detailed implementation, check the source code below:

Thanks for reading my very first story. Please give it a clap if you find it useful, and subscribe for more performance improvement tips and tricks on Android app development. The next story will be about improving scrolling performance in a ViewPager with Fragments that have complex layouts.

Happy coding :)
