Tianyu Lang & Nick DeChant | Pinterest engineers, Video & Image Platform
Pinterest is one of the most image-heavy services online, and so it’s crucial that we constantly work to improve the speed and quality of those images, whether static, GIF or video. As part of the video and image platform team’s work, we uncovered that converting GIFs to videos decreased load time, increased playback smoothness and reduced app crashes.
Identifying improvements for GIFs
A small team built support for GIFs in 2014 after a night of hacking at Makeathon. Over time we’ve worked to improve load times and use less memory to avoid crashes, which hurt the user experience and engagement.
GIFs are much larger than images and videos and therefore take longer to download. Additionally, the GIF library on our iOS app uses a large amount of memory to decode and display the animation. Because GIFs are large and need a memory-intensive library to decode from compressed format into raw pixels, our iOS app would run out of memory and crash.
To solve this, we began converting GIFs to videos after building out our native video platform in 2016.
GIF vs. videos
The GIF format was created in 1987 as an efficient format to allow slow modems to download large images. Netscape Navigator 2.0 first supported animated looping GIFs in 1995. GIFs are animated very simply by flipping a series of individual images really fast. You can take a look at its Wikipedia page for more history.
For example, the Mario GIF above plays the following 3 images with 0.54 seconds in between to achieve the illusion of animation.
However, GIF’s simple animation support tends to store repetitive information and results in large files. For example, the following common area doesn’t change in any of the frames, but is still stored.
This shortcoming can be reduced by using the right “disposal method” at image creation time. The correct disposal method reduces file size by getting rid of repeated information, so only the first frame needs to store the “common area” and frames 2 and 3 store only the difference. However, few GIF creation tools take advantage of this optimization.
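As a rough illustration of the optimization above: a GIF frame may cover only a sub-rectangle of the canvas, so when the previous frame is left in place, an encoder only has to store the patch that changed. The sketch below finds that patch for two frames. The `Frame` representation (rows of palette indices) and the `changed_region` helper are hypothetical names chosen for this example, not part of any GIF library.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for decoded GIF pixels: rows of palette indices.
Frame = List[List[int]]


def changed_region(prev: Frame, curr: Frame) -> Optional[Tuple[int, int, int, int]]:
    """Return (left, top, width, height) of the smallest rectangle that covers
    every pixel differing between two same-sized frames, or None if the
    frames are identical."""
    # Rows that contain at least one changed pixel.
    rows = [y for y, (a, b) in enumerate(zip(prev, curr)) if a != b]
    if not rows:
        return None
    # Columns of every changed pixel within those rows.
    cols = [
        x
        for y in rows
        for x, (p, c) in enumerate(zip(prev[y], curr[y]))
        if p != c
    ]
    left, right = min(cols), max(cols)
    top, bottom = rows[0], rows[-1]
    return left, top, right - left + 1, bottom - top + 1


# Two 4x4 frames that differ only in a 2x1 patch on the second row.
frame1 = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
frame2 = [[0, 0, 0, 0], [0, 2, 2, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
region = changed_region(frame1, frame2)  # (1, 1, 2, 1)
```

Here the second frame would need to store 2 pixels instead of 16; the unchanged “common area” is never repeated.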
H.264 videos were first introduced in 2003, then improved with each version until 2014. Instead of storing individual images like GIFs, H.264 videos store full images (keyframes), and deltas (differential frames). Below is a simplified illustration of how frames are stored in H.264 videos.
Frame 1 is a “full” image while frames 2 and 3 are deltas with respect to the full image. If there’s a drastic change on frame 4, then frame 4 would be a new “full” image.
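The keyframe-and-delta idea can be sketched in a few lines. This is a toy model only: real H.264 encoders use block-based motion prediction rather than per-pixel differences. The sketch also assumes each delta is taken against the immediately preceding frame, and the `max_changed_ratio` threshold that triggers a new keyframe is a made-up parameter for illustration.

```python
from typing import Dict, List, Tuple, Union

Pixels = List[int]      # a flattened frame
Delta = Dict[int, int]  # pixel index -> new value
Encoded = Tuple[str, Union[Pixels, Delta]]


def encode(frames: List[Pixels], max_changed_ratio: float = 0.5) -> List[Encoded]:
    """Store the first frame in full, then store only the changed pixels,
    unless too much changed, in which case store a new full frame."""
    encoded: List[Encoded] = []
    previous = None
    for frame in frames:
        if previous is None:
            encoded.append(("key", list(frame)))
        else:
            delta = {i: v for i, (p, v) in enumerate(zip(previous, frame)) if p != v}
            if len(delta) > max_changed_ratio * len(frame):
                encoded.append(("key", list(frame)))  # drastic change
            else:
                encoded.append(("delta", delta))
        previous = frame
    return encoded


def decode(encoded: List[Encoded]) -> List[Pixels]:
    """Rebuild every frame by applying deltas on top of the last decoded frame."""
    frames: List[Pixels] = []
    current: Pixels = []
    for kind, payload in encoded:
        if kind == "key":
            current = list(payload)
        else:
            current = list(current)
            for index, value in payload.items():
                current[index] = value
        frames.append(current)
    return frames


frames = [[1, 1, 1, 1], [1, 1, 1, 2], [1, 1, 3, 2], [9, 9, 9, 9]]
encoded = encode(frames)
kinds = [kind for kind, _ in encoded]  # ["key", "delta", "delta", "key"]
```

Frames 2 and 3 each store a single changed pixel, while frame 4 changes everything and becomes a new keyframe, mirroring the description above.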
For our purposes, we compared GIFs and videos along three characteristics that affect load speed and video smoothness: streaming, adaptive bitrate and size.
Streaming
Streaming affects how long users wait before the playback starts. Better streams reduce perceived load time. If content isn’t streamed, it needs to be downloaded in its entirety. Think of the difference between waiting for a DVD from Netflix to arrive in the mail versus streaming it on your laptop or phone.
Both GIFs and H.264 videos can be streamed. However, H.264 videos can use the more recently developed HLS (HTTP Live Streaming) technology by Apple. HLS was first introduced in 2009 and is now supported natively on web, Android and iOS. With HLS, a video is first broken into sequential chunks, which are then downloaded and played chunk by chunk on the playback device. HLS playlists are fetched at the beginning of the streaming session. They guide the playback device by providing the location of each video segment. The diagram below demonstrates the video playback process with HLS.
These video segments also allow the video to be cached more efficiently in our CDN. If only a part of the video is heavily accessed, only the corresponding segments need to be cached. For GIFs, however, the entire file must be cached.
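To make the playlist idea concrete, here is a minimal sketch that writes an HLS media playlist for a list of segments and reads the segment locations back out, which is roughly what a player does before fetching chunks. The segment file names and durations are invented for the example, and the helper names are hypothetical; only the `#EXT...` tags come from the HLS format itself.

```python
import math
from typing import List, Tuple


def build_media_playlist(segments: List[Tuple[str, float]]) -> str:
    """Build a playlist for (uri, duration_in_seconds) pairs, in playback order."""
    # The target duration must be an integer that no segment duration exceeds.
    target = math.ceil(max(duration for _, duration in segments))
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",  # version 3 allows fractional segment durations
        f"#EXT-X-TARGETDURATION:{target}",
        "#EXT-X-MEDIA-SEQUENCE:0",
    ]
    for uri, duration in segments:
        lines.append(f"#EXTINF:{duration:.3f},")
        lines.append(uri)
    lines.append("#EXT-X-ENDLIST")  # no more segments will be appended
    return "\n".join(lines) + "\n"


def segment_uris(playlist: str) -> List[str]:
    """Return segment locations: every non-empty line that is not a tag."""
    return [
        line for line in playlist.splitlines()
        if line and not line.startswith("#")
    ]


# Hypothetical segments for a 10.5-second video.
playlist = build_media_playlist([("seg0.ts", 4.0), ("seg1.ts", 4.0), ("seg2.ts", 2.5)])
uris = segment_uris(playlist)  # ["seg0.ts", "seg1.ts", "seg2.ts"]
```

Because each segment has its own URL, a CDN can cache and evict segments independently, which is the caching advantage described above.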
Adaptive Bitrate Streaming
Adaptive bitrate streaming is a technique that enhances the user experience by adjusting the quality of the media to match network conditions. It improves the smoothness of video playback by switching to a lower quality version of the video if network bandwidth suddenly drops, and vice versa.
Since HLS videos are already in chunks, adaptive bitrate streaming comes almost for free. We only need to generate variants of different bitrates for the same video, then switch between them based on the network conditions. The following diagram is a simplified illustration of how HLS implements adaptive bitrate streaming.
Adaptive bitrate streaming is achievable for GIFs in principle: we could extract the images inside a GIF, generate alternatives of different qualities and create a GIF player that switches among them adaptively. However, this requires a lot of work, since you need to build the GIF player on your own. In contrast, all native video players support adaptive bitrate videos and will automatically switch quality/bitrate for the best experience.
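The switching decision itself can be sketched simply. The example below picks the highest-bitrate variant that fits within the measured throughput, falling back to the lowest one when nothing fits. It is an assumption-laden toy: the variant names, bitrates and the `safety` margin are invented, and native players use more elaborate heuristics than this.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    name: str
    bandwidth_bps: int  # bitrate this variant needs, in bits per second


def pick_variant(variants: List[Variant], measured_bps: float, safety: float = 0.8) -> Variant:
    """Pick the best variant whose bitrate fits within a fraction of the
    measured throughput; fall back to the lowest-bitrate variant."""
    budget = measured_bps * safety  # leave headroom for throughput dips
    ordered = sorted(variants, key=lambda v: v.bandwidth_bps)
    best = ordered[0]
    for variant in ordered:
        if variant.bandwidth_bps <= budget:
            best = variant
    return best


# Hypothetical variants of the same video.
variants = [
    Variant("240p", 400_000),
    Variant("480p", 1_200_000),
    Variant("720p", 2_500_000),
]
fast = pick_variant(variants, measured_bps=4_000_000)  # 720p
slow = pick_variant(variants, measured_bps=1_000_000)  # 240p
```

A player would re-run a decision like this before each segment download, so quality can step down when bandwidth drops and step back up when it recovers.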
Size
File size impacts load speed in the most direct way. The larger the file, the longer the download time.
H.264 videos are much smaller than their GIF counterparts, which we confirmed with an experiment. We randomly picked 10,000 animated GIFs ranging in size from 3.1 KB to 78 MB and converted them to H.264 videos. We used FFmpeg 3.2 and the following command for the conversion.
ffmpeg -i "$gifPath" -movflags faststart -pix_fmt yuv420p -vf "scale=trunc(iw/2)*2:trunc(ih/2)*2" -c:v libx264 "$videoPath"
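For batch conversion, the same command can be driven from a script. The sketch below is one possible wrapper, not the pipeline used for the experiment: the function names are hypothetical, and it assumes an `ffmpeg` binary with libx264 support is on the PATH. Passing the arguments as a list avoids shell quoting problems with the `-vf` expression.

```python
import subprocess
from typing import List


def build_ffmpeg_command(gif_path: str, video_path: str) -> List[str]:
    """Return the argument list for the GIF-to-H.264 conversion shown above."""
    return [
        "ffmpeg",
        "-i", gif_path,
        # Put the index at the front of the file so playback can start
        # before the whole video has downloaded.
        "-movflags", "faststart",
        # A widely supported pixel format for H.264 playback.
        "-pix_fmt", "yuv420p",
        # Round width and height down to even numbers, which this pixel
        # format requires.
        "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",
        "-c:v", "libx264",
        video_path,
    ]


def convert_gif(gif_path: str, video_path: str) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_command(gif_path, video_path), check=True)


command = build_ffmpeg_command("mario.gif", "mario.mp4")
```

Calling `convert_gif("mario.gif", "mario.mp4")` would then run the conversion on a hypothetical input file.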
The following graph demonstrates the distribution of the size reduction after transcoding these 10,000 GIFs to videos. The larger the ratio, the better the result.
About 93 percent of these GIFs shrank to half their original size or less as videos. For almost 50 percent of the GIFs, the video counterpart is only one-eighth the size of the original. Additionally, there’s negligible quality loss in the process. Below is a side-by-side comparison of a GIF with an original size of 4.1 MB and its video counterpart of 1.4 MB.
After these comparisons between GIFs and videos, we were confident in our decision to convert GIFs to videos. Let’s see the power of videos in action. We played the same three animations on the same device under the same conditions twice, once as GIFs and once as videos. The videos ran smoothly and finished in 26 seconds while the GIFs lagged and finished in 83 seconds. Below is a comparison:
Although the results are great, there’s a catch. GIFs aren’t always larger than videos. If the GIF has huge delays between frames, the converted video can be significantly larger than the original GIF. For example, the GIF below has only 162 frames and is 6.6 MB, yet its video counterpart is over 130 MB. Between each frame of this GIF, there’s a 655.35-second delay. That’s more than 10 minutes! By default, FFmpeg creates a “full” image (keyframe) every several seconds. So this GIF will result in many more keyframes than necessary.
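Some quick arithmetic shows the scale of the problem. A GIF stores each frame delay in hundredths of a second in a 16-bit field, so 655.35 seconds is the largest delay the format allows. The sketch below totals the play time for the example above and flags GIFs with unusually long delays. The `max_delay_s` threshold and the helper names are hypothetical, and this is only a way to spot such files, not the solution mentioned below.

```python
from typing import List

# The delay field is 16 bits wide and counts hundredths of a second.
GIF_MAX_DELAY_CS = 0xFFFF  # 65535 centiseconds = 655.35 seconds


def total_duration_seconds(delays_cs: List[int]) -> float:
    """Total play time for one loop, given per-frame delays in centiseconds."""
    return sum(delays_cs) / 100


def has_long_delays(delays_cs: List[int], max_delay_s: float = 10.0) -> bool:
    """True if any frame stays on screen longer than max_delay_s seconds."""
    return any(delay / 100 > max_delay_s for delay in delays_cs)


# The example above: 162 frames, each shown for the maximum delay.
delays = [GIF_MAX_DELAY_CS] * 162
duration = total_duration_seconds(delays)  # about 106,166.7 seconds
hours = duration / 3600                    # roughly 29.5 hours
```

A single loop of this 162-frame GIF lasts more than a day of wall-clock time, so a video that fills that span with regularly spaced keyframes ends up far larger than the handful of images it actually shows.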
We’re currently looking into building a solution to handle these types of images. Keep an eye out for future updates and enjoy the improved GIF experience on Pinterest!
Acknowledgements: The main contributors to the GIF->video transcoding project include Tianyu Lang, Nick DeChant, Rui Zhang and Norbert Potocki from the Video & Image Platform team.