Mastering Visual Brilliance: Crafting Your VOD Encoding Ladder with the Artistry of FFmpeg

Sandipan Mondal
Published in Media Cloud Tech
Mar 4, 2024

In this guide, we will explore how to use FFmpeg to craft an optimal encoding ladder for your Video on Demand (VOD) content, focusing on the H.264, H.265, and VP9 codecs.

You can download FFmpeg from the official website: https://www.ffmpeg.org/download.html

Installation on macOS:
To install FFmpeg on macOS, you can use Homebrew, which is a popular package manager for macOS. If you don’t have Homebrew installed, you can install it by following the instructions on the official Homebrew website: https://brew.sh/

Once you have Homebrew installed, open Terminal and run the following commands to install FFmpeg:

1. Open Terminal (you can find it in Applications > Utilities or search for it using Spotlight).

2. Install Homebrew (if you haven’t already) by pasting the following command and pressing Enter:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

3. After Homebrew is installed, run the following command to install FFmpeg:

brew install ffmpeg

4. Wait for the installation process to complete. Homebrew will download and install FFmpeg and its dependencies.

Once the installation is finished, you should have FFmpeg installed on your macOS system. You can test it by running the following command in Terminal:

ffmpeg -version

This should display information about the installed FFmpeg version.

That’s it! You’ve successfully installed FFmpeg on your Mac using Homebrew.

The installation processes for Windows and Linux are covered here:

Windows:
https://www.redswitches.com/blog/install-ffmpeg-on-windows/

Linux:
https://www.geeksforgeeks.org/how-to-install-ffmpeg-in-linux/

H.264 multi-bitrate encoding:

H.264 (AVC):

H.264, also known as AVC, is a widely used video compression standard known for its high efficiency in compressing and transmitting high-definition video. It’s versatile, with broad industry adoption for applications like streaming, broadcasting, and Blu-ray. Despite newer standards, H.264 remains popular due to its compatibility and established infrastructure.

You can either use a script for your encoding tasks or run the commands individually in the terminal. I am using a shell script here.

#!/bin/bash
ffmpeg -i input.mp4 \
-vf "scale=640:360" -c:a aac -b:a 64k -c:v h264 -b:v 500k -maxrate 1000k -bufsize 1000k -profile:v main -sc_threshold 0 -g 48 -keyint_min 48 -pass 1 -f null /dev/null \
-vf "scale=960:540" -c:a aac -b:a 64k -c:v h264 -b:v 800k -maxrate 1600k -bufsize 1600k -profile:v main -sc_threshold 0 -g 48 -keyint_min 48 -pass 1 -f null /dev/null \
-vf "scale=1280:720" -c:a aac -b:a 64k -c:v h264 -b:v 1500k -maxrate 3000k -bufsize 3000k -profile:v main -sc_threshold 0 -g 48 -keyint_min 48 -pass 1 -f null /dev/null \
-vf "scale=1920:1080" -c:a aac -b:a 64k -c:v h264 -b:v 3000k -maxrate 6000k -bufsize 6000k -profile:v main -sc_threshold 0 -g 48 -keyint_min 48 -pass 1 -f null /dev/null
ffmpeg -i input.mp4 \
-vf "scale=640:360" -c:a aac -b:a 64k -c:v h264 -b:v 500k -maxrate 1000k -bufsize 1000k -profile:v main -sc_threshold 0 -g 48 -keyint_min 48 -pass 2 -hls_time 6 -hls_playlist_type vod -hls_segment_filename 360p_%03d.ts 360p.m3u8 \
-vf "scale=960:540" -c:a aac -b:a 64k -c:v h264 -b:v 800k -maxrate 1600k -bufsize 1600k -profile:v main -sc_threshold 0 -g 48 -keyint_min 48 -pass 2 -hls_time 6 -hls_playlist_type vod -hls_segment_filename 540p_%03d.ts 540p.m3u8 \
-vf "scale=1280:720" -c:a aac -b:a 64k -c:v h264 -b:v 1500k -maxrate 3000k -bufsize 3000k -profile:v main -sc_threshold 0 -g 48 -keyint_min 48 -pass 2 -hls_time 6 -hls_playlist_type vod -hls_segment_filename 720p_%03d.ts 720p.m3u8 \
-vf "scale=1920:1080" -c:a aac -b:a 64k -c:v h264 -b:v 3000k -maxrate 6000k -bufsize 6000k -profile:v main -sc_threshold 0 -g 48 -keyint_min 48 -pass 2 -hls_time 6 -hls_playlist_type vod -hls_segment_filename 1080p_%03d.ts 1080p.m3u8
# Create the master playlist manually
echo "#EXTM3U" > master.m3u8
echo "#EXT-X-VERSION:3" >> master.m3u8
echo "#EXT-X-STREAM-INF:BANDWIDTH=500000,RESOLUTION=640x360" >> master.m3u8
echo "360p.m3u8" >> master.m3u8
echo "#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=960x540" >> master.m3u8
echo "540p.m3u8" >> master.m3u8
echo "#EXT-X-STREAM-INF:BANDWIDTH=1500000,RESOLUTION=1280x720" >> master.m3u8
echo "720p.m3u8" >> master.m3u8
echo "#EXT-X-STREAM-INF:BANDWIDTH=3000000,RESOLUTION=1920x1080" >> master.m3u8
echo "1080p.m3u8" >> master.m3u8

This Bash script utilizes FFmpeg to perform two-pass encoding for creating an HTTP Live Streaming (HLS) manifest and segment files at different resolutions. HLS is a widely used protocol for streaming video content over the internet. Adjust the resolution and bitrate as per your need.
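One quick way to sanity-check a ladder like this before adjusting it is bits per pixel (bpp): video bitrate divided by width × height × frame rate. Roughly 0.06–0.10 bpp is a common ballpark for x264 VOD at these resolutions; this is a heuristic, not a rule, and the sketch below assumes a 24 fps source:

```shell
#!/bin/bash
# Bits-per-pixel sanity check for the ladder above (assumes a 24 fps source).
fps=24
for rung in "640 360 500" "960 540 800" "1280 720 1500" "1920 1080 3000"; do
  set -- $rung   # $1=width, $2=height, $3=video bitrate in kbps
  awk -v w="$1" -v h="$2" -v kbps="$3" -v fps="$fps" \
    'BEGIN { printf "%dx%d @ %dk: %.3f bpp\n", w, h, kbps, kbps*1000/(w*h*fps) }'
done
```

If a rung lands well below about 0.05 bpp it is likely starved for bits; well above about 0.1, you can probably trim its bitrate without a visible quality loss.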

Here’s a breakdown of the script:

First Pass:

The script runs FFmpeg with the input video file input.mp4.
It scales the video to different resolutions (640x360, 960x540, 1280x720, 1920x1080) using the scale filter.
It sets audio codec (aac), audio bitrate (64k), video codec (h264), and video bitrate (500k, 800k, 1500k, 3000k) for each resolution.
Additional parameters such as maxrate, bufsize, profile:v, sc_threshold, g, and keyint_min are also specified.
The -pass 1 option indicates the first pass of a two-pass process. The output is sent to /dev/null since it’s just a dummy run to collect statistics for the second pass.
Second Pass:

Similar to the first pass, it scales the video to different resolutions.
This time, it uses the -pass 2 option to perform the actual encoding.
It adds HLS-specific parameters (-hls_time, -hls_playlist_type, -hls_segment_filename) to generate the HLS output (*.m3u8 and segment files) for each resolution.
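The keyframe options and the segment length are tied together: with -g 48 and -keyint_min 48 on a 24 fps source, every GOP is exactly 2 seconds, so each 6-second segment (-hls_time 6) begins on a keyframe. The arithmetic, assuming 24 fps (scale -g accordingly for other frame rates):

```shell
#!/bin/bash
# Keyframe/segment alignment check (assumes a 24 fps source).
fps=24; gop=48; hls_time=6
gop_seconds=$(( gop / fps ))                    # 48 frames / 24 fps = 2 s per GOP
gops_per_segment=$(( hls_time / gop_seconds ))  # 6 s / 2 s = 3 GOPs per segment
echo "GOP length: ${gop_seconds}s, GOPs per segment: ${gops_per_segment}"
```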
Master Playlist Creation:

After completing the second pass for all resolutions, the script manually creates a master playlist file (master.m3u8).
It specifies the different streams available in the master playlist, including the bandwidth and resolution for each stream.
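One caveat on the hand-written master playlist: the HLS spec defines BANDWIDTH as the peak rate of the whole variant, so video maxrate plus audio bitrate is a closer value than the target video bitrate used above. A rough calculation for these renditions:

```shell
#!/bin/bash
# Approximate HLS BANDWIDTH values: video maxrate + audio bitrate, in bits/s.
audio_k=64
for rung in "360p 1000" "540p 1600" "720p 3000" "1080p 6000"; do
  set -- $rung   # $1=rendition name, $2=video maxrate in kbps
  echo "$1 BANDWIDTH=$(( ($2 + audio_k) * 1000 ))"
done
```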
Output Structure:

The script outputs HLS manifest and segment files for each resolution, named as 360p.m3u8, 540p.m3u8, 720p.m3u8, and 1080p.m3u8, along with corresponding segment files (360p_*.ts, 540p_*.ts, 720p_*.ts, 1080p_*.ts).
In summary, this script is a template for creating an HLS stream with multiple quality levels (resolutions) using FFmpeg’s two-pass encoding. It outputs a master playlist (master.m3u8) and individual playlist files for each resolution, along with the corresponding video segment files.
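The repeated echo lines can also be generated with a loop; this sketch produces the same master.m3u8 content as the script above:

```shell
#!/bin/bash
# Generate the same master playlist as above with a loop.
printf '#EXTM3U\n#EXT-X-VERSION:3\n' > master.m3u8
for rung in "500000 640x360 360p" "800000 960x540 540p" \
            "1500000 1280x720 720p" "3000000 1920x1080 1080p"; do
  set -- $rung   # $1=bandwidth, $2=resolution, $3=rendition name
  {
    echo "#EXT-X-STREAM-INF:BANDWIDTH=$1,RESOLUTION=$2"
    echo "$3.m3u8"
  } >> master.m3u8
done
```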

Encoding with the H.265 codec:

H.265 (HEVC):

H.265, or High Efficiency Video Coding (HEVC), is a video compression standard that improves compression efficiency over its predecessor (H.264). Its importance lies in its ability to deliver higher video quality at lower bit rates, making it ideal for 4K and 8K content, reducing bandwidth requirements for streaming, saving storage space, and gaining widespread industry adoption.

Now we will create an encoding ladder that produces H.265 output. Here I’ve used HLS as the output container; you can also use fragmented MP4, which is shown later.

#!/bin/bash
ffmpeg -i input.mp4 \
-vf "scale=640:360" -c:a aac -b:a 64k -c:v libx265 -preset medium -b:v 500k -maxrate 1000k -bufsize 1000k -g 48 -keyint_min 48 -x265-params open-gop=0 -pass 1 -f null /dev/null \
-vf "scale=960:540" -c:a aac -b:a 64k -c:v libx265 -preset medium -b:v 800k -maxrate 1600k -bufsize 1600k -g 48 -keyint_min 48 -x265-params open-gop=0 -pass 1 -f null /dev/null \
-vf "scale=1280:720" -c:a aac -b:a 64k -c:v libx265 -preset medium -b:v 1500k -maxrate 3000k -bufsize 3000k -g 48 -keyint_min 48 -x265-params open-gop=0 -pass 1 -f null /dev/null \
-vf "scale=1920:1080" -c:a aac -b:a 64k -c:v libx265 -preset medium -b:v 3000k -maxrate 6000k -bufsize 6000k -g 48 -keyint_min 48 -x265-params open-gop=0 -pass 1 -f null /dev/null
ffmpeg -i input.mp4 \
-vf "scale=640:360" -c:a aac -b:a 64k -c:v libx265 -preset medium -b:v 500k -maxrate 1000k -bufsize 1000k -x265-params open-gop=0 -g 48 -keyint_min 48 -pass 2 -hls_time 6 -hls_playlist_type vod -hls_segment_filename 360p_%03d.ts 360p.m3u8 \
-vf "scale=960:540" -c:a aac -b:a 64k -c:v libx265 -preset medium -b:v 800k -maxrate 1600k -bufsize 1600k -x265-params open-gop=0 -g 48 -keyint_min 48 -pass 2 -hls_time 6 -hls_playlist_type vod -hls_segment_filename 540p_%03d.ts 540p.m3u8 \
-vf "scale=1280:720" -c:a aac -b:a 64k -c:v libx265 -preset medium -b:v 1500k -maxrate 3000k -bufsize 3000k -x265-params open-gop=0 -g 48 -keyint_min 48 -pass 2 -hls_time 6 -hls_playlist_type vod -hls_segment_filename 720p_%03d.ts 720p.m3u8 \
-vf "scale=1920:1080" -c:a aac -b:a 64k -c:v libx265 -preset medium -b:v 3000k -maxrate 6000k -bufsize 6000k -x265-params open-gop=0 -g 48 -keyint_min 48 -pass 2 -hls_time 6 -hls_playlist_type vod -hls_segment_filename 1080p_%03d.ts 1080p.m3u8
# Create the master playlist manually
echo "#EXTM3U" > master.m3u8
echo "#EXT-X-VERSION:3" >> master.m3u8
echo "#EXT-X-STREAM-INF:BANDWIDTH=500000,RESOLUTION=640x360" >> master.m3u8
echo "360p.m3u8" >> master.m3u8
echo "#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=960x540" >> master.m3u8
echo "540p.m3u8" >> master.m3u8
echo "#EXT-X-STREAM-INF:BANDWIDTH=1500000,RESOLUTION=1280x720" >> master.m3u8
echo "720p.m3u8" >> master.m3u8
echo "#EXT-X-STREAM-INF:BANDWIDTH=3000000,RESOLUTION=1920x1080" >> master.m3u8
echo "1080p.m3u8" >> master.m3u8

This Bash script uses FFmpeg to perform video transcoding and create HTTP Live Streaming (HLS) playlists for different resolutions. Here’s an explanation of the script:

Video Transcoding (First Pass):

The script uses FFmpeg to perform a two-pass encoding for different resolutions.
The first pass is used to analyze the input video (input.mp4) and generate statistics. It doesn’t create an actual output video file. Instead, it redirects the output to /dev/null.
The resolutions being processed are 640x360, 960x540, 1280x720, and 1920x1080.
Various video and audio parameters are specified for each resolution, such as scale, codec, bitrate, and x265 parameters.
Video Transcoding (Second Pass — HLS):

After the first pass, the script uses the analyzed information to perform the actual transcoding and create HLS playlists.
For each resolution, a second pass is performed with the actual video output being generated in HLS format.
The output is segmented into multiple TS (Transport Stream) files with a specified duration (hls_time).
The HLS playlists are named 360p.m3u8, 540p.m3u8, 720p.m3u8, and 1080p.m3u8, corresponding to the resolutions.
Master Playlist Creation:

The master playlist (master.m3u8) is manually created at the end of the script.
It includes information about the available streams (different resolutions) with their corresponding bandwidth and resolution details.
Explanation of the Master Playlist Content:

#EXTM3U: Indicates the start of the playlist file.
#EXT-X-VERSION:3: Specifies the version of the HLS protocol being used.
#EXT-X-STREAM-INF: Describes each individual stream with its bandwidth and resolution details.
The stream information is followed by the corresponding playlist file name (360p.m3u8, 540p.m3u8, etc.).
In summary, this script is a comprehensive tool for transcoding a video into multiple resolutions using the x265 codec and creating HLS playlists, along with a master playlist for adaptive streaming.
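One addition worth making to these hand-written EXT-X-STREAM-INF entries for HEVC content: players generally rely on a CODECS attribute to decide which variant they can decode. An illustrative entry is shown below; the exact codec string depends on your encode’s profile and level, so treat these values as placeholders:

```
#EXT-X-STREAM-INF:BANDWIDTH=3064000,RESOLUTION=1920x1080,CODECS="hvc1.1.6.L123.B0,mp4a.40.2"
1080p.m3u8
```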

Generate fragmented MP4 output with H.265:

Now I will produce fragmented MP4 output using the H.265 codec.

#!/bin/bash
ffmpeg -i input.mp4 \
-vf "scale=640:360" -c:a aac -b:a 64k -c:v libx265 -preset medium -b:v 500k -maxrate 1000k -bufsize 1000k -x265-params open-gop=0 -g 48 -keyint_min 48 -pass 1 -f null /dev/null \
-vf "scale=960:540" -c:a aac -b:a 64k -c:v libx265 -preset medium -b:v 800k -maxrate 1600k -bufsize 1600k -x265-params open-gop=0 -g 48 -keyint_min 48 -pass 1 -f null /dev/null \
-vf "scale=1280:720" -c:a aac -b:a 64k -c:v libx265 -preset medium -b:v 1500k -maxrate 3000k -bufsize 3000k -x265-params open-gop=0 -g 48 -keyint_min 48 -pass 1 -f null /dev/null \
-vf "scale=1920:1080" -c:a aac -b:a 64k -c:v libx265 -preset medium -b:v 3000k -maxrate 6000k -bufsize 6000k -x265-params open-gop=0 -g 48 -keyint_min 48 -pass 1 -f null /dev/null
ffmpeg -i input.mp4 \
-vf "scale=640:360" -c:a aac -b:a 64k -c:v libx265 -tag:v hvc1 -preset medium -b:v 500k -maxrate 1000k -bufsize 1000k -x265-params open-gop=0 -g 48 -keyint_min 48 -pass 2 -hls_time 6 -hls_playlist_type vod -hls_segment_type fmp4 -hls_fmp4_init_filename 360p_init.mp4 -hls_segment_filename 360p_%03d.m4s 360p.m3u8 \
-vf "scale=960:540" -c:a aac -b:a 64k -c:v libx265 -tag:v hvc1 -preset medium -b:v 800k -maxrate 1600k -bufsize 1600k -x265-params open-gop=0 -g 48 -keyint_min 48 -pass 2 -hls_time 6 -hls_playlist_type vod -hls_segment_type fmp4 -hls_fmp4_init_filename 540p_init.mp4 -hls_segment_filename 540p_%03d.m4s 540p.m3u8 \
-vf "scale=1280:720" -c:a aac -b:a 64k -c:v libx265 -tag:v hvc1 -preset medium -b:v 1500k -maxrate 3000k -bufsize 3000k -x265-params open-gop=0 -g 48 -keyint_min 48 -pass 2 -hls_time 6 -hls_playlist_type vod -hls_segment_type fmp4 -hls_fmp4_init_filename 720p_init.mp4 -hls_segment_filename 720p_%03d.m4s 720p.m3u8 \
-vf "scale=1920:1080" -c:a aac -b:a 64k -c:v libx265 -tag:v hvc1 -preset medium -b:v 3000k -maxrate 6000k -bufsize 6000k -x265-params open-gop=0 -g 48 -keyint_min 48 -pass 2 -hls_time 6 -hls_playlist_type vod -hls_segment_type fmp4 -hls_fmp4_init_filename 1080p_init.mp4 -hls_segment_filename 1080p_%03d.m4s 1080p.m3u8
# Create the master playlist manually
echo "#EXTM3U" > master.m3u8
echo "#EXT-X-VERSION:3" >> master.m3u8
echo "#EXT-X-STREAM-INF:BANDWIDTH=500000,RESOLUTION=640x360" >> master.m3u8
echo "360p.m3u8" >> master.m3u8
echo "#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=960x540" >> master.m3u8
echo "540p.m3u8" >> master.m3u8
echo "#EXT-X-STREAM-INF:BANDWIDTH=1500000,RESOLUTION=1280x720" >> master.m3u8
echo "720p.m3u8" >> master.m3u8
echo "#EXT-X-STREAM-INF:BANDWIDTH=3000000,RESOLUTION=1920x1080" >> master.m3u8
echo "1080p.m3u8" >> master.m3u8
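When the HLS muxer is in fMP4 mode (-hls_segment_type fmp4), each rendition playlist it writes references an initialization segment via EXT-X-MAP before the media segments, roughly like this (file names here are illustrative and follow whatever -hls_segment_filename you set):

```
#EXTM3U
#EXT-X-VERSION:7
#EXT-X-TARGETDURATION:6
#EXT-X-PLAYLIST-TYPE:VOD
#EXT-X-MAP:URI="1080p_init.mp4"
#EXTINF:6.000000,
1080p_000.m4s
#EXTINF:6.000000,
1080p_001.m4s
#EXT-X-ENDLIST
```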

Encoding with VP9:

VP9:
VP9 is a video compression codec developed by Google. It is the successor to VP8 and is designed to provide better compression efficiency while maintaining good video quality. VP9 is an open and royalty-free video coding standard, making it widely adopted for online video streaming.
While VP9 has gained popularity, it faces competition from other video codecs like H.264 and H.265 (HEVC). The choice of codec often depends on factors such as licensing, compatibility, and specific requirements of the application or platform.

#!/bin/bash
# First pass
ffmpeg -i input.mp4 -vf "scale=640:360" -c:a libopus -b:a 64k -c:v libvpx-vp9 -b:v 500k -keyint_min 48 -g 48 -threads 8 -speed 4 -tile-columns 4 -auto-alt-ref 1 -lag-in-frames 25 -frame-parallel 1 -pass 1 -f webm /dev/null && \
ffmpeg -i input.mp4 -vf "scale=960:540" -c:a libopus -b:a 64k -c:v libvpx-vp9 -b:v 800k -keyint_min 48 -g 48 -threads 8 -speed 4 -tile-columns 4 -auto-alt-ref 1 -lag-in-frames 25 -frame-parallel 1 -pass 1 -f webm /dev/null && \
ffmpeg -i input.mp4 -vf "scale=1280:720" -c:a libopus -b:a 64k -c:v libvpx-vp9 -b:v 1500k -keyint_min 48 -g 48 -threads 8 -speed 4 -tile-columns 4 -auto-alt-ref 1 -lag-in-frames 25 -frame-parallel 1 -pass 1 -f webm /dev/null && \
ffmpeg -i input.mp4 -vf "scale=1920:1080" -c:a libopus -b:a 64k -c:v libvpx-vp9 -b:v 3000k -keyint_min 48 -g 48 -threads 8 -speed 4 -tile-columns 4 -auto-alt-ref 1 -lag-in-frames 25 -frame-parallel 1 -pass 1 -f webm /dev/null
# Second pass
ffmpeg -i input.mp4 -vf "scale=640:360" -c:a libopus -b:a 64k -c:v libvpx-vp9 -b:v 500k -maxrate 800k -keyint_min 48 -g 48 -threads 8 -speed 2 -tile-columns 4 -auto-alt-ref 1 -lag-in-frames 25 -frame-parallel 1 -pass 2 -f webm output_360p.webm && \
ffmpeg -i input.mp4 -vf "scale=960:540" -c:a libopus -b:a 64k -c:v libvpx-vp9 -b:v 800k -maxrate 1150k -keyint_min 48 -g 48 -threads 8 -speed 2 -tile-columns 4 -auto-alt-ref 1 -lag-in-frames 25 -frame-parallel 1 -pass 2 -f webm output_540p.webm && \
ffmpeg -i input.mp4 -vf "scale=1280:720" -c:a libopus -b:a 64k -c:v libvpx-vp9 -b:v 1500k -maxrate 1850k -keyint_min 48 -g 48 -threads 8 -speed 2 -tile-columns 4 -auto-alt-ref 1 -lag-in-frames 25 -frame-parallel 1 -pass 2 -f webm output_720p.webm && \
ffmpeg -i input.mp4 -vf "scale=1920:1080" -c:a libopus -b:a 64k -c:v libvpx-vp9 -b:v 3000k -maxrate 3250k -keyint_min 48 -g 48 -threads 8 -speed 2 -tile-columns 4 -auto-alt-ref 1 -lag-in-frames 25 -frame-parallel 1 -pass 2 -f webm output_1080p.webm
# Create the master playlist manually
echo "<MPD xmlns='urn:mpeg:dash:schema:mpd:2011' profiles='urn:mpeg:dash:profile:isoff-on-demand:2011' type='static'>" > master.mpd
echo " <Period>" >> master.mpd
echo " <AdaptationSet mimeType='audio/webm' codecs='opus'>" >> master.mpd
echo " <SegmentTemplate media='audio_\$Number\$.opus' startNumber='1' initialization='audio_1.opus'>" >> master.mpd
echo " <SegmentTimeline>" >> master.mpd
echo " <S d='48000'/>" >> master.mpd
echo " </SegmentTimeline>" >> master.mpd
echo " </SegmentTemplate>" >> master.mpd
echo " </AdaptationSet>" >> master.mpd
echo " <AdaptationSet mimeType='video/webm' codecs='vp9'>" >> master.mpd
echo " <SegmentTemplate media='video_\$Number\$.webm' startNumber='1' initialization='video_1.webm'>" >> master.mpd
echo " <SegmentTimeline>" >> master.mpd
echo " <S d='48000'/>" >> master.mpd
echo " </SegmentTimeline>" >> master.mpd
echo " </SegmentTemplate>" >> master.mpd
echo " <Representation id='1' width='640' height='360' bandwidth='500000'/>" >> master.mpd
echo " <Representation id='2' width='960' height='540' bandwidth='800000'/>" >> master.mpd
echo " <Representation id='3' width='1280' height='720' bandwidth='1500000'/>" >> master.mpd
echo " <Representation id='4' width='1920' height='1080' bandwidth='3000000'/>" >> master.mpd
echo " </AdaptationSet>" >> master.mpd
echo " </Period>" >> master.mpd
echo "</MPD>" >> master.mpd

This script is a series of FFmpeg commands and manual MPD (Media Presentation Description) generation for creating a set of adaptive bitrate video streams using the VP9 video codec and Opus audio codec for the DASH (Dynamic Adaptive Streaming over HTTP) format.

Let’s break down the script:

First Pass

  1. Four FFmpeg commands:
  • Each command scales the input video (input.mp4) to a specific resolution (360p, 540p, 720p, 1080p).
  • The video and audio codecs used are VP9 and Opus, respectively.
  • Various encoding settings like bitrate, keyframe interval, threads, etc., are specified.
  • The output of each command is redirected to /dev/null since this is the first pass (no actual output files are created).

Second Pass

  1. Four more FFmpeg commands:
  • Similar to the first pass, but with -maxrate caps added and a slower, higher-quality -speed 2 setting for each resolution.
  • The output is directed to files named output_360p.webm, output_540p.webm, etc.

Master Playlist Creation

  1. Manually creates a DASH manifest (master.mpd):
  • Specifies the MPD XML structure with DASH profile and type.
  • Defines a period and two adaptation sets (audio and video).
  • For audio:
  • Uses Opus codec.
  • Defines a segment template for audio segments with a specific initialization and timeline.
  • For video:
  • Uses VP9 codec.
  • Defines a segment template for video segments with a specific initialization and timeline.
  • Specifies representations for different resolutions (640x360, 960x540, 1280x720, 1920x1080) with corresponding bandwidths.

Explanation of MPD Elements:

  • <MPD>: Main MPD element.
  • <Period>: Represents a period in the DASH presentation (usually corresponds to the entire content).
  • <AdaptationSet>: Contains settings for a particular type of media (audio or video).
  • <SegmentTemplate>: Defines how to construct media segment URLs.
  • <SegmentTimeline>: Specifies the timing information for each segment.
  • <Representation>: Describes a specific representation of the content with its attributes like resolution and bandwidth.

Note:

  • The script uses a two-pass encoding approach where the first pass is used for analysis, and the second pass is for actual encoding with bitrate control.
  • The DASH manifest (master.mpd) is manually created with specific settings for audio and video streams.
  • The segment files the manifest references (audio_$Number$.opus, video_$Number$.webm) are not produced by the FFmpeg commands above, which write one single .webm file per resolution; treat the manifest as a template and either segment the outputs with a packager first or adjust the manifest to point at the actual files.

This script is useful for creating adaptive streaming content compatible with DASH players, allowing seamless quality adjustments based on the viewer’s internet speed and device capabilities.
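As an alternative to hand-writing the MPD, FFmpeg’s dash muxer can segment the streams and generate the manifest itself. The command below is only a sketch, printed as a dry run so it can be inspected without running FFmpeg; the flags shown and the single-rendition mapping are illustrative, and the dash muxer’s documentation covers multi-representation setups:

```shell
#!/bin/bash
# Sketch: let FFmpeg's dash muxer write WebM segments and the MPD directly.
# Printed as a dry run; remove the echo to execute (requires ffmpeg built
# with libvpx and libopus).
cmd=(ffmpeg -i input.mp4
     -vf "scale=1280:720" -c:v libvpx-vp9 -b:v 1500k
     -c:a libopus -b:a 64k
     -f dash -seg_duration 6 -dash_segment_type webm
     master.mpd)
echo "${cmd[*]}"
```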
