Working With CODECs And Understanding Bit Depth And Bit Rate In Video
Working with video during the post process (e.g. editing) requires understanding the relationship between bit rate (also spelled bitrate) and the CODEC. Bit rate refers to the amount of digital data encoded per unit of time, usually expressed in bits per second (bps). When preparing to shoot video during production, taking bit rate into consideration is important because it determines file size and image quality. This works hand in hand with the CODEC, the compression/decompression hardware (or software) that helps determine the quality of the images in your content (i.e. video) for delivery.
We are going to discuss CODECs in terms of video, since there are other types used for different purposes (e.g. JPEG for still images and MP3 for audio). The CODEC is often included as a camera feature. An example of this would be the H.264 CODEC used in Canon 5D Mark II and Mark III cameras. The CODEC itself resides in the camera’s firmware, and performs its function without requiring user intervention. Other CODECs are implemented in software, like Apple’s ProRes, an intermediate CODEC used in post editing and for final format delivery. CODECs work behind the scenes to process the captured image using programmed algorithms.
The CODEC helps determine the quality of the captured content, but we need to know the bit depth and bit rate to have a better understanding of file size for storage considerations. We will begin with the CODEC’s function, and then delve into calculating the estimated storage from the bit rate manually to have a better idea of the content’s image quality and file size.
The CODEC
The CODEC’s main function is to convert the content’s original file to a smaller format for distribution. The original file is usually in an uncompressed format that cannot be easily viewed or distributed on different devices; it is simply too large, so the CODEC compresses it to a smaller size. The CODEC compresses the digital content from the source and then allows users to decompress the content for viewing. The CODEC also processes the color and detail information to produce the final quality that users see on their display.
When capturing an image, an important consideration is the bit depth your camera’s CODEC is capable of. The bit depth determines the amount of color that the camera can recognize and encode. Each pixel contains 3 channels that represent the colors RGB (Red, Green, Blue). During encoding, each pixel stores the amount of red, green, and blue contributed by each channel; color is just the blending of the RGB channels, represented as data.
If the CODEC is specified as 12-bit, that means that each RGB channel contains 12 bits of color data for a total of 36 bits.
12 bits per channel
3 total channels

12 bits/channel * 3 channels = 36 bits
A 16-bit CODEC provides up to 65,536 values per channel. A 12-bit CODEC provides up to 4,096, and a lower 8-bit CODEC only 256.
16 bits: 2^16 = 65,536
12 bits: 2^12 = 4,096
8 bits: 2^8 = 256
Higher bit depth is recommended because you have more colors to work with. Lower bit depth can lead to what is called banding: without enough tonal range to represent the colors in an image, smooth gradients break into visible steps of repeated values. Higher bit depths represent colors better because of their wider range.
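To make the numbers concrete, here is a minimal Python sketch (my own illustration, not tied to any camera or CODEC) that computes the per-channel levels for each bit depth and then quantizes a smooth grayscale ramp down to 3 bits to show how banding emerges when too few levels are available:

# Levels per channel double with every added bit: 2^bits values.
for bits in (8, 12, 16):
    levels = 2 ** bits
    print(f"{bits}-bit: {levels:,} levels per channel")

# Banding in miniature: quantize a smooth 0-255 ramp to 3 bits
# (8 levels). Runs of repeated values are the visible "bands".
def quantize(value, bits):
    levels = 2 ** bits
    step = 255 / (levels - 1)
    return round(round(value / step) * step)

ramp = range(0, 256, 16)                  # a smooth gradient, sampled
print([quantize(v, 3) for v in ramp])     # repeated values = bands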
It may already be obvious that in order to have higher bit depth, you will need more storage space on your digital media (e.g. SD card), since the amount of data being stored results in larger file sizes. The purpose of the CODEC is to manage the compression and decompression of a captured image so it can be encoded in a suitable format. The general rule is that the less compression applied (uncompressed being the extreme), the better the image quality and the more storage space required. More compression degrades the image quality, but requires less storage space.
When the CODEC does its work, it uses a technique called chroma sub-sampling. The sampling is done at the pixel level, and is written as x:x:x. If the chroma sub-sampling is 4:4:4, all of the color information in each group of four pixels is kept, with nothing lost. It allows less compression and requires more space, but it provides the best overall quality. In 4:2:2 sampling, color information is kept for only 2 out of every 4 pixels, trading some color resolution for storage savings at the best quality possible.
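As a rough sketch of the data cost, the snippet below uses the standard reading of the J:a:b notation over a 4-wide by 2-high block of pixels (a more precise model than the simplified four-pixel description above): 8 luma samples are always kept, plus "a" chroma samples in the top row and "b" in the bottom row for each of the two color channels.

def samples_per_pixel(j, a, b):
    # 8 luma samples per 4x2 block, plus (a + b) samples for each
    # of the two chroma channels (Cb and Cr).
    assert j == 4, "conventional notation fixes J at 4"
    return (8 + 2 * (a + b)) / 8

for j, a, b in [(4, 4, 4), (4, 2, 2), (4, 2, 0)]:
    spp = samples_per_pixel(j, a, b)
    print(f"{j}:{a}:{b} -> {spp} samples/pixel, "
          f"{spp / 3:.0%} the size of 4:4:4")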
What we have discussed so far deals with the capture CODEC from the camera. Once the raw footage from the camera has been exported to a computer, the next type is the edit CODEC. This is the CODEC used before the final format is delivered. We deal with two types of compression when editing.
The first is called intra frame compression, which performs compression within each frame. Each frame is decoded without any dependency on other frames. The second type is called inter frame compression. Where intra frame has no dependency on other frames, inter frame does: it uses a predictive algorithm that draws information from neighboring frames in order to be processed.
Since intra frame focuses on just a single frame, it is much faster to work with. Inter frame decoding can take much more time, since data must be gathered from other frames before the current one can be reconstructed. Which to use is really up to the editor and the type of software they are using. Post CODECs like ProRes (lossy compression) use intra frame compression techniques. On the timeline, intra frame is not only faster but much better at random access, offering the highest quality for the available disk storage.
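As a toy illustration of the difference (a deliberately simplified model; real CODECs predict motion-compensated blocks rather than raw pixel differences), the Python sketch below stores whole frames for intra and one key frame plus differences for inter:

frames = [
    [10, 10, 10, 10],
    [10, 10, 12, 10],   # only one pixel changes
    [10, 10, 12, 11],
]

# Intra: every frame stands alone and decodes independently.
intra = [list(f) for f in frames]

# Inter: a key frame plus per-pixel deltas; decoding frame N
# requires walking the chain from the key frame forward.
key = list(frames[0])
deltas = [[c - p for p, c in zip(prev, cur)]
          for prev, cur in zip(frames, frames[1:])]

rebuilt = key
for d in deltas:
    rebuilt = [r + x for r, x in zip(rebuilt, d)]
print(rebuilt == frames[-1])   # True: the chain reproduces frame 2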
The Bit Rate
As mentioned earlier, bit rate has to do with the encoding of data. The higher the bit rate, in bits per second, the higher the quality of the video. This also leads to larger file sizes due to the amount of data being encoded. Let us take an example of this.
If you use a bit rate of 1 Mbps (Megabits per second), let’s say that the estimated file size for your video is 7 MB of disk space. Now if we increase the bit rate to 8 Mbps, the file size grows eightfold to 56 MB of disk space, since file size scales linearly with bit rate for the same duration. Assuming a resolution of 1920 x 1080 (1080p), the video rendered at the higher bit rate will look much better side by side. This is why devices that shoot at the same resolution can still produce different results: the device that uses the higher bit rate generally delivers the better quality (considering resolution alone and no other factors).
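Since file size for a fixed duration is just bit rate multiplied by time, the scaling is easy to verify. A quick Python sketch (the 56-second duration is my own choice, picked so that 1 Mbps yields the 7 MB figure above):

def file_size_mb(bit_rate_mbps, seconds):
    # Megabits accumulated over the duration, divided by 8 bits/Byte.
    return bit_rate_mbps * seconds / 8

seconds = 56                       # 1 Mbps * 56 s = 56 Mb = 7 MB
print(file_size_mb(1, seconds))    # 7.0 MB
print(file_size_mb(8, seconds))    # 56.0 MB: 8x the rate, 8x the size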
While there is a bit rate for encoding data to storage, there is also a bit rate for streaming data across the network. For streaming content, the average video bit rate is about 2.5 Mbps for HD resolution (720p) and between 5 to 8 Mbps for Full HD resolution (1080p). Higher resolution 4K (UHD) content requires around 20 Mbps (based on Google’s recommendation). On the network, the bit rate matters more than the file size: higher bit rates provide better quality for HD and UHD content because more data is processed per second.
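Treating those figures as rough guidelines, a small lookup makes the bandwidth check explicit (the numbers below simply restate the recommendations cited above, not any platform’s exact requirements):

RECOMMENDED_MBPS = {
    "720p HD": 2.5,
    "1080p Full HD": 6.5,   # midpoint of the 5-8 Mbps range above
    "4K UHD": 20.0,         # per Google's recommendation above
}

def can_stream(resolution, bandwidth_mbps):
    # True if the connection can sustain the recommended bit rate.
    return bandwidth_mbps >= RECOMMENDED_MBPS[resolution]

for res in RECOMMENDED_MBPS:
    print(res, "on a 10 Mbps link:", can_stream(res, 10))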
There are two types of bit rate to know when encoding a file.
- Constant Bit Rate (CBR) — Uses a consistent bit rate throughout the export of the final video content. You simply specify the bit rate in your editing software and the CODEC performs the necessary amount of compression.
- Variable Bit Rate (VBR) — Uses different bit rates throughout the output of the content. It is more efficient because the CODEC applies compression based on the complexity of the content at each segment of time, allocating less space to segments that require little data (e.g. a blank background in frame). The varying bit rates are then averaged to give the file’s overall bit rate.
Between the two types, VBR can be much better at saving disk space than CBR, since it does not hold the target bit rate throughout the output of the file. For streaming content, though, CBR can be more useful since it uses the target bit rate consistently and makes predictable use of limited network bandwidth. VBR streams tend to spike to higher bit rates during complex segments, which can be a challenge on networks with less capacity.
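A miniature example of how VBR averages out (the segment rates are hypothetical one-second slices, invented for illustration):

segment_mbps = [2.0, 9.5, 3.0, 1.0, 8.5]   # hypothetical 1-second segments

average = sum(segment_mbps) / len(segment_mbps)
peak = max(segment_mbps)
print(f"average {average} Mbps, peak {peak} Mbps")   # average 4.8, peak 9.5

# A CBR export at 4.8 Mbps would hold that rate everywhere, while the
# VBR stream bursts to 9.5 Mbps on complex segments; those spikes are
# what can strain a network with limited bandwidth.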
Calculating the file size gives an idea of the storage requirement. We can do this manually using a common formula that has three main parameters:
- File Size — Measured in Bytes (B) (1 Byte = 8 bits)
- Bit Rate — Measured in Bytes per second (Bps) (8 bits/second = 1 Byte/second)
- Total Time (Length of Time) — The length of the footage, converted from seconds (sec) to minutes (min)
Let’s say we have 3 hours (180 minutes) of footage that we need to process. Suppose we are going to export the content at a rate of 45 Mbps. First we need to convert Mbps to MBps (Megabits/sec to Megabytes/sec), from bits to Bytes. To do this we divide 45 Mbps by 8, since there are 8 bits in a Byte.
Mbps to MBps:
MBps = (45,000,000 bits/sec) / (8 bits/Byte)
MBps = (5,625,000 Bytes/sec) / (1,000,000 Bytes/MB)
MBps = 5.625 MBps
The bit rate is 5.625 MBps (notice the capital B in MBps). Next we convert the seconds to minutes.
Seconds to Minutes:
MBpm = 5.625 MB/sec * (60 sec / 1 min)
MBpm = 337.5 MB/min
Finally, we can get the estimated storage by multiplying the bit rate with the total time of our footage.
Total Time = 180 min
Bit Rate = 337.5 MB/min

Storage = 337.5 MB/min * 180 min
Storage = 60,750 MB
The storage required would be about 60.75 GB (60,750 MB). This is how much disk space is needed for 180 minutes of video footage at a bit rate of 45 Mbps.
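The same calculation wrapped as a small Python helper (decimal units, 1 MB = 1,000,000 Bytes, matching the arithmetic above):

def storage_mb(bit_rate_mbps, minutes):
    mb_per_sec = bit_rate_mbps / 8     # Megabits/sec -> Megabytes/sec
    return mb_per_sec * 60 * minutes   # MB/min * total minutes

size = storage_mb(45, 180)
print(f"{size:,.0f} MB (~{size / 1000:.1f} GB)")   # 60,750 MB (~60.8 GB)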
When you know the estimated file size that a shoot will create, you can plan for capacity. If the estimated file size is about 61 GB, then the storage should total at least 64 GB. The extra capacity provides some slack in case more footage needs to be captured.
Final Deliverables
Regardless of which technique you use, the size of the file can be very large. The purpose of non-linear editing software (e.g. Adobe Premiere, Final Cut Pro, DaVinci Resolve) is to let you work with the original file without modifying the actual content. Instead, the editor can cut with compressed copies of the original files, called proxies, and render from the originals when creating the final deliverable. This is also called non-destructive editing.
The reason you apply a CODEC to the final content is to improve speed when streaming. Uncompressed video not only eats up bandwidth, it can also cost users more on mobile or Internet plans with limited data. Bandwidth is finite, and existing networks cannot support the throughput that uncompressed formats demand; that also becomes very expensive if the uncompressed file runs into the Terabytes. That is why compression techniques are used to make the file size smaller while losing as little quality as possible.
We make use of the deliverable CODEC to finalize the content and export it in its final form. This ensures that the content will play back smoothly and display the frames properly. The resolution and frame rate are settings based on your camera’s specifications, and cannot really be modified to make the quality any better. If the camera captured the video at 1080p, then that is the best quality; you can upscale it higher, but it will not look any better.
If the content will be uploaded to platforms like Vimeo or YouTube, it is best to compress at the highest bit rate possible. This is because video streaming platforms will further compress your content, which can lead to a loss of quality. To preserve as much quality as possible, use a higher bit rate when exporting.
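As one possible way to script such an export, the sketch below shells out to ffmpeg (assuming it is installed; the file names are placeholders and the 40 Mbps target is an arbitrary high-bit-rate choice, not a platform requirement):

import subprocess

subprocess.run([
    "ffmpeg",
    "-i", "master.mov",             # edited master (placeholder name)
    "-c:v", "libx264",              # H.264 deliverable CODEC
    "-b:v", "40M",                  # generous target bit rate for upload
    "-pix_fmt", "yuv420p",          # 4:2:0 sub-sampling for broad playback
    "-c:a", "aac", "-b:a", "320k",  # high-quality audio track
    "upload.mp4",
], check=True)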
Once ready, it can be deployed using a transcoder, which converts the content to a format that can be viewed on specific devices (e.g. smartphone, tablet, laptop, desktop). The content must first be sent to a CDN (Content Delivery Network) where it is stored, and from there it can be transcoded and streamed to a user’s device. A common format that is universal on many devices is MP4, supported by Apple QuickTime, VLC and Windows Media Player. On streaming platforms like YouTube, the common CODECs are VP9 and H.264 (on some videos), delivered using Dynamic Adaptive Streaming over HTTP (DASH).
Finding CODEC Information In A File
If you want to learn more about the CODEC information used in a file, you can use video playback software like VLC. Open your file in VLC and go to the Window -> Media Information menu.
You will then see the Media Information window pop up.
Look at the top section (Stream 0), which provides the video information; the bottom section (Stream 1) provides the audio information. In this example you can see that the CODEC used was H.264.
Synopsis
When shooting high quality content, entry level cameras are not ideal for production level deliverables. Higher end prosumer DSLR or HD video cameras are normally used. The storage media in this case is not a typical SD card, but faster cards like CompactFlash. Not only are they fast and durable, they also have capacities to support larger file sizes.
Using a CODEC helps determine the overall quality of the content and its disk storage requirements. The bit rate is an important indicator: from it we can estimate the file size of the exported output. Higher quality content will have larger file sizes even with the best compression applied by the CODEC, while smaller files tend to have undergone more compression, which compromises quality.
A CODEC is necessary for optimizing storage, improving streaming speed and encoding detail at the chosen bit depth (color information). There are different types of CODECs used in a digital workflow, but they all perform the same basic function: compressing and decompressing content data. Without a CODEC, streaming content would take far longer and probably not be commercially viable as a service.