Video Communication Course 2022 (Part2)

Nick Pai
5 min read · Jun 18, 2023

The following are comprehensive notes from the "Video Communication Course 2022 (Part2)". They cover the fundamental concepts, principles, and practical applications taught in the course, and serve as a resource for students and professionals who want to strengthen their knowledge of video communication.

Part2 contains the following sections:

  1. Image vs. Video
  2. Analog-to-Digital Conversion
  3. Mbps vs. MB/s
  4. RGB vs. YUV
  5. YUV 4:4:4 vs. 4:2:2 vs. 4:2:0
  6. JPEG
  7. MPEG
  8. Standards

Image vs. Video

  • Image: composed of pixels.
  • Video: composed of multiple images ordered along the time axis; each image is also called a frame.

A pixel can be stored at different bit depths, for example 8-bit, 9-bit, 10-bit, or even 12-bit. However, using more bits per pixel also means that more data is required to represent each pixel. At higher image resolutions (e.g. 10-bit 1920*1080), the amount of data generated becomes considerable.

Analog-to-Digital Conversion

Light is analog data. How is it converted into digital data?

  • Grayscale
    Assume we want to convert to grayscale in 8-bit format (1 pixel = 8 bits). The received light is converted into a value between 0 and 255, so a total of 256 (2^8) different values can represent the grayscale brightness.
  • RGB
    Assume we want to use a color representation (RGB). There is then one layer each for R, G, and B, a total of 3 layers.
    — Use 8-bit format
    1 pixel can take 2^8 * 2^8 * 2^8 = 2^24 = 16,777,216 values, about 17 million colors.
    — Use 10-bit format
    1 pixel can take 2^10 * 2^10 * 2^10 = 2^30 = 1,073,741,824 values, about 1.1 billion colors.
    — Use 12-bit format
    1 pixel can take 2^12 * 2^12 * 2^12 = 2^36 = 68,719,476,736 values, about 69 billion colors, a nearly astronomical number.

    For this reason, 8-bit remains the mainstream format: it meets most needs and avoids the much larger data volume and computation that 10-bit or 12-bit representations require.
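
The counts above can be checked with a short Python sketch. The function names `quantize_intensity` and `rgb_color_count` are hypothetical helpers introduced for illustration, and the light intensity is assumed to be already normalized to the range 0.0 to 1.0; a real sensor pipeline is more involved.

```python
def quantize_intensity(intensity: float, bits: int = 8) -> int:
    """Map a normalized light intensity in [0.0, 1.0] to an integer level.

    Assumption: the analog value has already been normalized; with 8 bits
    the result lies between 0 and 255.
    """
    levels = 2 ** bits
    clamped = min(max(intensity, 0.0), 1.0)
    # int() truncates; the outer min() keeps intensity 1.0 inside the range.
    return min(int(clamped * levels), levels - 1)


def rgb_color_count(bits_per_channel: int) -> int:
    """Number of distinct colors when R, G, and B each use the given bit depth."""
    return (2 ** bits_per_channel) ** 3


for bits in (8, 10, 12):
    print(f"{bits}-bit: {2 ** bits} levels per channel, "
          f"{rgb_color_count(bits):,} colors")
```

Each extra bit per channel multiplies the number of representable colors by 8, which is why the totals grow so quickly.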

How many seconds are needed to transmit an image or video over different networks?

Assume 3G = 2 Mbps, 4G = 200 Mbps, 5G = 1 Gbps.
Suppose a 1-second 1280*720 4:2:0 video at 60 frames/s is waiting to be transmitted.
How long will it take at 3G, 4G, and 5G network speeds?
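
One way to work through this exercise is sketched below. It rests on assumptions the notes do not state: the video is uncompressed, each sample is 8 bits, and Mbps/Gbps use decimal prefixes (10^6 and 10^9 bits per second). The helper names `raw_video_bits` and `transmit_seconds` are hypothetical.

```python
def raw_video_bits(width: int, height: int, fps: int, seconds: int,
                   bits_per_sample: int = 8) -> int:
    """Size in bits of uncompressed 4:2:0 video.

    In 4:2:0, every 4 pixels carry 4 Y + 1 U + 1 V samples,
    i.e. 1.5 samples per pixel (written here as 3 // 2 to stay in integers).
    """
    pixels = width * height * fps * seconds
    return pixels * bits_per_sample * 3 // 2


def transmit_seconds(total_bits: int, link_bits_per_second: int) -> float:
    """Ideal transfer time, ignoring protocol overhead and latency."""
    return total_bits / link_bits_per_second


total = raw_video_bits(1280, 720, fps=60, seconds=1)
links = {"3G": 2_000_000, "4G": 200_000_000, "5G": 1_000_000_000}
for name, speed in links.items():
    print(f"{name}: {transmit_seconds(total, speed):.3f} s")
```

Under these assumptions the one-second clip is 663,552,000 bits, so it needs roughly 332 s on 3G, 3.3 s on 4G, and 0.66 s on 5G, which is the motivation for compression.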

Mbps vs. MB/s

  • Mbps
    Megabits (million bits) per second
  • Mbits/s
    Same as Mbps: megabits (million bits) per second
  • MB/s
    Megabytes (million bytes) per second; 1 byte = 8 bits
  • 1 Mbps = 0.125 MB/s
  • 8 Mbps = 1 MB/s
  • 20 Mbps = 2.5 MB/s
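
The conversions above all follow from 1 byte = 8 bits. A minimal sketch, with hypothetical helper names:

```python
BITS_PER_BYTE = 8


def mbps_to_megabytes_per_second(mbps: float) -> float:
    """Convert megabits per second to megabytes per second."""
    return mbps / BITS_PER_BYTE


def megabytes_per_second_to_mbps(mb_per_s: float) -> float:
    """Convert megabytes per second to megabits per second."""
    return mb_per_s * BITS_PER_BYTE


for rate in (1, 8, 20):
    print(f"{rate} Mbps = {mbps_to_megabytes_per_second(rate)} MB/s")
```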

RGB vs. YUV

RGB

The RGB color model, also known as the red, green, and blue color model, is an additive color model: the three primary colors of light are added together in different proportions to produce a wide range of colors.

YUV

YUV takes human perception into account when encoding photos or videos, allowing the chroma bandwidth to be reduced. It separates brightness (Y, luma) from color (U and V, chroma); because the eye is less sensitive to color detail than to brightness detail, the chroma can be stored at a lower resolution.
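
As an illustration of separating luma from chroma, the sketch below converts one 8-bit RGB pixel to YCbCr using the full-range BT.601 coefficients used by JPEG/JFIF. This is one common digital variant of YUV and is an assumption here; the course notes do not say which matrix they use. The function names are hypothetical.

```python
def clamp_to_byte(value: float) -> int:
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, round(value)))


def rgb_to_ycbcr(r: int, g: int, b: int) -> tuple[int, int, int]:
    """Convert one 8-bit RGB pixel to full-range YCbCr (BT.601, as in JFIF)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return clamp_to_byte(y), clamp_to_byte(cb), clamp_to_byte(cr)


for name, pixel in {"white": (255, 255, 255), "gray": (128, 128, 128),
                    "red": (255, 0, 0)}.items():
    print(name, rgb_to_ycbcr(*pixel))
```

Colorless pixels (white, gray, black) all map to Cb = Cr = 128, so the brightness information sits entirely in Y and the two chroma planes can be subsampled separately.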

References:
RGB vs YUV (YCbCr) color models (AKIO TV)
Introduction to Color Spaces in Video
攝影常見問題彙整:4:2:2 和 4:2:0 分別代表什麼意思?

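YUV 4:4:4 vs. 4:2:2 vs. 4:2:0

YUV chroma subsampling is typically described over a block 4 pixels wide and 2 rows high. In the notation J:a:b, J is the block width (4), a is the number of chroma samples in the first row, and b is the number of additional chroma samples in the second row.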

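How many bits are needed per pixel in YUV?

For comparison, RGB with R:G:B = 1:1:1 needs 3 bytes per pixel:
8 bits + 8 bits + 8 bits = 24 bits = 3 bytes

For YUV with 8-bit samples, counted over a block 4 pixels wide and 2 rows high (8 pixels):

if YUV = 4:2:0, 8 Y + 2 U + 2 V = 12 samples = 96 bits, so 12 bits (1.5 bytes) per pixel
if YUV = 4:2:2, 8 Y + 4 U + 4 V = 16 samples = 128 bits, so 16 bits (2 bytes) per pixel
if YUV = 4:4:4, 8 Y + 8 U + 8 V = 24 samples = 192 bits, so 24 bits (3 bytes) per pixel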
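
The per-pixel cost of each scheme can be derived from the J:a:b notation with a small sketch. It assumes 8-bit samples and counts chroma samples per plane as a + b, which holds for the common schemes listed here; `bits_per_pixel` is a hypothetical helper name.

```python
def bits_per_pixel(j: int, a: int, b: int, bits_per_sample: int = 8) -> float:
    """Average bits per pixel for J:a:b chroma subsampling.

    The reference block is J pixels wide and 2 rows high. Every pixel has
    its own Y sample; each chroma plane (U and V) has a samples in the
    first row plus b additional samples in the second row.
    """
    pixels = j * 2
    luma_samples = pixels
    chroma_samples = 2 * (a + b)  # two chroma planes: U and V
    return (luma_samples + chroma_samples) * bits_per_sample / pixels


for scheme in ((4, 4, 4), (4, 2, 2), (4, 2, 0)):
    label = ":".join(str(n) for n in scheme)
    print(f"{label}: {bits_per_pixel(*scheme)} bits per pixel")
```

Compared with 4:4:4, the 4:2:0 layout halves the data before any compression is applied.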

JPEG

A compression technique for single still images.

JPEG is an image format that uses lossy compression. Encoding proceeds in these steps:

  • Convert RGB to YCbCr
  • Divide the image into 8*8 blocks
  • Compute the DCT (discrete cosine transform) of each block
  • Divide the DCT values by the quantization coefficients and round the results
  • Encode the quantized values with source coding (entropy coding)
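
The DCT and quantization steps above can be sketched for a single 8*8 block as follows. This is a simplified illustration, not a JPEG encoder: it uses a flat quantization step of 16 (a hypothetical value, where real JPEG uses a per-frequency quantization table), it subtracts 128 from each sample before the transform as baseline JPEG does, and it omits color conversion and entropy coding. The function names are hypothetical.

```python
import math

N = 8  # JPEG block size


def dct_2d(block: list[list[float]]) -> list[list[float]]:
    """Orthonormal 2-D DCT-II of an 8x8 block (direct, unoptimized form)."""
    def scale(k: int) -> float:
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * scale(u) * scale(v) * total
    return out


def quantize_block(coeffs: list[list[float]], step: int = 16) -> list[list[int]]:
    """Divide each DCT coefficient by a flat step and round (the lossy part)."""
    return [[round(value / step) for value in row] for row in coeffs]


# A flat gray block: all the energy ends up in the top-left (DC) coefficient.
flat_block = [[140] * N for _ in range(N)]
shifted = [[pixel - 128 for pixel in row] for row in flat_block]
quantized = quantize_block(dct_2d(shifted))
print(quantized[0])
```

For smooth image regions most quantized coefficients become zero, which is what makes the final entropy-coding step effective.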

JPEG remains more popular than JPEG2000, even though JPEG2000 has better coding performance, for these reasons:

  • JPEG2000 has very high computational complexity.
  • JPEG2000 is not compatible with JPEG.
  • JPEG2000 codecs are too slow for both encoding and decoding.

MPEG

MPEG is a family of video formats that use lossy compression. MPEG achieves a high compression ratio because, for some frames, it stores only the difference between two frames instead of each entire frame.
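
The idea of storing only the difference between frames can be illustrated with a toy sketch. It is a deliberate simplification: real MPEG codecs combine motion-compensated prediction with transform coding and quantization, none of which appear here. Frames are assumed to be small grayscale grids, and the function names are hypothetical.

```python
Frame = list[list[int]]


def frame_difference(previous: Frame, current: Frame) -> Frame:
    """Residual: what must be added to the previous frame to get the current one."""
    return [[cur - prev for prev, cur in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(previous, current)]


def reconstruct(previous: Frame, residual: Frame) -> Frame:
    """Rebuild the current frame from the previous frame plus the residual."""
    return [[prev + diff for prev, diff in zip(prev_row, diff_row)]
            for prev_row, diff_row in zip(previous, residual)]


frame1 = [[50] * 4 for _ in range(4)]
frame2 = [row[:] for row in frame1]
frame2[1][2] = 90  # only one pixel changes between the two frames

residual = frame_difference(frame1, frame2)
nonzero = sum(1 for row in residual for value in row if value != 0)
print(f"{nonzero} of {len(residual) * len(residual[0])} residual values are nonzero")
```

When consecutive frames are similar, the residual is mostly zeros and compresses far better than the full frame.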

Standards

  • MPEG (Moving Picture Experts Group)
    Its standards were initially aimed at consumer products.
  • ITU-T (ITU Telecommunication Standardization Sector)
    Its standards were initially aimed at communication.

The goal is to remove temporal redundancy in order to reduce the number of bits used in wireless communication.

Later, the two groups worked together to formulate joint ITU-T/MPEG standards, for example H.264/MPEG-4 AVC.

Thank you for taking the time to read this article, and I sincerely hope that the information provided proves to be valuable to you. Whether you are a student, professional, or simply someone interested in video communication, it is my utmost wish that these notes enhance your understanding and contribute to your success in this field. Thank you once again, and best of luck on your journey in the world of video communication!
