Video Communication Course 2022 (Part3)

Nick Pai
Jun 18, 2023

These are my notes from the "Video Communication Course 2022 (Part3)". They cover the fundamental concepts, principles, and practical applications taught in the course, and are meant as a reference for students and professionals who want to build their knowledge of video communication.

Part3 contains the following sections:

  1. I, P, B-frame
  2. Motion Estimation
  3. Coding Delay
  4. H.264

I, P, B-frame

Intraframe (I-frame)

  • Each frame is encoded separately, as a single image

Interframe (P-, B-frames)

  • Encoded using information from other frames

Compressed size: I > P > B

I Frame (Intra pictures)

  • Can only contain intra macroblocks, just like traditional compression of pictures one by one.
  • An I frame is usually the first frame of each GOP (Group of Pictures).
  • It can be decompressed into a single complete picture by the video decompression algorithm, without any other frame.

P Frame (Predicted pictures)

  • Can contain intra macroblocks or predicted macroblocks. The encoder does not need to record the pixels that have not changed relative to the previous frame.
  • It needs to refer to an I frame or P frame in front of it to generate a complete picture.

B Frame (Bi-directional pictures)

  • Can contain intra, forward-predicted, backward-predicted, and bidirectionally predicted macroblocks.
  • It needs to refer to the previous I or P frame and a subsequent I or P frame to generate a complete picture.
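Because a B frame depends on a later reference frame, that reference has to reach the decoder first, so the decode order differs from the display order. The sketch below illustrates this reordering. It assumes a classic pattern in which every B frame references the nearest I/P frame on each side; the function name `decode_order` is hypothetical, not part of any codec API.

```python
def decode_order(display_types):
    """Reorder frames from display order to a valid decode order.

    display_types: string such as "IBBPBBP", one letter per frame in
    display order. Assumes every B frame references the nearest I/P frame
    before it and the nearest I/P frame after it.
    """
    order = []
    pending_b = []  # B frames waiting for their future reference
    for index, frame_type in enumerate(display_types):
        if frame_type == "B":
            pending_b.append(index)
        else:
            # An I or P frame is a reference: emit it first, then the
            # B frames that were waiting for it.
            order.append(index)
            order.extend(pending_b)
            pending_b = []
    # Trailing B frames with no future reference are emitted last.
    order.extend(pending_b)
    return order


print(decode_order("IBBPBBP"))  # [0, 3, 1, 2, 6, 4, 5]
```

Frame 3 (a P frame) is decoded before frames 1 and 2 (B frames), even though it is displayed after them.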

Motion Estimation

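Find the best match in a 16x16 search space: with 16 candidate offsets per axis, there are a total of 16*16=256 possibilities.

Motion estimation vector (X_axis, Y_axis)=(4_bits, 4_bits)

Each axis takes a value from 0 to 15, a total of 16=2⁴ numbers >>> 4 bits

The motion estimation vector only needs 8 bits to describe where a block of 16*16 pixels comes from.

8 bits for the vector versus 16*16 pixel values for the block (2,048 bits at 8 bits per pixel) >>> high compression ratio, although the prediction residual still has to be coded separately.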
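The search described above can be sketched as an exhaustive (full-search) block match. This is a minimal illustration, not a codec implementation. It assumes SAD (sum of absolute differences) as the distortion measure, a 4*4 block to keep the example small, and unsigned offsets from 0 to 15 to mirror the 4-bit-per-axis example. The names `sad` and `full_search` are hypothetical.

```python
def sad(cur, ref, bx, by, rx, ry, size):
    """Sum of absolute differences between a block in cur and one in ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[ry + y][rx + x])
    return total


def full_search(cur, ref, bx, by, size=4, search=16):
    """Try every offset (dx, dy) in 0..search-1 and keep the lowest SAD.

    Offsets are unsigned here to mirror the 4-bit-per-axis example;
    real codecs use signed vectors centred on the block position.
    Returns (best_sad, dx, dy).
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(search):
        for dx in range(search):
            rx, ry = bx + dx, by + dy
            if rx + size > width or ry + size > height:
                continue  # candidate block would fall outside the frame
            cost = sad(cur, ref, bx, by, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Synthetic reference frame, and a current frame cut from it at offset (5, 3).
ref = [[(7 * x + 13 * y) % 256 for x in range(24)] for y in range(24)]
cur = [[ref[y + 3][x + 5] for x in range(8)] for y in range(8)]

print(full_search(cur, ref, 0, 0))  # (0, 5, 3)
```

The result is a perfect match (SAD 0) at offset (5, 3), which fits in 4 bits per axis.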

Coding Delay

H.264

Features of H.264:

  1. Variable block size
  2. Lagrange Multiplier
  3. Multiple Reference Frame
  4. Intra Prediction
  5. Deblocking Filter
  6. Entropy Coding
  7. Integer Transform

Variable block size

  • Advantage
    - easier to find the best matching block (the one with the smallest distortion)
    - can find a more similar block
    - increases accuracy, so the difference (residual) is smaller
    - improves coding performance
  • Disadvantage
    - uses more motion vectors to represent a 16*16 block
    - e.g. 4*4 blocks need 16 vectors, but a 16*16 block only needs 1 vector
  • Selection criteria
    - minimal distortion
    - fewest bits (the block can be represented with the fewest bits)
    - the two are combined into the rate-distortion cost, a.k.a. RD cost

Use both rate and distortion to select the best block. Earlier methods considered only distortion and tried to find the block with the lowest distortion. H.264 considers distortion and rate together through the RD cost.
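The vector overhead listed under the disadvantages can be tabulated with a few lines of arithmetic. The sketch below assumes the whole macroblock uses a single partition size and one vector per partition; the helper name `vectors_per_macroblock` is hypothetical.

```python
MACROBLOCK = 16

# Partition sizes (width, height) that H.264 allows for motion compensation.
PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]


def vectors_per_macroblock(width, height):
    """Number of motion vectors when the whole macroblock uses one size."""
    return (MACROBLOCK // width) * (MACROBLOCK // height)


for width, height in PARTITIONS:
    count = vectors_per_macroblock(width, height)
    print(f"{width}x{height}: {count} vector(s)")
```

Smaller partitions match the content better but multiply the number of vectors to signal, which is why the choice is made on rate and distortion together.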

Lagrange Multiplier

In mathematical optimization, the method of Lagrange multipliers is a strategy for finding the local maxima and minima of a multivariable function whose variables are subject to one or more equality constraints.

  • Theory
    - find the shortest distance (λ)
  • Practice
    - try all possibilities
    - pick the one that has minimum distortion and uses the fewest bits
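In practice, the trade-off is usually written as a Lagrangian cost J = D + λR, where D is the distortion and R the number of bits. A minimal sketch of the "try all possibilities" step follows. The candidate numbers are made up for illustration, and `rd_cost` and `choose_mode` are hypothetical names.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def choose_mode(candidates, lam):
    """Return the candidate name with the smallest rate-distortion cost.

    candidates maps a mode name to (distortion, rate_bits).
    """
    return min(candidates, key=lambda name: rd_cost(*candidates[name], lam))


# Illustrative numbers only, not measurements.
candidates = {
    "16x16": (900, 12),   # one vector: cheap to signal, larger error
    "8x8": (520, 40),     # four vectors
    "4x4": (400, 150),    # sixteen vectors: best match, most bits
}

print(choose_mode(candidates, lam=0))   # 4x4
print(choose_mode(candidates, lam=5))   # 8x8
print(choose_mode(candidates, lam=50))  # 16x16
```

With λ = 0 only distortion counts, so the smallest blocks win. As λ grows, bits become more expensive and larger blocks are preferred.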

Multiple Reference Frame

Earlier methods such as MPEG-1 reference only 1 previous and 1 later frame. H.264 can reference multiple frames in both directions: the standard allows up to 16 reference frames (32 reference fields).
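A rough sketch of why more reference frames help: when content repeats (a flashing light, an object that was briefly covered), an older frame can match better than the most recent one. The example below compares only co-located blocks, with no motion search, to stay short. `block_sad` and `best_reference` are hypothetical names.

```python
def block_sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def best_reference(current_block, reference_blocks):
    """Return (index, sad) of the reference block closest to current_block.

    reference_blocks is ordered from most recent (index 0) to oldest.
    """
    costs = [block_sad(current_block, ref) for ref in reference_blocks]
    index = min(range(len(costs)), key=costs.__getitem__)
    return index, costs[index]


# A flashing pattern: the block two frames back matches, the last one does not.
bright = [[200, 200], [200, 200]]
dark = [[50, 50], [50, 50]]
print(best_reference(bright, [dark, bright, dark]))  # (1, 0)
```

An encoder limited to the most recent frame would have to code a large difference here; with several references it finds an exact match one frame further back.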

Intra Prediction

In H.264, inter-frames (P-frames, B-frames) can also use intra-prediction.

  • Intra-frame (I-frame)
    - encoding by itself (e.g. JPEG)
    - intra-coding
  • Inter-frame (P-frame, B-frame)
    - inter-coding
    - intra-coding

A key feature of H.264 is that inter-frames can also use intra-coding (a.k.a. self-prediction, intra-prediction). The encoder tries the prediction modes (9 kinds) and picks the one whose prediction is most similar to the original.
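To illustrate the mode search, the sketch below implements three of the nine 4*4 modes (vertical, horizontal, DC) and picks the one with the lowest SAD. It is a simplification: the remaining directional modes are omitted, all neighbours are assumed to be available, and the function names are hypothetical.

```python
def predict_4x4(top, left, mode):
    """Build a 4x4 prediction from the 4 pixels above and the 4 to the left."""
    if mode == "vertical":      # copy the row above downwards
        return [list(top) for _ in range(4)]
    if mode == "horizontal":    # copy the left column rightwards
        return [[left[y]] * 4 for y in range(4)]
    if mode == "dc":            # rounded mean of the 8 neighbours
        dc = (sum(top) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(f"unknown mode: {mode}")


def sad(block_a, block_b):
    """Sum of absolute differences between two 4x4 blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def best_intra_mode(block, top, left):
    """Try each mode and return the one whose prediction is closest."""
    modes = ("vertical", "horizontal", "dc")
    return min(modes, key=lambda m: sad(block, predict_4x4(top, left, m)))


# Vertical stripes: each column continues the pixel above it.
top = [10, 80, 10, 80]
left = [10, 10, 10, 10]
block = [[10, 80, 10, 80] for _ in range(4)]
print(best_intra_mode(block, top, left))  # vertical
```

Only the chosen mode and the (here zero) residual need to be coded, instead of the 16 pixel values.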

Deblocking Filter

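Blocking effect: discontinuities at the boundaries between blocks

  • Traditional
    - Deblocking is done after the decoder, a.k.a. post-processing.
    - e.g. a low-pass filter can reduce the blocking effect
    - Disadvantage: we cannot know whether a discontinuity at a boundary is a real edge or a blocking effect
  • H.264
    - The deblocking filter is placed inside the coding loop (an in-loop filter), so it runs on the encoder side as well as in the decoder
    - Advantage: the codec knows where the block boundaries are and how they were coded, so it can tell where the real edges are. The filtered frame also provides a good reference frame for encoding the next frame.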
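The edge-versus-artifact decision can be sketched as a threshold test on the pixels around a boundary. H.264 uses conditions of this form, with thresholds α and β derived from the quantization parameter. In the sketch below the thresholds are passed in directly, and the smoothing taps are simple averages chosen for illustration rather than the standard's filter. `filter_edge` is a hypothetical name.

```python
def filter_edge(p1, p0, q0, q1, alpha, beta):
    """Smooth the two pixels next to a block boundary if it looks artificial.

    p1, p0 are the pixels on one side of the boundary (p0 adjacent to it),
    q0, q1 the pixels on the other side (q0 adjacent to it).
    """
    is_artifact = (abs(p0 - q0) < alpha
                   and abs(p1 - p0) < beta
                   and abs(q1 - q0) < beta)
    if not is_artifact:
        return p0, q0  # large step: treat it as a real edge and keep it
    # Simple averaging taps chosen for illustration, not the standard's.
    new_p0 = (p1 + 2 * p0 + q0 + 2) >> 2
    new_q0 = (p0 + 2 * q0 + q1 + 2) >> 2
    return new_p0, new_q0


print(filter_edge(100, 100, 108, 108, alpha=20, beta=6))  # (102, 106)
print(filter_edge(100, 100, 200, 200, alpha=20, beta=6))  # (100, 200)
```

The small step (100 to 108) is smoothed as a likely blocking artifact, while the large step (100 to 200) is left alone as a real edge.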

Thank you for taking the time to read this article, and I sincerely hope that the information provided proves to be valuable to you. Whether you are a student, professional, or simply someone interested in video communication, it is my utmost wish that these notes enhance your understanding and contribute to your success in this field. Thank you once again, and best of luck on your journey in the world of video communication!
