GStreamer H264 Encryption Plugin

Oguzhan Oztaskin
Jul 4, 2024


I have created a new GStreamer plugin for encrypting and decrypting H264 streams without breaking the stream structure. This way, the encrypted stream can be saved to disk in MP4 format, streamed via RTSP, or have its alignment changed, and it can still be decrypted correctly afterwards. Note that this plugin is experimental; for robust security, consider DRM solutions.

So, why did I bother with this then? Because I have worked on GStreamer a fair bit but I did not have anything to show until now. This project summarizes the most critical things I have learned, which are listed below:

  • Inserting frame perfect metadata into H264 streams and reading it back.
  • Creating GStreamer plugins in C. This requires an understanding of the GLib/GObject type system, which seems to be a rare skill.
  • Understanding the structure of H264 streams and their NAL units, and modifying streams at a low level.
  • Understanding GStreamer and its source code, and debugging it.

It also demonstrates my understanding of cryptography:

  • Modes of encryption: CTR, CBC, ECB etc.
  • Correct use of IVs and encryption in a lossy medium. Encrypted H264 packets can get lost, but this does not break the decryption process.

You can find the repository here.

I will explain how I utilized such knowledge in this project below.

Lock icon by Freepik

Plugin Overview

This plugin contains two new elements: `h264encrypt` and `h264decrypt`. They are used in the most intuitive way:

gst-launch-1.0 videotestsrc pattern=ball ! nvh264enc ! \
h264encrypt iv-seed=31357476677378414 key=01234567012345670123456701234567 encryption-mode=aes-ctr ! \
h264decrypt key=01234567012345670123456701234567 encryption-mode=aes-ctr ! \
nvh264dec ! glimagesink

Here, h264encrypt receives H264-encoded frames and encrypts them with the given key. The iv-seed property seeds the random IV generator, and encryption-mode selects the cipher mode (AES-CTR in this example).

Then, h264decrypt receives the encrypted frames from the h264encrypt element and decrypts them with the given key. This produces the exact same stream back. From this point on, it can be used as a regular H264 stream again.

So, what did we gain here? Nothing; we just encrypted the stream and immediately decrypted it. However, assume that you are streaming over an unsafe medium. Your packets could be intercepted and your video stream could be watched. To avoid that, you can simply do:

# To stream
gst-launch-1.0 videotestsrc pattern=ball ! nvh264enc ! h264parse config-interval=1 ! \
h264encrypt iv-seed=31357476677378414 key=01234567012345670123456701234567 encryption-mode=aes-ctr ! \
h264parse ! rtph264pay ! rtpsink address=127.0.0.1 port=1025

# To receive and view
gst-launch-1.0 rtpsrc address=127.0.0.1 port=1025 encoding-name=H264 ! rtph264depay ! h264parse ! \
h264decrypt key=01234567012345670123456701234567 encryption-mode=aes-ctr ! \
nvh264dec ! glimagesink

With these simple bash commands, you can stream your H264 encoded video over RTP with AES encryption.

How It Works

An H264 stream consists of NAL units. This plugin accepts AU-aligned H264 streams in byte-stream (Annex B) format. That means each buffer h264encrypt receives carries one access unit (AU), that is, the NAL units between two consecutive access unit boundaries. I will call them “frames” from here on, though they need not carry exactly one frame.
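
To make the byte-stream layout concrete, here is a minimal sketch of walking the NAL units of one Annex B buffer and locating the first slice. It is not the plugin's code (the plugin relies on GStreamer's parser); the names NalUnit, split_nal_units and first_slice_index are made up for illustration, and treating nal_unit_type 1 to 5 as "a slice of any kind" is my assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* One NAL unit located inside an Annex B buffer (start code excluded). */
typedef struct {
  size_t offset; /* index of the NAL header byte */
  size_t size;   /* header plus payload, in bytes */
  uint8_t type;  /* nal_unit_type: low 5 bits of the header byte */
} NalUnit;

/* Index of the next 00 00 01 start code at or after `from`, or `len`. */
static size_t find_start_code(const uint8_t *buf, size_t len, size_t from)
{
  for (size_t i = from; i + 3 <= len; i++) {
    if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
      return i;
  }
  return len;
}

/* Assumption: nal_unit_type 1..5 counts as "a slice of any kind". */
static int is_slice_nal(uint8_t type)
{
  return type >= 1 && type <= 5;
}

/* Split an Annex B buffer into NAL units; returns how many were stored. */
size_t split_nal_units(const uint8_t *buf, size_t len,
                       NalUnit *out, size_t max_out)
{
  size_t count = 0;
  size_t sc = find_start_code(buf, len, 0);

  while (sc < len && count < max_out) {
    size_t start = sc + 3; /* first byte after 00 00 01 */
    size_t next, end;

    if (start >= len)
      break; /* start code with nothing after it */
    next = find_start_code(buf, len, start);
    end = next;
    /* A NAL unit never ends in 0x00, so trailing zeros belong to the
     * next (4-byte) start code or to stream padding. */
    while (end > start && buf[end - 1] == 0)
      end--;
    if (end > start) {
      out[count].offset = start;
      out[count].size = end - start;
      out[count].type = buf[start] & 0x1F;
      count++;
    }
    sc = next;
  }
  return count;
}

/* Index into `nals` of the first slice NAL unit, or -1 if there is none. */
int first_slice_index(const NalUnit *nals, size_t count)
{
  for (size_t i = 0; i < count; i++) {
    if (is_slice_nal(nals[i].type))
      return (int) i;
  }
  return -1;
}
```

Called on a buffer holding an access unit delimiter, SPS, PPS and an IDR slice, split_nal_units would report four units and first_slice_index would point at the last one.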

For each such frame, I iterate over all NAL units and detect the first slice NAL unit of any kind (see the last paragraph of this section). These NAL units carry the video data, and their payload is what I encrypt. However, for some encryption modes the key alone is not enough; they also need an initialization vector (IV).

The IV is public data that is needed for both encryption and decryption. Since it is public, I simply put it in the H264 stream as a Supplemental Enhancement Information (SEI) NAL unit so that h264decrypt can later retrieve it and initialize its decryption context.

Thus, each slice-carrying H264 frame has one extra SEI NAL unit that holds the IV used for that frame. This IV is used to initialize the encryption context. The SEI is positioned right before the first slice NAL unit, and all slice NAL unit payloads of that frame are encrypted with that context. That completes the encryption process.

Decryption is similar. h264decrypt iterates over all NAL units, finds the first SEI that carries an IV, and initializes its decryption context with it. Then h264decrypt removes that SEI and decrypts the frame using this context.
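
As an illustration of how an IV can travel in the stream, here is a rough sketch of building and reading back a "user data unregistered" SEI NAL unit (payload type 5) by hand. The plugin itself uses GStreamer's codecparsers for this, so treat it as a picture of the byte layout rather than the actual implementation. The UUID below is hypothetical, the function names are made up, and emulation prevention is left out for brevity, so a real implementation would still have to escape the payload.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SEI_NAL_HEADER 0x06      /* nal_ref_idc = 0, nal_unit_type = 6 (SEI) */
#define SEI_USER_DATA_UNREG 0x05 /* payloadType 5: user data unregistered */
#define UUID_LEN 16

/* Hypothetical UUID; the identifier the plugin really uses is not given here. */
static const uint8_t IV_SEI_UUID[UUID_LEN] = {
  0xA1, 0xB2, 0xC3, 0xD4, 0xE5, 0xF6, 0x17, 0x28,
  0x39, 0x4A, 0x5B, 0x6C, 0x7D, 0x8E, 0x9F, 0x10
};

/* Write an SEI NAL unit (without start code) carrying `iv` into `out`.
 * Returns the number of bytes written, or 0 on failure. */
size_t build_iv_sei(const uint8_t *iv, size_t iv_len,
                    uint8_t *out, size_t cap)
{
  size_t payload_size = UUID_LEN + iv_len;
  size_t total = 3 + payload_size + 1; /* header, type, size, payload, trailer */
  size_t pos = 0;

  /* Only the single-byte size field is handled; payloads of 255 bytes
   * or more would need 0xFF continuation bytes. */
  if (payload_size >= 255 || total > cap)
    return 0;

  out[pos++] = SEI_NAL_HEADER;
  out[pos++] = SEI_USER_DATA_UNREG;
  out[pos++] = (uint8_t) payload_size;
  memcpy(out + pos, IV_SEI_UUID, UUID_LEN);
  pos += UUID_LEN;
  memcpy(out + pos, iv, iv_len);
  pos += iv_len;
  out[pos++] = 0x80; /* rbsp_trailing_bits: stop bit, then zero bits */
  return pos;
}

/* If `nal` is an IV-carrying SEI as built above, copy the IV to `iv_out`,
 * store its length in `iv_len` and return 1. Otherwise return 0. */
int parse_iv_sei(const uint8_t *nal, size_t nal_len,
                 uint8_t *iv_out, size_t iv_cap, size_t *iv_len)
{
  size_t payload_size;

  if (nal_len < 3 || (nal[0] & 0x1F) != 6 || nal[1] != SEI_USER_DATA_UNREG)
    return 0;
  payload_size = nal[2];
  if (payload_size == 0xFF) /* multi-byte size is not handled */
    return 0;
  if (payload_size < UUID_LEN || nal_len < 3 + payload_size)
    return 0;
  if (memcmp(nal + 3, IV_SEI_UUID, UUID_LEN) != 0)
    return 0; /* some other application's user data */
  if (payload_size - UUID_LEN > iv_cap)
    return 0;

  *iv_len = payload_size - UUID_LEN;
  memcpy(iv_out, nal + 3 + UUID_LEN, *iv_len);
  return 1;
}
```

The UUID check is what lets a decryptor tell its own IV-carrying SEI apart from unrelated user data SEIs that an encoder may have inserted.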

Note that a stream can be encrypted and decrypted multiple times. If a frame is encrypted multiple times, multiple IV-carrying SEIs are inserted into the stream, and the SEI of the most recent encryption is always the first IV-carrying SEI counting from the start of the frame. When decrypting multiple times, each pass uses the first IV-carrying SEI it finds to initialize decryption and then removes it from the stream.

Implementation

I took the idea from this discussion and also received help from ndufresne to figure out which part of the H264 stream to encrypt and decrypt. I used the GStreamer plugin template repository to start coding the plugin.

I used the following to implement the functionality described in the previous section:

  • Parsing and iterating NAL units: I used the GStreamer H264 NAL parser (GstH264NalParser) for this.
  • Finding slice NAL unit payload: Same as above. I used gst_h264_parser_parse_slice_hdr to parse the header and then calculated the payload offset and size.
  • Storing IV in H264 Stream: I used the GStreamer codecparsers library to create an SEI of payload type “user data unregistered” and stored the IV in it. This gave me a GstMemory * that contains the SEI NAL unit.
  • Padding: I applied bit padding to each slice payload, as it is the simplest padding scheme to implement.
  • Emulation Prevention Bytes: Occasionally, encrypted slice payloads contain byte sequences that are illegal inside a NAL unit. I inserted emulation prevention bytes (see here) in h264encrypt and removed them in h264decrypt.
  • Random IV generation: I generate random IVs using the random_r function from stdlib.h (a glibc extension), seeded with the “iv-seed” property of h264encrypt. I allowed the use of a seed so that an encrypted stream can be reproduced easily for debugging. It is possible to use other IV generators by responding to the “iv” signal of h264encrypt.
  • AES cipher: I used “Tiny AES in C” implementation provided here.
  • Elements: I used gstreamer plugin template as base repository and used GstBaseTransform as base to implement h264encrypt and h264decrypt elements.
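
The padding and emulation prevention steps from the list above can be sketched in plain C as follows. This is an illustration, not the plugin's code: the function names are made up, I am assuming the common byte-oriented form of bit padding (a 0x80 byte followed by zeros up to a block boundary), and the edge case of a payload that ends in a zero byte is not handled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bit padding: append 0x80, then zeros, up to the next multiple of `block`.
 * Returns the padded length, or 0 if `cap` is too small. */
size_t pad_bits(uint8_t *buf, size_t len, size_t cap, size_t block)
{
  size_t padded = (len / block + 1) * block;

  if (padded > cap)
    return 0;
  buf[len] = 0x80;
  memset(buf + len + 1, 0, padded - len - 1);
  return padded;
}

/* Undo pad_bits: returns the original length, or (size_t) -1 if the
 * 0x80 marker is missing. */
size_t unpad_bits(const uint8_t *buf, size_t len)
{
  while (len > 0 && buf[len - 1] == 0)
    len--;
  if (len == 0 || buf[len - 1] != 0x80)
    return (size_t) -1;
  return len - 1;
}

/* Insert 0x03 after every 00 00 that is followed by a byte <= 0x03, so
 * the output never contains 00 00 00, 00 00 01 or 00 00 02.
 * Returns the output length, or 0 if `cap` is too small. */
size_t escape_emulation(const uint8_t *in, size_t len,
                        uint8_t *out, size_t cap)
{
  size_t pos = 0, zeros = 0;

  for (size_t i = 0; i < len; i++) {
    if (zeros == 2 && in[i] <= 3) {
      if (pos >= cap)
        return 0;
      out[pos++] = 0x03; /* emulation prevention byte */
      zeros = 0;
    }
    if (pos >= cap)
      return 0;
    out[pos++] = in[i];
    zeros = (in[i] == 0) ? zeros + 1 : 0;
  }
  return pos;
}

/* Remove every 0x03 that directly follows 00 00.
 * Returns the output length, or 0 if `cap` is too small. */
size_t unescape_emulation(const uint8_t *in, size_t len,
                          uint8_t *out, size_t cap)
{
  size_t pos = 0, zeros = 0;

  for (size_t i = 0; i < len; i++) {
    if (zeros >= 2 && in[i] == 3) {
      zeros = 0; /* drop the emulation prevention byte */
      continue;
    }
    if (pos >= cap)
      return 0;
    out[pos++] = in[i];
    zeros = (in[i] == 0) ? zeros + 1 : 0;
  }
  return pos;
}
```

On the encryption side the order would be pad, encrypt, then escape; the decryption side reverses it with unescape, decrypt, then unpad. Escaping can grow a payload by up to half its size, so the output buffer needs that much headroom.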

Limitations

  • It is very slow as encryption is on CPU and single threaded.
  • This is a non-standard encryption scheme for H264. Thus, you have to use this plugin on both the streaming and the receiving side to encrypt and decrypt the stream. Existing tools will not support it.
  • This scheme is experimental and comes with no guarantee. You might want to look into DRM and/or other tools for this task.
  • The key property is visible when the plugin is used with gst-launch-1.0, which can leak the key in several ways (e.g. the ps command lists the gst-launch-1.0 arguments). Use the plugin with gst-launch-1.0 only for debugging. The correct way to use the plugin is inside a GStreamer application. The key property is write-only and not visible in bin graphs.
  • Absolutely no guarantee of any sort.

Afterthoughts

The implementation was not hard: GstBaseTransform was a good starting point for writing the elements, the GStreamer H264 NAL parser was easy to understand, and I received help on the GStreamer Discourse forum. There were some confusing offset calculations, and for some things I only had a struct member name to guess what it was and how it was used. However, I did not lose much time on them. In the end, this plugin worked as I intended: the stream is encrypted before streaming over RTP, decryption still works after stream properties change, and packet loss does not break it. It is just a bit slower than I expected. Maybe a faster algorithm or AES implementation could be used, or AES-CTR mode could be parallelized for some speed-up. However, these are out of scope for this project.

I hope you enjoyed the read.

If you missed it, here is the repository.
