H.265 on WebRTC without Using DataChannel (1/2)

Angcar
6 min read · Oct 20, 2023



Environment

Backend

Frontend

  • VueJS + TypeScript

What is WebRTC?

WebRTC (Web Real-Time Communication) is a free and open-source project providing web browsers and mobile applications with real-time communication (RTC) via application programming interfaces (APIs). It allows audio and video communication to work inside web pages by allowing direct peer-to-peer communication, eliminating the need to install plugins or download native apps.

https://en.wikipedia.org/wiki/WebRTC

I use WebRTC to stream video and audio from my Raspberry Pi camera. To save network bandwidth, and out of personal interest, I want it to support H.265. Chrome now natively supports H.265 hardware decoding, so if I can carry H.265 video over WebRTC, the rest seems feasible.

H.265 on WebRTC

WebRTC has no native support for H.265. Almost every solution on the internet passes the media data through the data channel. However, I have two reasons to reject that approach.

  1. I want to use the same receiving flow for all codecs on the client side.
  2. RTP and RTCP provide QoS mechanisms that I want to keep.

First Try — Support H.265 on SDP — Failed…

My first question is:

RTP supports H.265, and Chrome can play H.265. Why not let RTP send H.265?

During the WebRTC handshake, a client collects its capabilities into an offer and sends it to the WebRTC server to negotiate the codec that will be used for transmission. The offer contains an SDP that describes all supported codecs, like the following.

The m=audio and m=video lines split the SDP into media sections, and the numbers at the end of each of these lines are RTP payload types. There is no predefined payload type for H.265, whereas H.264 and VP9 have conventional payload types.

v=0
o=- 266327... 2 IN IP4 127.0.0.1
s=-
t=0 0
a=group:BUNDLE 0 1 2
a=extmap-allow-mixed
a=msid-semantic: WMS
...
m=audio 9 UDP/TLS/RTP/SAVPF 111 63 9 0 8 13 110 126
...
a=rtpmap:111 opus/48000/2
a=rtcp-fb:111 transport-cc
a=fmtp:111 minptime=10;useinbandfec=1
a=rtpmap:63 red/48000/2
a=fmtp:63 111/111
a=rtpmap:9 G722/8000
a=rtpmap:0 PCMU/8000
...
m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 35 36 37 38 102 103 104 105 106 107 108 109 127 125 39 40 41 42 43 44 45 46 47 48 112 113 114 115 116 117 118 49
...
a=rtcp-fb:98 ccm fir
a=rtcp-fb:98 nack
a=rtcp-fb:98 nack pli
a=fmtp:98 profile-id=0
a=rtpmap:99 rtx/90000
a=fmtp:99 apt=98
a=rtpmap:100 VP9/90000
a=rtcp-fb:100 goog-remb
a=rtcp-fb:100 transport-cc
...
m=application 9 UDP/DTLS/SCTP webrtc-datachannel
...
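To make the payload type numbering concrete, here is a minimal sketch that lists the payload types of each media section and maps them to codec names through the a=rtpmap lines. The names MediaSection and parseMediaSections are hypothetical helpers for illustration, not part of any WebRTC API, and the parser only understands the two line kinds discussed here.

```typescript
// One m= section of an SDP, reduced to what matters for codec negotiation.
interface MediaSection {
  kind: string; // "audio", "video", "application", ...
  payloadTypes: number[]; // numbers at the end of the m= line
  codecs: Map<number, string>; // payload type -> codec name from a=rtpmap
}

// Hypothetical helper: split an SDP into media sections and collect codecs.
function parseMediaSections(sdp: string): MediaSection[] {
  const sections: MediaSection[] = [];
  let current: MediaSection | undefined;

  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith("m=")) {
      // m=<media> <port> <proto> <fmt> <fmt> ...
      const [kind = "", , , ...formats] = line.slice(2).split(" ");
      current = {
        kind,
        // The data channel section lists "webrtc-datachannel" instead of numbers.
        payloadTypes: formats.filter((f) => /^\d+$/.test(f)).map(Number),
        codecs: new Map(),
      };
      sections.push(current);
      continue;
    }
    // a=rtpmap:<payload type> <codec name>/<clock rate>[/<channels>]
    const rtpmap = /^a=rtpmap:(\d+) ([^/]+)\//.exec(line);
    if (rtpmap && current) {
      current.codecs.set(Number(rtpmap[1]), rtpmap[2] ?? "");
    }
  }
  return sections;
}
```

Running it on the offer above would show that no entry in the video section maps to H265, which is the gap the first attempt tries to fill by hand.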

The idea is to manually add an H.265 payload type (I arbitrarily choose 110) to the client SDP, and to register only H.265 in Pion with the RegisterCodec function, which forces WebRTC to use H.265.

The code is as follows.

// client

// The offer also contains a data channel section (m=application),
// so split the SDP there: the H.265 lines must end up inside the m=video section.
const dataChannelLine = "m=application 9 UDP/DTLS/SCTP webrtc-datachannel"
let s = offer.sdp.split(dataChannelLine)

// add payload type 110 at the end of the m=video line
s[0] = s[0].replace("m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 35 36 37 38 102 103 104 105 106 107 108 109 127 125 39 40 41 42 43 44 45 46 47 48 112 113 114 115 116 117 118 49",
"m=video 9 UDP/TLS/RTP/SAVPF 96 97 98 99 100 101 35 36 37 38 102 103 104 105 106 107 108 109 127 125 39 40 41 42 43 44 45 46 47 48 112 113 114 115 116 117 118 49 110")
// map payload type 110 to H.265 and declare its RTCP feedback
s[0] = s[0] + "a=rtpmap:110 H265/90000\r\na=rtcp-fb:110 goog-remb\r\na=rtcp-fb:110 transport-cc\r\na=rtcp-fb:110 ccm fir\r\na=rtcp-fb:110 nack\r\na=rtcp-fb:110 nack pli\r\n"
// re-join the SDP
offer.sdp = s.join(dataChannelLine)

// server

// Do not register the default codecs; only H.265 should be offered.
// m.RegisterDefaultCodecs()

// Register H.265 with payload type 110 (m is the webrtc.MediaEngine).
if err := m.RegisterCodec(webrtc.RTPCodecParameters{
    RTPCodecCapability: webrtc.RTPCodecCapability{
        MimeType:    webrtc.MimeTypeH265,
        ClockRate:   90000,
        Channels:    0,
        SDPFmtpLine: "",
        RTCPFeedback: []webrtc.RTCPFeedback{
            {Type: "goog-remb", Parameter: ""},
            {Type: "ccm", Parameter: "fir"},
            {Type: "nack", Parameter: ""},
            {Type: "nack", Parameter: "pli"},
        },
    },
    PayloadType: 110,
}, webrtc.RTPCodecTypeVideo); err != nil {
    panic(err)
}
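The client-side edit above hard-codes Chrome's full m=video line, so it stops matching as soon as the browser changes its payload type list. Below is a minimal sketch of the same experiment that locates the video section by prefix instead. The helper name addVideoCodec and its parameters are hypothetical, and it assumes CRLF line endings as in the offer shown earlier. It only makes the SDP edit less brittle; it does not change whether the browser accepts the codec.

```typescript
// Hypothetical helper: append a payload type to the m=video line and add its
// a=rtpmap / a=rtcp-fb lines at the end of the video section.
function addVideoCodec(
  sdp: string,
  payloadType: number,
  rtpmap: string, // e.g. "H265/90000"
  feedback: string[], // e.g. ["nack", "nack pli"]
): string {
  const lines = sdp.split("\r\n");
  const start = lines.findIndex((l) => l.startsWith("m=video "));
  if (start === -1) {
    return sdp; // no video section, nothing to do
  }

  // The video section ends at the next m= line, or at the end of the SDP.
  let end = lines.findIndex((l, i) => i > start && l.startsWith("m="));
  if (end === -1) {
    // Keep the trailing empty element so the result still ends with CRLF.
    end = lines[lines.length - 1] === "" ? lines.length - 1 : lines.length;
  }

  const extra = [
    `a=rtpmap:${payloadType} ${rtpmap}`,
    ...feedback.map((f) => `a=rtcp-fb:${payloadType} ${f}`),
  ];

  lines[start] = `${lines[start]} ${payloadType}`;
  lines.splice(end, 0, ...extra);
  return lines.join("\r\n");
}
```

With this helper, the edit from the client snippet would read roughly as offer.sdp = addVideoCodec(offer.sdp, 110, "H265/90000", ["goog-remb", "transport-cc", "ccm fir", "nack", "nack pli"]).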

Result

The negotiation is successful, and WebRTC tries to use the H.265 payload type!

However, because Pion lacks an H.265 payloader, it reports the error "Set local description failed: the requested codec does not have a payloader". There is a PR that adds an H.265 payloader, but it is still open.

On the client side, Chrome rejects the unrecognized H.265 codec with the error "Uncaught (in promise) DOMException: Failed to execute 'setLocalDescription' on 'RTCPeerConnection': Failed to set local offer sdp: Failed to set local video description recv parameters for m-section with mid='1'".

Changing the SDP and using a customized H.265 payload type is therefore not viable.


Second Try — Packetize H.265 in H.264 — 👍👍

After the first try, I start reading the code of the H.264 payloader.

I notice that there are two distinct branches in the Payload function. The first branch handles keyframes, which contain SPSs and PPSs, while the second branch handles P and B frames. The frame types are distinguished by finding the start codes (00 00 00 01 or 00 00 01) and checking the NALU type that follows each one.
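The start-code scan described above can be sketched in a few lines. This is an illustrative re-implementation, not Pion's emitNalus; the names findStartCodes and splitNalus are hypothetical.

```typescript
// Position and length (3 or 4 bytes) of one Annex B start code.
interface StartCode {
  offset: number;
  length: number;
}

// Find every 00 00 01 and 00 00 00 01 start code in the buffer.
function findStartCodes(data: Uint8Array): StartCode[] {
  const found: StartCode[] = [];
  let i = 0;
  while (i + 2 < data.length) {
    if (data[i] === 0 && data[i + 1] === 0) {
      if (data[i + 2] === 1) {
        found.push({ offset: i, length: 3 });
        i += 3;
        continue;
      }
      if (i + 3 < data.length && data[i + 2] === 0 && data[i + 3] === 1) {
        found.push({ offset: i, length: 4 });
        i += 4;
        continue;
      }
    }
    i++;
  }
  return found;
}

// Split a frame into NALUs: each one runs from the end of its start code
// to the beginning of the next start code (or the end of the buffer).
function splitNalus(data: Uint8Array): Uint8Array[] {
  const codes = findStartCodes(data);
  return codes.map((code, index) => {
    const next = codes[index + 1];
    const end = next ? next.offset : data.length;
    return data.subarray(code.offset + code.length, end);
  });
}
```

A keyframe therefore yields several NALUs (parameter sets plus picture data), while a frame with a single start code yields exactly one.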

Finally, the Single NALU part catches my attention.

func (p *H264Payloader) Payload(mtu uint16, payload []byte) [][]byte {
    ...
    emitNalus(payload, func(nalu []byte) {
        if len(nalu) == 0 {
            return
        }

        naluType := nalu[0] & naluTypeBitmask
        naluRefIdc := nalu[0] & naluRefIdcBitmask

        switch {
        case naluType == audNALUType || naluType == fillerNALUType:
            return
        case naluType == spsNALUType:
            p.spsNalu = nalu
            return
        case naluType == ppsNALUType:
            p.ppsNalu = nalu
            return
        case p.spsNalu != nil && p.ppsNalu != nil:
            // Pack current NALU with SPS and PPS as STAP-A
            ...
        }

        // Single NALU
        if len(nalu) <= int(mtu) {
            out := make([]byte, len(nalu))
            copy(out, nalu)
            payloads = append(payloads, out)
            return
        }
    })
    ...
    return payloads
}

What will happen if I let an H.265 frame pretend to be a single NALU H.264 frame?

The H.265 video codec shares the same start codes as H.264, but the NALU type enumerations are different. The Payload function for a single NALU frame only checks for a single start code pattern.
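To illustrate the difference in NALU type enumerations: an H.264 NALU header is one byte with the type in its low 5 bits, while an H.265 NALU header is two bytes with the type in bits 1 to 6 of the first byte. The sketch below shows what an H.264-style mask reads from the first byte of some H.265 NALUs. The function names are hypothetical.

```typescript
// H.264: forbidden_zero_bit (1) | nal_ref_idc (2) | nal_unit_type (5)
function h264NaluType(firstByte: number): number {
  return firstByte & 0x1f;
}

// H.265: forbidden_zero_bit (1) | nal_unit_type (6) | first bit of nuh_layer_id
function h265NaluType(firstByte: number): number {
  return (firstByte >> 1) & 0x3f;
}

// First header byte of common H.265 NALUs (layer id 0).
const h265Examples: Array<{ label: string; firstByte: number }> = [
  { label: "VPS (type 32)", firstByte: 0x40 },
  { label: "SPS (type 33)", firstByte: 0x42 },
  { label: "PPS (type 34)", firstByte: 0x44 },
  { label: "IDR_W_RADL (type 19)", firstByte: 0x26 },
];

// For each example, report the real H.265 type next to what an H.264 parser sees.
function describeMisread(): string[] {
  return h265Examples.map(
    (e) =>
      `${e.label}: H.265 type ${h265NaluType(e.firstByte)}, ` +
      `read as H.264 type ${h264NaluType(e.firstByte)}`,
  );
}
```

An H.264 parser never sees type 7 (SPS) or 8 (PPS) for these bytes, so it cannot recognize the H.265 parameter sets for what they are.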

Therefore, sending all H.265 frames directly to the H.264 payloader is not a correct approach. Keyframes that include VPSs, SPSs, and PPSs require some transformation before being sent, or the VPS/SPS/PPS will be ignored because of the different NALU type enumerations.

Apply a Fake NALU Type

Figure: Append a fake header on keyframes

In the image above, I add a fake header (the green block) and change all original start codes (the gray blocks inside the blue blocks) to 00 00 00 00. The arbitrary NALU type 99 00 serves as a marker that identifies whether a frame has been repacked.

To make the frames recoverable, the header carries a start code position string such as 0_28_28_37_65_13_78_49089, which stands for [vps start index]_[vps size]_[sps start index]_[sps size]_[pps start index]_[pps size]_[frame start index]_[frame size]. So that the reader knows where the string ends, its size is stored in the 2 bytes before it.
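The position string and its 2-byte size prefix can be sketched as follows. This is a sketch under stated assumptions, not the author's implementation: the function and type names are hypothetical, and the size prefix is assumed to be big-endian, which the text does not specify. The fake header bytes and the rewriting of start codes are left out because their exact layout is only shown in the image.

```typescript
// Positions of the parameter sets and the picture data inside one keyframe.
interface KeyframePositions {
  vpsStart: number;
  vpsSize: number;
  spsStart: number;
  spsSize: number;
  ppsStart: number;
  ppsSize: number;
  frameStart: number;
  frameSize: number;
}

// Build the underscore-separated position string.
function encodePositions(p: KeyframePositions): string {
  return [
    p.vpsStart, p.vpsSize, p.spsStart, p.spsSize,
    p.ppsStart, p.ppsSize, p.frameStart, p.frameSize,
  ].join("_");
}

// Parse the position string back; exactly eight non-negative integers.
function decodePositions(text: string): KeyframePositions {
  if (!/^\d+(_\d+){7}$/.test(text)) {
    throw new Error(`malformed position string: ${text}`);
  }
  const [vpsStart = 0, vpsSize = 0, spsStart = 0, spsSize = 0,
    ppsStart = 0, ppsSize = 0, frameStart = 0, frameSize = 0] =
    text.split("_").map(Number);
  return { vpsStart, vpsSize, spsStart, spsSize, ppsStart, ppsSize, frameStart, frameSize };
}

// Write the string preceded by its size in 2 bytes (assumed big-endian).
// The string only contains digits and underscores, so one byte per character.
function writeSizedString(text: string): Uint8Array {
  if (text.length > 0xffff) {
    throw new Error("string too long for a 2-byte size");
  }
  const out = new Uint8Array(2 + text.length);
  new DataView(out.buffer).setUint16(0, text.length); // big-endian by default
  for (let i = 0; i < text.length; i++) {
    out[2 + i] = text.charCodeAt(i);
  }
  return out;
}

// Read a size-prefixed string starting at offset; next is the first byte after it.
function readSizedString(buf: Uint8Array, offset = 0): { text: string; next: number } {
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  const size = view.getUint16(offset); // throws RangeError if the prefix is cut off
  const end = offset + 2 + size;
  if (end > buf.length) {
    throw new Error("truncated position string");
  }
  const text = Array.from(buf.subarray(offset + 2, end), (c) => String.fromCharCode(c)).join("");
  return { text, next: end };
}
```

In the example string, each start index equals the previous start plus the previous size (0 + 28 = 28, 28 + 37 = 65, 65 + 13 = 78), so the receiver can cut the keyframe back into VPS, SPS, PPS, and picture data.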

That’s all. 👍👍


Result

When establishing a WebRTC connection, I always negotiate H.264 and send the H.265 frames to the H.264 payloader. Keyframes get a fake header, while P frames are sent unchanged because each of them contains only one start code. The overhead depends on the keyframe interval, which typically ranges from 1 to 10 seconds or more, so it is usually minimal.

In the next post, I will show you how to display H.265 frames in Chrome.
