Intro to WebRTC: The Open Source Project Powering Our Video Technology (Part I)
Sara Garner, Software Engineer at Houseparty
Video technology is at our product’s core. We leverage WebRTC, an open source technology, to delight Houseparty users who spend 51 minutes chatting in the app, on average, daily. The good news is that WebRTC has made it possible for anyone to build video technology, no matter how big or small the project.
This post is for people who are intrigued by video technology but unsure what resources are available for building their own video applications.
In this post, the first of our two-part intro to WebRTC, we’ll introduce the technology, explain how to create a basic video capture web app, and upgrade it to establish a peer-to-peer video call.
What is WebRTC?
WebRTC is an open-source project that provides APIs to send almost any type of real-time data, but we will be focusing on how it enables users to send audio and video.
WebRTC was initially started to bring real-time communication capabilities to browsers, but was later expanded to include other types of devices. The initiative involved many companies and people. In the end, it defined a set of standards and new APIs to be implemented by browsers, along with an open-source implementation that anyone can reuse in their mobile applications or servers.
Prior to WebRTC, getting access to audio and video technology was difficult and expensive, and would have required a whole team of developers. With WebRTC, a single developer can build a basic video chat app.
There are third-party solutions that provide higher-level APIs so that you can more easily use WebRTC out of the box, but it is still important to understand what happens under the hood. Knowing what these platforms are doing at the core will help with debugging and will be very useful as our app gets more complicated. Besides, the technology is pretty interesting.
We can start playing with WebRTC in our own browser, and then work our way up to peer-to-peer calls, and finally multi-party video chat.
Building Browser Video “Chat”
The easiest way to get started with WebRTC is by building a simple browser app that connects to the computer’s camera and microphone, and shows the video stream on the screen.
This introduces us to our first WebRTC API: getUserMedia. This API allows us to stream audio and video, which we will refer to as media, from the user’s media devices, in this case, their camera and microphone.
It also handles getting permission from the user to access these devices. This API can be used to request video in different resolutions if available, as well as from the front-facing or rear-facing camera if this API is called from a smartphone.
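To make this concrete, here is a minimal TypeScript sketch of how such a request might be put together. The helper names (buildCameraConstraints, startLocalPreview) and the small MediaDevicesLike interface are our own assumptions for illustration. The only WebRTC call involved is getUserMedia, which in a browser lives on navigator.mediaDevices.

```typescript
// Hypothetical helper types for this sketch; they are not part of WebRTC itself.
type CameraFacing = "user" | "environment"; // front-facing or rear-facing camera

interface CameraRequest {
  width: number;
  height: number;
  facing: CameraFacing;
}

// A small subset of the constraints object that getUserMedia accepts.
interface PreviewConstraints {
  audio: boolean;
  video: {
    width: { ideal: number };
    height: { ideal: number };
    facingMode: CameraFacing;
  };
}

// Minimal view of navigator.mediaDevices, so the sketch does not depend on browser typings.
interface MediaDevicesLike<S> {
  getUserMedia(constraints: PreviewConstraints): Promise<S>;
}

function buildCameraConstraints(req: CameraRequest): PreviewConstraints {
  return {
    audio: true,
    video: {
      // "ideal" is a preference: the browser may pick the closest resolution it supports
      width: { ideal: req.width },
      height: { ideal: req.height },
      facingMode: req.facing,
    },
  };
}

// Triggers the permission prompt and resolves with the media stream once the user accepts.
function startLocalPreview<S>(devices: MediaDevicesLike<S>, req: CameraRequest): Promise<S> {
  return devices.getUserMedia(buildCameraConstraints(req));
}

// In a browser, assuming the page contains <video id="preview" autoplay muted>:
//
//   startLocalPreview(navigator.mediaDevices, { width: 1280, height: 720, facing: "user" })
//     .then((stream) => {
//       (document.getElementById("preview") as HTMLVideoElement).srcObject = stream;
//     });
```

If the user denies permission, the promise rejects, so a real app would also want a catch handler that explains what went wrong.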
An important part of a video chat app is that people are chatting with each other, not just talking to themselves, as in the camera preview app above. To do this, we’ll need to set up a connection between the clients, have them agree on how they will send the media, and then send it.
Building Peer-to-Peer Video Chat
WebRTC is designed to allow sending audio and video using peer-to-peer connections.
A peer-to-peer network is created when two or more devices are connected, without a server in between. Unlike other networks, where there is a client and a server, in peer-to-peer networks, each client is both a client and a server.
This means that if Alice is trying to send media to Bob, Alice doesn’t send the media to a server, which then finds Bob’s address, and forwards the data to him. Rather, Alice makes a direct connection to Bob, and he receives the media directly from Alice, and vice-versa.
To create this peer-to-peer connection so that Alice and Bob can video chat, WebRTC provides an API called RTCPeerConnection.
The first thing each client needs to do in order to send media is call the getUserMedia API. This gives the client access to the microphone and camera, and the resulting stream can then be connected to the RTCPeerConnection.
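As a rough sketch of that step, connecting a captured stream to the connection usually means adding each of its tracks with addTrack. The interfaces below (LocalTrack, LocalStream, TrackSink) and the helper attachLocalMedia are assumptions made for this example; they mirror only the parts of MediaStream and RTCPeerConnection that the sketch needs.

```typescript
// Minimal, hypothetical views of the browser objects involved.
interface LocalTrack {
  kind: string; // "audio" or "video"
}

interface LocalStream {
  getTracks(): LocalTrack[];
}

interface TrackSink {
  addTrack(track: LocalTrack, stream: LocalStream): unknown;
}

// Adds every captured track to the connection and returns the kinds that were added.
function attachLocalMedia(peer: TrackSink, stream: LocalStream): string[] {
  return stream.getTracks().map((track) => {
    peer.addTrack(track, stream);
    return track.kind;
  });
}

// In a browser this could be used roughly as follows:
//
//   const peer = new RTCPeerConnection();
//   navigator.mediaDevices.getUserMedia({ audio: true, video: true })
//     .then((stream) => attachLocalMedia(peer, stream));
```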
Now that we have this RTCPeerConnection, we’ll need to use it to have both clients agree on how they will be sending the media. One of the most important things they’ll need to agree upon is the codec for sending the media.
A codec is an algorithm for compressing (encoding) and decompressing (decoding) media so that it can be sent much faster; popular codecs include H.264 for video and MP3 for audio. WebRTC supports many codecs, such as VP8 and H.264 for video and Opus for audio, so the clients will need to agree on a format both of them can encode and decode.
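The idea of agreeing on a codec can be illustrated with a few lines of TypeScript. This is a deliberately simplified sketch: in practice the browser performs this negotiation internally, and the function name pickSharedCodec and the codec lists are assumptions chosen for the example.

```typescript
// Returns the first codec in the offerer's preference order that the other side
// also supports, or undefined when the two lists have nothing in common.
function pickSharedCodec(offered: string[], supported: string[]): string | undefined {
  const known = supported.map((codec) => codec.toLowerCase());
  for (const codec of offered) {
    if (known.indexOf(codec.toLowerCase()) !== -1) {
      return codec;
    }
  }
  return undefined;
}

// Alice prefers VP9, then H264, then VP8; Bob can only handle VP8 and H264.
const videoChoice = pickSharedCodec(["VP9", "H264", "VP8"], ["VP8", "H264"]);
// videoChoice is "H264": the first codec Alice offered that Bob also supports
```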
This information can be described in a standard format called Session Description Protocol (SDP). Exchanging this information between the clients is called signaling, and is typically done using traditional protocols like HTTP or WebSockets.
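Here is a hedged sketch of what that exchange could look like in TypeScript. The method names on NegotiatingPeer (createOffer, createAnswer, setLocalDescription, setRemoteDescription) match methods that RTCPeerConnection exposes, but the interfaces themselves, the SendSignal transport, and the three helper functions are assumptions for illustration. How the payload travels (HTTP, WebSockets, or something else) is up to the app.

```typescript
// Simplified, hypothetical view of a session description.
interface SessionDescriptionLike {
  type: string; // "offer" or "answer"
  sdp?: string; // the SDP text describing codecs and other media parameters
}

// The subset of RTCPeerConnection used during offer/answer negotiation.
interface NegotiatingPeer {
  createOffer(): Promise<SessionDescriptionLike>;
  createAnswer(): Promise<SessionDescriptionLike>;
  setLocalDescription(desc: SessionDescriptionLike): Promise<void>;
  setRemoteDescription(desc: SessionDescriptionLike): Promise<void>;
}

// Hypothetical signaling transport: delivers a string to the other client.
type SendSignal = (payload: string) => void;

// Caller: create an offer, store it locally, then send it over signaling.
function sendOffer(peer: NegotiatingPeer, send: SendSignal): Promise<void> {
  return peer.createOffer().then((offer) =>
    peer.setLocalDescription(offer).then(() => send(JSON.stringify(offer)))
  );
}

// Callee: apply the received offer, create an answer, store it, and send it back.
function answerOffer(peer: NegotiatingPeer, payload: string, send: SendSignal): Promise<void> {
  const offer = JSON.parse(payload) as SessionDescriptionLike;
  return peer
    .setRemoteDescription(offer)
    .then(() => peer.createAnswer())
    .then((answer) =>
      peer.setLocalDescription(answer).then(() => send(JSON.stringify(answer)))
    );
}

// Caller again: apply the answer, which completes the negotiation.
function acceptAnswer(peer: NegotiatingPeer, payload: string): Promise<void> {
  return peer.setRemoteDescription(JSON.parse(payload) as SessionDescriptionLike);
}
```

Each side ends up with a local and a remote description, and the codec both clients will use is derived from that pair.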
Finally, to actually establish the peer-to-peer connection between the clients, the RTCPeerConnection uses the Interactive Connectivity Establishment (ICE) protocol. Essentially, a client sends all of its ICE candidates (the IP addresses and ports it has available for sending media) to the other client. That client then checks its own ICE candidates against the ones it received to find a compatible pair, and once a working combination is found, the clients can send media over this connection.
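The candidate exchange can be sketched in TypeScript as well. The onicecandidate handler and the addIceCandidate method are real parts of RTCPeerConnection; the IcePeer interface, the two helper functions, and the string-based signaling callback are assumptions made to keep the example small.

```typescript
// Simplified, hypothetical view of an ICE candidate as it travels over signaling.
interface IceCandidateLike {
  candidate: string; // encodes the IP address, port, and transport
  sdpMid?: string | null;
  sdpMLineIndex?: number | null;
}

interface IceEventLike {
  candidate: IceCandidateLike | null;
}

// The subset of RTCPeerConnection used for exchanging candidates.
interface IcePeer {
  onicecandidate: ((event: IceEventLike) => void) | null;
  addIceCandidate(candidate: IceCandidateLike): Promise<void>;
}

// Sends each locally discovered candidate to the other client as it appears.
function forwardLocalCandidates(peer: IcePeer, send: (payload: string) => void): void {
  peer.onicecandidate = (event) => {
    // A null candidate means the browser has finished gathering candidates
    if (event.candidate) {
      send(JSON.stringify(event.candidate));
    }
  };
}

// Hands a candidate received over signaling to the local connection,
// which then tests it against its own candidates.
function applyRemoteCandidate(peer: IcePeer, payload: string): Promise<void> {
  return peer.addIceCandidate(JSON.parse(payload) as IceCandidateLike);
}
```

Both clients run both functions, since each side has its own candidates to share and the other side’s candidates to try.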
Everything discussed up to this point gives us a good foundation for building one-to-one video chat.
Next, we’ll want to build multi-party video chat like Houseparty, where many people can talk to each other in one conversation. We’ve described the fundamentals of WebRTC’s APIs and protocols, so that we can discuss the architecture required for scaling it up in Part Two.
No matter how you choose to leverage WebRTC in your video chat app, you now understand the core of this impressive technology. The possibilities are endless: go build and improve communication everywhere!
Want to take the next step?
Stay tuned for part 2 of our WebRTC intro, where we will discuss building the architecture for a multi-party app like Houseparty.
About Sara Garner
Sara is a Software Engineer at Houseparty, where she works on the Infrastructure team. Sara joined Houseparty to play a central role in the creation of great experiences for users, from allowing fun, face-to-face connections to prioritizing trust and safety. Sara built the foundation for customized fun facts in the app and automation of our T&S platform. Working on projects like these while keeping Houseparty’s large, complex system running is a rewarding experience unique to the flexibility of a startup.