Communicate in new ways with WebRTC

Oliver Lindberg · Published in net magazine
9 min read · May 20, 2016

Shwetank Dixit introduces WebRTC and explains how it can be used to enable real-time video communication in applications

WebRTC is an amazing technology that enables peer-to-peer communication for the web and a whole host of next-generation, real-time web applications, including video-chat apps, web-based multiplayer games and peer-to-peer file-sharing applications. Let’s take a look at some lesser-known but useful things to keep in mind while building applications with WebRTC.

getUserMedia

Getting access to the camera and microphone is easy with getUserMedia. However, it’s possible to use various other web technologies to enhance the experience. Most people opt for the same approach when displaying the camera output: simply showing it in a rectangle. But there are ways to manipulate the camera output and add effects to it in real time.

For example, you can create a separate <canvas> element and copy the camera output to it, where it can then be manipulated. However, there is a much more straightforward way to create some simple yet impressive effects: plain CSS. You can easily apply effects like CSS filters, masks and blend modes to the <video> element displaying your camera output.
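To give a rough idea of the canvas approach, here is a minimal sketch. It assumes a <video id="thevid"> and a <canvas id="thecanvas"> of the same size already exist on the page; it simply copies each video frame onto the canvas, where the pixels can be read and modified before the next frame is drawn.

var video = document.querySelector('#thevid');
var canvas = document.querySelector('#thecanvas');
var ctx = canvas.getContext('2d');

function drawFrame() {
  // Copy the current video frame onto the canvas
  ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
  // Here you could process the pixels with ctx.getImageData()
  // and ctx.putImageData() before scheduling the next frame
  requestAnimationFrame(drawFrame);
}

video.addEventListener('play', drawFrame);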

Let’s look at how we can apply CSS filters to the camera output. First, mark up the video and the controls we want on our page:

<video id="thevid" autoplay="true"></video>

<div id="sepia" class="controloption">Sepia
  <input type="range" id="sepiacontrol" name="sepia"
    value="0" min="0" max="100" />
</div>
<div id="saturate" class="controloption">Saturate
  <input type="range" id="saturatecontrol" name="saturate"
    value="0" min="0" max="100" />
</div>

… and so on for all the other CSS filters.

Now let’s get the camera output going.

var video = document.querySelector('#thevid');

navigator.getUserMedia = navigator.getUserMedia ||
  navigator.webkitGetUserMedia || navigator.mozGetUserMedia ||
  navigator.msGetUserMedia;
window.URL = window.URL || window.webkitURL || window.mozURL ||
  window.msURL;

navigator.getUserMedia({audio: true, video: true}, gotStream, logError);

function gotStream(stream) {
  // Firefox used mozSrcObject; other browsers take a blob URL
  if (video.mozSrcObject !== undefined) {
    video.mozSrcObject = stream;
  } else {
    video.src = (window.URL && window.URL.createObjectURL(stream)) || stream;
  }
}

function logError(e) {
  console.log(e);
}

Now we can attach event listeners to all the controls.

var grayscale = document.querySelector('#grayscalecontrol');
var sepia = document.querySelector('#sepiacontrol');

grayscale.oninput = function (e) {
  video.style['-webkit-filter'] = 'grayscale(' + this.value + '%)';
  video.style['filter'] = 'grayscale(' + this.value + '%)';
};
sepia.oninput = function (e) {
  video.style['-webkit-filter'] = 'sepia(' + this.value + '%)';
  video.style['filter'] = 'sepia(' + this.value + '%)';
};

You will find the complete code in the GitHub repository. It is also possible to use other properties, such as blend modes and 2D and 3D transforms, to liven up the camera experience. For example, try adding the following CSS to the <video> element displaying the camera output, in a browser that supports CSS blend modes and has them enabled.

body {
  background-color: red;
}
video {
  mix-blend-mode: multiply;
}

With this you can see how easy it is to apply great effects in real time using CSS. It’s also a good idea to provide some subtle UI enhancements to the camera output, like a small box-shadow on the video (you can see this employed on sites like appear.in).

Try the following CSS on the video element:

video { box-shadow: 0 0 20px 0 rgba(0,0,0,.25); }

As you can see, the story doesn’t end when you display the camera output on the page. Touch it up and enhance it with CSS!

Chat tools: Tools such as appear.in use WebRTC and AngularJS to enable two or more people to chat

The power of data channels

Data channels have been designed to work in quite a similar way to WebSockets. You have the same event handlers, like onmessage, onerror, onopen and onclose, and the send() method works similarly too. However, there are quite a few things which set data channels apart.
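As a minimal sketch (the channel label and handler bodies are just placeholders), setting up a data channel on an existing RTCPeerConnection looks a lot like working with a WebSocket:

// peerConnection is an already established RTCPeerConnection
var channel = peerConnection.createDataChannel('chat');

channel.onopen = function () {
  channel.send('Hello from this peer!');
};
channel.onmessage = function (event) {
  console.log('Received: ' + event.data);
};
channel.onerror = function (error) {
  console.log('Data channel error: ', error);
};
channel.onclose = function () {
  console.log('Data channel closed');
};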

For example, data channels can give you more options for how to send data to the other side. As fast as WebSockets are, messages still have to pass through a WebSocket server in-between, acting as a relay to the other side. Data channels, on the other hand, can be peer-to-peer. This means there’s no server in-between, reducing latency and speeding things up further.

WebSockets use TCP as their method of transmission. TCP packet transmission is meant to be ordered and reliable. This means the packets of data will be received in the same order they’re sent, and every packet of data is guaranteed to be received on the other side.

Data channels, on the other hand, don’t use TCP. Instead, they go over UDP, with a few other protocols (most notably SCTP) layered on top of it. With data channels, you get the following options:

  • ordered — This specifies whether you want all the data packets to go in an ordered fashion or not. In some cases (such as text chat) you might want that, and in other cases (like games), lower latency might be more important
  • maxRetransmitTime — The maximum length of time data channels will spend trying to send a failed message
  • maxRetransmits — The maximum number of times the data channel will try to send a failed message

So, by default data channels work in an ordered and reliable way, just like WebSockets. However, you can choose for them to work in a ‘partially reliable’ way. This means if a packet is not delivered, data channels can try resending it, but only up until a certain limit.

This can be a limit on the number of tries:

peerConnection.createDataChannel('datachannel',
  {ordered: false, maxRetransmits: 5});
// Try 5 times before giving up on a failed message

Or a limit on time spent attempting to deliver:

peerConnection.createDataChannel('datachannel',
  {ordered: false, maxRetransmitTime: 2000});
// Keep trying for 2 seconds before giving up on a failed message

Knowing the various options available for data channels, and when to use each one, can really make a difference to your web application.

Filters: With WebRTC, you can add effects to camera output in real time

Promises

The way forward for getUserMedia is navigator.mediaDevices.getUserMedia, which is meant to work unprefixed (while navigator.getUserMedia, with its various browser prefixes, is being phased out).

This is still a new feature, and it might take some time for it to land in all browsers. One important thing to understand is that with this change, navigator.mediaDevices.getUserMedia uses promises instead of the callback-based approach we currently have. Right now, in Chromium, this is behind a flag (you need to enable 'Experimental Web Platform features'). However, in the future it should be exposed by default, and will be the recommended way to go.

Pretty much every modern API that has come to the web in the last year has been moving to a promise-based approach, so it makes sense for getUserMedia to do so too. However, navigator.getUserMedia will remain part of the spec for the time being. You will still be able to use it, and it will use the same callback-based approach as always, rather than the promise-based one that navigator.mediaDevices.getUserMedia has. Also, in Chrome and Opera at least, getUserMedia can no longer run on HTTP sites; they need to be served over HTTPS.
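As a minimal sketch of the promise-based form (assuming a video variable that refers to the <video> element on the page), the call looks like this:

navigator.mediaDevices.getUserMedia({audio: true, video: true})
  .then(function (stream) {
    // In browsers that support it, attach the stream directly;
    // otherwise fall back to createObjectURL as before
    video.srcObject = stream;
  })
  .catch(function (err) {
    console.log('getUserMedia error: ', err);
  });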

Embrace HTTPS

One issue to consider when building your WebRTC app is that users often fail to notice the permission dialog for camera and microphone access. As a result, you may have noticed that a lot of WebRTC-based sites have an arrow or other pointer to direct users to the browser’s permissions dialog.

Adding something to the web app’s UI that makes the user notice the browser permissions dialog is a necessary but tricky task. The approach is complicated by the fact that the permissions dialog can appear in different places depending not only on the browser, but also on the device. For example, one browser might show the dialog at the top of the page on desktop, but at the bottom of the page on mobile.

However, HTTPS significantly helps with this issue, because the browser remembers site permissions on HTTPS. This means that if your app uses HTTPS, the browser will ask the user for permission to use the camera and mic the first time; when the user returns to your site in the same browser, they won’t be shown the permissions dialog again. In any case, if you’re using the camera and microphone for anything other than a simple demo, it is important you use HTTPS. Not having HTTPS might make your users vulnerable to man-in-the-middle attacks. So for the benefit of your users and your app, use HTTPS!
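As a small defensive check, you can warn yourself early during development if the page isn’t running in a secure context (the isSecureContext flag is available in browsers that implement the Secure Contexts spec):

if (window.isSecureContext === false) {
  console.warn('Page is not served over HTTPS: getUserMedia and ' +
    'persistent permissions may not work as expected.');
}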

Permissions: An example of a site using an illustration to explicitly show the user where to click in the browser permissions dialog to enable mic and camera access

CPU and bandwidth considerations

If you are doing peer-to-peer video chat with WebRTC, you need to keep in mind that things get harder as you add more and more people to the call. The diagram below shows an example of a typical ‘mesh’ architecture, where every peer is connected to every other peer. If there are two people in such a setup, it’s fine. Even with three it’s OK. However, once you start including four or more people, it starts to push the CPU’s limits.

This is because, in WebRTC, every stream is encrypted. The more people are added to the call, the more streams each peer has to encrypt (and decrypt, from the other side) in real time. This is especially problematic if you have an old computer or a mobile device.

Mesh architecture: every peer is connected to every other peer

There are also bandwidth considerations to keep in mind. At the time of writing, the browsers that support WebRTC all support VP8 as the video codec and OPUS as the audio codec. The following list shows how much bandwidth they typically use (assuming video at 30fps):

  • For a 720p video, VP8 takes 1–2 Mbps
  • For a 360p video, VP8 takes around 0.5–1 Mbps
  • OPUS takes around 0.06–0.5 Mbps. However, OPUS can seamlessly switch to utilise a lower or higher bandwidth, depending on the prevailing network conditions

In general, it’s advisable to go for a smaller video by default, especially if your app involves more than two people in a video chat.
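One way to do this is to ask for a lower resolution up front via getUserMedia constraints. The sketch below uses the promise-based API discussed earlier; the 640×360 values are just an example, and the browser treats them as preferences rather than hard requirements:

navigator.mediaDevices.getUserMedia({
  audio: true,
  video: {width: {ideal: 640}, height: {ideal: 360}}
}).then(function (stream) {
  video.srcObject = stream;
}).catch(function (err) {
  console.log('getUserMedia error: ', err);
});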

The future of WebRTC

While the WebRTC specification was being worked on, a few people wanted to take it in a slightly different direction and formed a W3C Community Group to work towards this. Their proposal, called ORTC, relies on giving developers much lower-level control of WebRTC.

ORTC does away with the exchange of SDPs, which are currently required in signalling for a WebRTC call to work. There is also talk of making shim libraries so that once browsers support WebRTC 1.1, existing applications that want to use the 1.0 version of WebRTC can still easily do so.

It will be interesting to keep an eye on the future of WebRTC and what WebRTC 1.1 will bring to the table. Microsoft is supporting ORTC in the Edge browser and has signalled that it is working on supporting WebRTC 1.0, although that isn’t ready yet. Apple has also indicated that it’s working on supporting WebRTC in WebKit, which is great news, as it means all major browsers either support WebRTC or are working on supporting it.

Conclusion

WebRTC is an amazing technology that promises the next big development in communication applications on the web. As with every great emerging technology, there are a few important points to note and watch out for in order to make the best experience possible. Now go forth and play with WebRTC, and make the next great WebRTC app!

Shwetank Dixit is a web evangelist at Opera. His areas of expertise are HTML5, offline web technologies and WebRTC.

This article originally appeared in issue 264 of net magazine.
