WebRTC Video Chat App Development

Suminda Niroshan
Published in The Startup
11 min read · Oct 12, 2019

You have probably used Skype, WhatsApp, Viber and plenty of other video chat applications. Did you know that you can build a similar video chat web application with just HTML and plain JavaScript, using a technology built into browsers? That technology is called WebRTC.

Prerequisites

You need an AWS account, some basic knowledge of working with AWS services, and familiarity with deploying via Serverless Framework. The following AWS services will be used throughout this guide.

  • Lambda Service
  • AWS EC2 Instance (Will be created and configured manually)
  • AWS Websockets
  • AWS DynamoDB
  • AWS S3

Make sure you have the following installed:

  • Node 6.0 or later
  • AWS CLI Configured for your account

We’ll be using Serverless Framework to configure and deploy the whole infrastructure needed. So we don’t need to go to AWS console and create resources manually (Except for the EC2 instance). Please refer to this blog post if you need help on setting Serverless Framework up.

Since AWS Websockets are also being used, I recommend reading this blog post as well to get a better understanding of AWS Websocket basic usage. This project is built on top of that project.

You will learn

How to combine WebRTC with the other required mechanisms and components to build a video chat feature in a web application.

WebRTC (Web Real Time Communication)

WebRTC (Web Real-Time Communication) is a free, open-source project that provides web browsers and mobile applications with real-time communication (RTC) via simple application programming interfaces (APIs). It allows audio and video communication to work inside web pages by allowing direct peer-to-peer communication, eliminating the need to install plugins or download native apps.

This communication is fast and has low latency because the connection is peer to peer, with no server in the middle. The technology was developed by Google, and currently almost all browsers support it. On iOS, only Safari supports it. Different browsers support different API methods, which can make developing these applications difficult. Hopefully these differences will fade over time. You can find the API methods and their browser support in this link.

Why Do We Need It

If we want two browsers running in two different parts of the world to connect and exchange real-time video chat streams, we need a TCP or UDP connection between them, as shown below.

WebRTC helps us achieve this connection.

How Does It Work

Usually our PCs, laptops or mobiles exist in a local area network behind a NAT, a router and a firewall. To make a direct browser-to-browser connection, we need to deal with all of these.

Having these layers between browsers that try to make a direct peer to peer connection makes it impossible to connect without the involvement of signalling mechanisms.

To deal with these, the following components are needed:

  • A Signalling Mechanism
  • STUN Server
  • TURN Server
  • Interactive Connectivity Establishment (ICE)
  • Signalling Process

Let’s have a look at these.

Signalling Mechanism

Before starting video and audio streaming between two browsers, we need to establish who is participating. To do this, both parties need to be able to exchange metadata before attempting to stream. Websockets are the most commonly used mechanism for this. In this demo, AWS Websockets will be used.

Once this socket connection is established, it will be used to send back and forth the data needed to create a peer to peer connection.
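To make the idea of exchanging metadata concrete, here is a minimal sketch of how signalling messages could be wrapped before they go over the socket. The envelope shape (`type` plus `payload`) and the names `SIGNAL_TYPES`, `encodeSignal` and `decodeSignal` are assumptions made for illustration; they are not taken from the project's source.

```javascript
// Message types exchanged over the signalling socket (names are assumptions).
const SIGNAL_TYPES = ["offer", "answer", "candidate"];

// Wrap a payload in a typed envelope and serialise it for the socket.
function encodeSignal(type, payload) {
  if (!SIGNAL_TYPES.includes(type)) {
    throw new Error(`Unknown signal type: ${type}`);
  }
  return JSON.stringify({ type, payload });
}

// Parse a raw socket message. Returns null for anything that is not a valid signal.
function decodeSignal(raw) {
  let message;
  try {
    message = JSON.parse(raw);
  } catch (err) {
    return null; // not JSON, ignore it
  }
  if (!message || !SIGNAL_TYPES.includes(message.type)) {
    return null;
  }
  return { type: message.type, payload: message.payload };
}
```

A typed envelope like this lets the receiving side decide in one place whether a message is an Offer, an Answer or an ICE candidate.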

ICE (Interactive Connectivity Establishment)

This is a coordination standard that uses Stun and Turn servers to establish direct communication between peers. Multiple ICE candidates are generated during the initial signalling process, and the peers pick the one that succeeds.
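Each ICE candidate is a line of text that includes a `typ` field describing where the address came from. As a small illustration, the sketch below reads that field from a few sample candidate lines. The helper name `candidateType` and the sample values (which use documentation-only IP ranges) are made up for this example.

```javascript
// Extract the candidate type from an ICE candidate line.
// "host"  = an address on the local machine
// "srflx" = the public address learned through a Stun server
// "relay" = an address allocated on a Turn server
function candidateType(candidateLine) {
  const match = /\btyp (host|srflx|prflx|relay)\b/.exec(candidateLine);
  return match ? match[1] : null;
}

// Illustrative candidate lines, not captured from a real session.
const samples = [
  "candidate:1 1 udp 2122260223 192.168.1.10 54321 typ host",
  "candidate:2 1 udp 1686052607 203.0.113.7 54321 typ srflx raddr 192.168.1.10 rport 54321",
  "candidate:3 1 udp 41885439 198.51.100.20 49152 typ relay raddr 203.0.113.7 rport 54321",
];

const types = samples.map(candidateType);
console.log(types); // [ 'host', 'srflx', 'relay' ]
```

Logging the type of each candidate during development is a quick way to see whether a call ended up direct or relayed.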

STUN (Session Traversal Utilities for NAT) Server

This is a server that enables a peer to find its public IP address. When a peer wants to connect to another host, it will provide this public IP address as a potential connectivity endpoint. If the existing NAT between the two peers allows for a direct connection between the two hosts, then a direct peer to peer connection will be made using an ICE (Interactive Connectivity Establishment) candidate.

In this case, a Turn server is not needed. If the Stun server fails to provide a working ICE candidate, WebRTC will then attempt the connection through the Turn server.

TURN (Traversal Using Relay around NAT) Server

As the name implies, this server relays media between the hosts that are connected to it using an ICE candidate. When you want to make a call between different networks or when NAT won’t allow direct access to a host, this is the way to go. The Turn server acts as a relay between peers so the peers don’t need to find ways through their NATs. The Turn server has a public-facing IP address which peers can connect to.

The good thing is, we don’t have to worry about when to use Stun or Turn, as the evaluation and connection establishment is done automatically for us by the WebRTC engine.

But we do need to create and host our own Stun/Turn server (we’ll do that in a later section).

Signalling Process

Once the web socket connection is in place, the following process establishes a peer to peer connection.

[1]. Caller and Receiver each connect to the Websocket server, which relays messages between them.
[2]. Caller creates an Offer.
[3]. Caller sends the Offer to the Receiver via the Websocket connection.
[4]. Caller receives ICE candidates from the Stun/Turn server.
[5]. Caller sends all ICE candidates to the Receiver via the Websocket connection.
[6]. Receiver accepts the Offer.
[7]. Receiver accepts the ICE candidates sent from the Caller.
[8]. Receiver creates an Answer.
[9]. Receiver sends the Answer to the Caller via the Websocket connection.
[10]. Caller accepts the Answer.
[11]. Receiver tests the ICE candidates sent from the Caller and picks the one that is successful in making a connection.
[12]. A peer to peer connection is established between the Caller and the Receiver through an ICE connection.
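The Offer/Answer part of these steps can be read as a small state machine. The sketch below models the values that `RTCPeerConnection.signalingState` goes through in a simple one-Offer, one-Answer exchange. It is deliberately simplified (provisional answers and rollback are left out), and the table `TRANSITIONS`, the event names and `nextSignalingState` are hypothetical names used only for this illustration.

```javascript
// Simplified transitions of signalingState for one Offer/Answer exchange.
const TRANSITIONS = {
  "stable": {
    "setLocal:offer": "have-local-offer",   // Caller saves its own Offer
    "setRemote:offer": "have-remote-offer", // Receiver accepts the Offer
  },
  "have-local-offer": {
    "setRemote:answer": "stable",           // Caller accepts the Answer
  },
  "have-remote-offer": {
    "setLocal:answer": "stable",            // Receiver saves its own Answer
  },
};

// Return the next state, or throw if the step is out of order.
function nextSignalingState(state, event) {
  const next = (TRANSITIONS[state] || {})[event];
  if (!next) {
    throw new Error(`Invalid ${event} in state ${state}`);
  }
  return next;
}

// Caller side: stable -> have-local-offer -> stable
let callerState = "stable";
callerState = nextSignalingState(callerState, "setLocal:offer");
callerState = nextSignalingState(callerState, "setRemote:answer");

// Receiver side: stable -> have-remote-offer -> stable
let receiverState = "stable";
receiverState = nextSignalingState(receiverState, "setRemote:offer");
receiverState = nextSignalingState(receiverState, "setLocal:answer");
```

Both sides end in the `stable` state, which is the point at which the negotiated session description is agreed on and the ICE checks decide the actual connection path.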

Demo Chat Application Build

Before we dive into the application build let’s break down the components and how we implement each of them. You can find the Github Repo from this link.

  1. Client Application
    This will be implemented just using a simple HTML page and pure JavaScript. No JavaScript libraries. Not even WebRTC helper libraries are used. I want to keep things simple, under control and not get lost in a framework hell.
  2. Signalling Mechanism (Web sockets)
    As mentioned before, AWS Websockets will be used. This is the same mechanism used to develop a simple chat application as shown in this blog post which I have explained in detail.
  3. Stun/Turn Server
    We will be using the most popular Stun/Turn server open source project called Coturn. This will be hosted in an Ubuntu AWS EC2 instance.
  4. Project Deployment
    Serverless Framework will be used to deploy the infrastructure (Websocket server) that is needed into AWS. If you don’t know how this works, take a look at this blog post.

1. Deploying The AWS Websocket

This project contains a Serverless Framework project configuration for an AWS Websocket which transmits received messages to all connected clients.
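To give a rough idea of what "transmits received messages to all connected clients" involves, here is a sketch of a broadcast helper. It is not the repo's Lambda code. The function name `broadcast` is hypothetical, and `postToConnection` is injected as a parameter: it is assumed to take `{ ConnectionId, Data }`, return a promise, and reject with `statusCode === 410` when a client has already disconnected.

```javascript
// Relay `data` to every connection id and report the ones that are gone.
// `postToConnection` is injected (an assumption of this sketch) so the code
// does not depend on a particular AWS SDK version.
async function broadcast(connectionIds, data, postToConnection) {
  const stale = [];
  await Promise.all(
    connectionIds.map(async (connectionId) => {
      try {
        await postToConnection({ ConnectionId: connectionId, Data: data });
      } catch (err) {
        if (err && err.statusCode === 410) {
          // The client has gone away; the caller may remove it from storage.
          stale.push(connectionId);
        } else {
          throw err;
        }
      }
    })
  );
  return stale;
}
```

Returning the stale ids lets the calling code clean up its list of connections, for example the records kept in DynamoDB.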

1. Get the source code from this Github Repo and go into the project directory from your CLI.

2. Install required npm packages first.
$ npm install

3. Install Serverless Framework globally.
$ npm install -g serverless

4. Deploy into AWS
$ serverless deploy --stage websocket

After the deployment, you should get the AWS Websocket endpoint as displayed below. Copy this URL as it is needed for our web client application configuration.

2. Hosting The Stun/Turn Server

We’ll be hosting this in an AWS EC2 Ubuntu virtual machine instance.

  1. Go ahead and create an AWS EC2 instance with the following image (I have used “ami-02df9ea15c1778c9c”). You can keep all the default settings.

2. Go to EC2 Security Group settings and open all in-bound UDP ports as displayed below. This is required for the Stun/Turn server to function properly.

3. SSH into the EC2 instance and execute the following commands.

a) Acquire super user environment.
sudo -i

b) Install required libraries.
apt-get update && apt-get install libssl-dev libevent-dev libhiredis-dev make -y

c) Install Coturn project.
apt install coturn

d) Exit from super user environment.
exit

e) Start the Coturn turn server.
turnserver -a -o -v -n -u USERNAME:PASSWORD -p 3478 -L EC2_Private_IP -r someRealm -X EC2_Public_IP/EC2_Private_IP --no-dtls --no-tls

Make sure to replace USERNAME and PASSWORD with the ones you want.
Replace EC2_Public_IP and EC2_Private_IP with your EC2 instance’s IP addresses as displayed below.

f) Verify Coturn is running by executing the following command as displayed below.
netstat -lnp | grep 3478

3. The Client Application Build

Before deploying the client WebRTCChat.html file, let’s have a look at the code.
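As a reference point, here is a minimal sketch of what the web socket setup could look like. It is not the listing from the repo: the `handlers` object, the `{ type, payload }` message shape and the `WebSocketImpl` parameter (which defaults to the browser's `WebSocket` and only exists so the function can be exercised outside a browser) are all assumptions.

```javascript
// Open the signalling socket and route incoming messages by their type.
function connectToWebSocket(url, handlers, WebSocketImpl = globalThis.WebSocket) {
  const socket = new WebSocketImpl(url);

  // Socket is ready: let the caller set up the RTCPeerConnection.
  socket.onopen = () => handlers.onOpen(socket);

  // Dispatch each signalling message to the matching handler.
  socket.onmessage = (event) => {
    const message = JSON.parse(event.data);
    switch (message.type) {
      case "offer":
        handlers.onOffer(message.payload);
        break;
      case "answer":
        handlers.onAnswer(message.payload);
        break;
      case "candidate":
        handlers.onCandidate(message.payload);
        break;
      default:
        console.warn("Ignoring unknown message type:", message.type);
    }
  };

  socket.onerror = (event) => console.error("WebSocket error", event);
  return socket;
}
```

In the browser this would typically be called from the `then` callback of `navigator.mediaDevices.getUserMedia({ video: true, audio: true })`, so the socket is only opened once camera and microphone access has been granted.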

Upon page load, the user will be asked to give permission to access the camera and microphone using the getUserMedia method. After access is given, the above code is executed to establish a web socket connection and set up event handlers to handle Offer/Answer and ICE Candidates.

  • onmessage handler will handle received socket messages according to their type (ICE Candidate, Answer or Offer).
  • onopen handler will set up a WebRTC Peer Connection once the web socket connection is successful with createRTCPeerConnection method displayed below.
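A peer connection setup along these lines might look like the sketch below. It is an approximation, not the repo's code: the options object, the `onDataChannel` callback, the `{ type, payload }` message shape and the `PeerConnection` parameter (defaulting to the browser's `RTCPeerConnection`) are assumptions made for this example.

```javascript
// Create the RTCPeerConnection and wire up its event handlers.
function createRTCPeerConnection({
  localStream,
  remoteVideo,
  sendSignal,
  onDataChannel,
  config = {},
  PeerConnection = globalThis.RTCPeerConnection,
}) {
  const pc = new PeerConnection(config);

  // Add every local audio/video track so it is streamed once the peers connect.
  localStream.getTracks().forEach((track) => pc.addTrack(track, localStream));

  // Remote media arrived: play it in the given video element.
  pc.ontrack = (event) => {
    remoteVideo.srcObject = event.streams[0];
  };

  // The other peer opened a data channel: hand it over for text messages.
  pc.ondatachannel = (event) => onDataChannel(event.channel);

  // A new local ICE candidate is available: forward it to the other peer.
  // A null candidate only signals that gathering has finished.
  pc.onicecandidate = (event) => {
    if (event.candidate) {
      sendSignal({ type: "candidate", payload: event.candidate });
    }
  };

  return pc;
}
```

The `config` object is where the Stun/Turn server details go, in the form of an `iceServers` list.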

Here the video and audio streams are added to the RTCPeerConnection so that it can be streamed when a peer to peer connection is successfully established.

  • ontrack handler will be fired whenever a video or audio stream is received from RTCPeerConnection and will set these to a video element for playback.
  • ondatachannel handler will be fired whenever the created data channel is received from the other peer (Caller). This channel will be used to send text messages between the two peers.
  • onicecandidate handler will be fired whenever the browser gathers a new ICE Candidate with the help of the Stun/Turn server. This candidate will be sent to the other peer as soon as it is available.

Please note that the above order of execution for creating the web socket and the WebRTC peer connection will be the same for both peers (Caller and Receiver).

Code Execution Flow for Establishing a Peer to Peer Connection

1. Both Caller and Receiver will open the file WebRTCChat.html in their browser. This will execute connectToWebSocket and createRTCPeerConnection in both peers.

2. One user (Caller) will click “Call” button and this will execute the following function.

An Offer will be created and saved in Caller’s session with setLocalDescription function. Afterwards, the Offer will be sent to the other user (Receiver) over the web socket.
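The call step described above could be sketched as follows. This is an illustration under assumptions rather than the repo's function: the name `createAndSendOffer`, the `sendSignal` callback, the `{ type, payload }` message shape and the channel label `"chat"` are all made up here. The data channel is created before the Offer so that it becomes part of the negotiation.

```javascript
// Caller side: create the data channel, create the Offer, save it locally
// and send it to the Receiver through the signalling socket.
async function createAndSendOffer(pc, sendSignal) {
  // Created before the Offer so the channel is included in the session description.
  const chatChannel = pc.createDataChannel("chat");

  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);

  sendSignal({ type: "offer", payload: offer });
  return chatChannel;
}
```

Here `pc` is the `RTCPeerConnection` created earlier, and `sendSignal` is expected to serialise the message and write it to the web socket.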

3. As soon as the Offer is saved with setLocalDescription, the onicecandidate event handler will be fired and the generated ICE Candidates will be sent to the Receiver via the web socket.

4. On the Receiver’s end, handleOffer function will be executed and the Offer sent from the Caller will be saved in Receiver’s session with setRemoteDescription function.

5. On the Receiver’s end, handleCandidate function will be executed and ICE Candidates received from the Caller will be added to the Receiver’s RTCPeerConnection.
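A sketch of these two Receiver-side handlers is shown below. The factory name `createRemoteSignalHandlers` is hypothetical. The small queue is a defensive addition of this sketch, not something taken from the project: because candidates are sent as soon as they are gathered, one may arrive before the Offer has been applied, and `addIceCandidate` expects a remote description to be in place.

```javascript
// Build handleOffer / handleCandidate for one RTCPeerConnection.
function createRemoteSignalHandlers(pc) {
  const pending = []; // candidates that arrived before the Offer was applied
  let remoteDescriptionSet = false;

  // Save the Caller's Offer, then apply any candidates that were waiting.
  async function handleOffer(offer) {
    await pc.setRemoteDescription(offer);
    remoteDescriptionSet = true;
    while (pending.length > 0) {
      await pc.addIceCandidate(pending.shift());
    }
  }

  // Add a Caller candidate now, or queue it until the Offer is applied.
  async function handleCandidate(candidate) {
    if (!remoteDescriptionSet) {
      pending.push(candidate);
      return;
    }
    await pc.addIceCandidate(candidate);
  }

  return { handleOffer, handleCandidate };
}
```

The returned functions can be plugged directly into whatever dispatches the incoming socket messages by type.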

6. Receiver will click “Answer” button and this will execute the following function.

An Answer will be created and saved in Receiver’s session with setLocalDescription. Afterwards, the created Answer will be sent to the other user (Caller) over the web socket.

7. On the Caller’s end, handleAnswer event handler will be fired and the Answer sent from the Receiver will be saved in Caller’s session with setRemoteDescription function.
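The Answer half of the exchange could be sketched like this. As with the other examples, it is an approximation: the function names follow the ones mentioned in the text where possible, while `createAndSendAnswer`, the `sendSignal` callback and the `{ type, payload }` message shape are assumptions.

```javascript
// Receiver side: create the Answer, save it locally and send it to the Caller.
// This assumes the Caller's Offer was already applied with setRemoteDescription.
async function createAndSendAnswer(pc, sendSignal) {
  const answer = await pc.createAnswer();
  await pc.setLocalDescription(answer);
  sendSignal({ type: "answer", payload: answer });
  return answer;
}

// Caller side: save the Receiver's Answer as the remote description.
async function handleAnswer(pc, answer) {
  await pc.setRemoteDescription(answer);
}
```

Once `handleAnswer` has run on the Caller, both peers hold a local and a remote description, and the ICE candidates exchanged earlier are used to find a working path.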

Now a peer to peer connection should be established between the two peers.

Please have a look at the full source code for the WebRTCChat.html file below.

Configuring and Hosting The Client Application in S3

Open WebRTCChat.html, replace the following values and save the changes.

WEBSOCKET_URL - Replace this with the web socket URL copied in section 1 (Deploying The AWS Websocket).

TURN_SERVER_IP_ADDRESS - Replace this with the public IP address of the EC2 instance from section 2 (Hosting The Stun/Turn Server).

TURN_SERVER_PORT - Replace this with the port value provided when starting Coturn in section 2, step 3(e) (port value 3478).

TURN_SERVER_USERNAME - Replace this with the username you provided when starting Coturn in section 2, step 3(e).

TURN_SERVER_PASSWORD - Replace this with the password you provided when starting Coturn in section 2, step 3(e).
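For orientation, these values typically end up in the configuration object passed to `RTCPeerConnection`. The sketch below shows one way to build it. The helper name `buildRtcConfiguration` and the sample values are hypothetical, and it assumes the same Coturn instance answers both Stun and Turn requests on the same port.

```javascript
// Build the RTCPeerConnection configuration from the Turn server settings.
function buildRtcConfiguration({ host, port, username, password }) {
  return {
    iceServers: [
      // Stun needs no credentials.
      { urls: `stun:${host}:${port}` },
      // Turn uses the username/password given when Coturn was started.
      { urls: `turn:${host}:${port}`, username, credential: password },
    ],
  };
}

// Sample values only (documentation IP range, made-up credentials).
const rtcConfiguration = buildRtcConfiguration({
  host: "203.0.113.10",   // TURN_SERVER_IP_ADDRESS
  port: 3478,             // TURN_SERVER_PORT
  username: "demoUser",   // TURN_SERVER_USERNAME
  password: "demoPass",   // TURN_SERVER_PASSWORD
});
```

In the browser, the result would be passed as `new RTCPeerConnection(rtcConfiguration)`.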

In order to make the application accessible from the internet, it needs to be hosted in an AWS S3 bucket.

Deploying to AWS S3 Bucket

  1. Go to AWS Console and create an S3 bucket.
  2. Make the bucket public as displayed below.

3. Upload the WebRTCChat.html file to the created S3 bucket and make it public as well.

Testing

  1. Now open the WebRTCChat.html file using the public URL (Displayed above) in both Caller and Receiver’s browsers.
  2. Caller clicks on “Call” button.
  3. Receiver clicks on “Answer” button.
  4. You should be able to have a video call with each other as displayed below and the text chat should be working as well.

Here you can see me making a video call (From Sri Lanka) with my friend in Japan via iOS Safari and it works perfectly.

Compatibility

As of now this application is tested in Windows Chrome Version 77.0.3865.90 (Official Build) (64-bit) and iOS 13.1.2 Safari browser.

On iOS, WebRTC is not supported in any browser other than Safari.

Go to this link to see supported platforms.

Summary

WebRTC support is available not only in browsers but natively on other platforms as well (via libraries). The concepts are the same, and now you know how you can use it to stream data (files, text, images, video and audio) in real time in any of your applications as needed.

Cheers!
