Basics of WebRTC

Balaji Pastapure
Published in c-buddy, 7 min read, Jan 3, 2021
source : cuelogic.com

WebRTC (Web Real-Time Communication) is a standard that enables developers to create live audio, video, and text chat applications, whether on the web or on mobile devices. In this article I’ll be explaining its implementation on the web, so that you can start creating your own live chat web application or site.

The standard is supported by a wide variety of browsers, but they differ in some details. The implementation explained in this article should get you up and running on Google Chrome as well as Mozilla Firefox; I’m still working on its usage in other web browsers such as Apple Safari and Microsoft Edge.

The difference between WebRTC and SIP Trunking:

  • WebRTC : a set of standards and browser APIs that allows real-time video and audio communication between web browsers and mobile applications.
  • SIP Trunking : a means of operating phone systems over the internet instead of over a traditional phone line, using SIP (Session Initiation Protocol) to establish and manage connections between users.
  • The two technologies solve different problems, so they complement each other: used together, they support a wide range of applications for the next generation of communications.
  • WebRTC is not generally expected to replace legacy VoIP infrastructure, but WebRTC applications offer easy peer-to-peer voice and video communication in situations where a standard phone call isn’t optimal. To give you a sense of what WebRTC is capable of, and how it can be used, here are some applications that leverage WebRTC technology:

Who uses WebRTC ?

  • Google Duo
  • Discord
  • WhatsApp
  • GoToMeeting
  • Houseparty
  • Snapchat
  • Amazon Chime
  • Slack

Some of these applications leveraged the flexibility of WebRTC to integrate real-time communications into existing infrastructure (like SIP trunks and other VoIP setups), whereas others were built from the ground up on browser-based real-time communications. Either way, all of these examples illustrate the progress that has already been made in real-time communications, and hint that the WebRTC revolution is just beginning.

How it works :

WebRTC can be used in many different ways these days, but the most common, basic path includes the following stages:

  • Signalling
  • Connecting
  • Securing
  • Communicating

WebRTC signalling refers to the process of setting up, controlling, and terminating a communication session. In order for two endpoints to begin talking to one another, three types of information must be exchanged:

  • Session control information determines when to initialize, close, and modify communications sessions. Session control messages are also used in error reporting.
  • Network Data reveals where endpoints are located on the Internet (IP address and port) so that callers can find callees.
  • Media Data is required to determine the codecs and media types that the caller and callee have in common. If endpoints attempting to start a communications session have differing resolution and codec configurations, then a successful conversation is unlikely. Signalling exchanges this media configuration between peers as an offer and an answer in the Session Description Protocol (SDP) format.
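
To make the media part of that exchange concrete, here is a minimal sketch of how the codecs shared by an offer and an answer could be worked out from their SDP text. The function names and the two SDP fragments are my own illustration (hand-written and heavily trimmed, not captured from a browser); a real browser does this negotiation for you.

```typescript
// Collect codec names from the "a=rtpmap:<payload type> <codec>/<clock rate>" lines of an SDP blob.
function codecsIn(sdp: string): string[] {
  const codecs: string[] = [];
  for (const line of sdp.split(/\r?\n/)) {
    const match = /^a=rtpmap:\d+ ([^/]+)\//.exec(line.trim());
    if (match && codecs.indexOf(match[1]) === -1) {
      codecs.push(match[1]);
    }
  }
  return codecs;
}

// Codecs listed by both sides, kept in the offerer's order (names compared case-insensitively).
function commonCodecs(offerSdp: string, answerSdp: string): string[] {
  const answered = codecsIn(answerSdp).map((c) => c.toLowerCase());
  return codecsIn(offerSdp).filter((c) => answered.indexOf(c.toLowerCase()) !== -1);
}

// Illustrative fragments only: real SDP carries many more lines.
const offerSdp = [
  "v=0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111 0",
  "a=rtpmap:111 opus/48000/2",
  "a=rtpmap:0 PCMU/8000",
  "m=video 9 UDP/TLS/RTP/SAVPF 96 98",
  "a=rtpmap:96 VP8/90000",
  "a=rtpmap:98 VP9/90000",
].join("\r\n");

const answerSdp = [
  "v=0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
  "a=rtpmap:111 opus/48000/2",
  "m=video 9 UDP/TLS/RTP/SAVPF 96",
  "a=rtpmap:96 VP8/90000",
].join("\r\n");

// The answerer dropped PCMU and VP9, so only opus and VP8 remain in common.
console.log(commonCodecs(offerSdp, answerSdp));
```

If the resulting list were empty for audio or video, the two endpoints would have no way to understand each other's media, which is the "successful conversation is unlikely" case described above.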

Why Are Signalling Servers for WebRTC Needed?

When WebRTC applications are said to operate entirely “in-browser,” the perspective is taken from the end user’s point of view. Yes, WebRTC app users require nothing beyond their browsers; but underneath the hood, developers must craft server-side solutions to get peers (i.e. browsers) to communicate with each other. This is how the infrastructure of a communication platform, such as the OnSIP Communications Platform as a Service (CPaaS), becomes useful.

https://www.onsip.com/

In a nutshell, WebRTC signaling allows for users to exchange metadata to coordinate communication. RTCPeerConnection is the API WebRTC uses to establish peer connections and transfer audio and video media. In order for the connection to work, RTCPeerConnection must acquire local media conditions (resolution and codec capabilities, for instance) for metadata, and gather possible network addresses for the application’s host. The signalling mechanism for passing this information back and forth is not built into the WebRTC API, so the application has to provide it.
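
Since WebRTC does not prescribe a signalling transport, every application ends up defining its own small message format. The sketch below is one possible shape, assuming three message kinds (offer, answer, ICE candidate) and a hypothetical in-memory router standing in for a real signalling server. In a real application the router would typically sit behind something like a WebSocket connection, and the SDP strings would come from RTCPeerConnection's createOffer() and createAnswer() rather than placeholders.

```typescript
// The kinds of payload two peers typically pass through the signalling channel.
type SignalMessage =
  | { kind: "offer"; from: string; to: string; sdp: string }
  | { kind: "answer"; from: string; to: string; sdp: string }
  | { kind: "candidate"; from: string; to: string; candidate: string };

type Handler = (message: SignalMessage) => void;

// Hypothetical stand-in for a signalling server: it only routes messages by peer id.
class InMemorySignalling {
  private handlers: { [peerId: string]: Handler } = {};

  join(peerId: string, handler: Handler): void {
    this.handlers[peerId] = handler;
  }

  // Returns false when the addressee has not joined.
  send(message: SignalMessage): boolean {
    const handler = this.handlers[message.to];
    if (!handler) return false;
    handler(message);
    return true;
  }
}

const signalling = new InMemorySignalling();
const received: string[] = [];

signalling.join("alice", (m) => {
  received.push(`alice got ${m.kind} from ${m.from}`);
});
signalling.join("bob", (m) => {
  received.push(`bob got ${m.kind} from ${m.from}`);
  if (m.kind === "offer") {
    // Placeholder string: a real callee would send the SDP produced by createAnswer().
    signalling.send({ kind: "answer", from: "bob", to: "alice", sdp: "<answer SDP placeholder>" });
  }
});

signalling.send({ kind: "offer", from: "alice", to: "bob", sdp: "<offer SDP placeholder>" });
signalling.send({ kind: "candidate", from: "alice", to: "bob", candidate: "<ICE candidate placeholder>" });
console.log(received);
```

The point of the sketch is that the signalling server never touches audio or video; it only relays small text messages until the peers can reach each other directly.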

What are STUN and TURN?

WebRTC is designed to work peer-to-peer, so users can connect by the most direct route possible. However, WebRTC is built to cope with real-world networking: client applications need to traverse NAT gateways and firewalls, and peer to peer networking needs fallbacks in case direct connection fails. As part of this process, the WebRTC APIs use STUN servers to get the IP address of your computer, and TURN servers to function as relay servers in case peer-to-peer communication fails. (WebRTC in the real world explains in more detail.)
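
In practice, STUN and TURN servers are handed to the browser as a list of ICE servers when the peer connection is created. Here is a minimal sketch of building that configuration. The IceServer interface mirrors the shape of the browser's RTCIceServer dictionary but is declared locally so the example runs outside a browser, and the hosts and credentials are made up.

```typescript
// Same field names as the browser's RTCIceServer dictionary, declared locally for this sketch.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface PeerConfig {
  iceServers: IceServer[];
}

// One STUN server for address discovery, plus an optional TURN relay as the fallback.
function buildPeerConfig(
  stunUrl: string,
  turn?: { url: string; username: string; credential: string }
): PeerConfig {
  const iceServers: IceServer[] = [{ urls: stunUrl }];
  if (turn) {
    iceServers.push({ urls: turn.url, username: turn.username, credential: turn.credential });
  }
  return { iceServers };
}

// Hypothetical hosts and credentials: replace them with your own servers.
const peerConfig = buildPeerConfig("stun:stun.example.org:3478", {
  url: "turn:turn.example.org:3478",
  username: "demo-user",
  credential: "demo-secret",
});

console.log(JSON.stringify(peerConfig, null, 2));
// In a browser this object would be passed along as: new RTCPeerConnection(peerConfig)
```

Leaving out the TURN entry still works on friendly networks, but calls will fail whenever a direct path cannot be found.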

WebRTC Protocols :

  • SDP :

The Session Description Protocol (SDP) is used to describe multimedia sessions in a format understood by the participants over a network. Based on this description, a party decides whether, when, and how to join a session.

  • ICE :

Interactive Connectivity Establishment (ICE) is used when two nodes across the Internet must communicate as directly as possible, but the presence of NATs and firewalls makes it difficult for them to reach each other. It is a networking technique that makes use of STUN (Session Traversal Utilities for NAT) and TURN (Traversal Using Relays around NAT) to establish a connection between two nodes that is as direct as possible.
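
One way to see how ICE favours the most direct route is the candidate priority formula from the ICE specification (RFC 8445): priority = 2^24 × type preference + 2^8 × local preference + (256 − component ID). The sketch below uses the type preference values the RFC recommends; the function name and the sample inputs are mine.

```typescript
type CandidateType = "host" | "prflx" | "srflx" | "relay";

// Type preferences recommended by RFC 8445: direct (host) highest, relayed lowest.
const TYPE_PREFERENCE: { [K in CandidateType]: number } = {
  host: 126,
  prflx: 110,
  srflx: 100,
  relay: 0,
};

// priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)
function candidatePriority(type: CandidateType, localPreference: number, componentId: number): number {
  if (localPreference < 0 || localPreference > 65535) throw new Error("local preference must be 0..65535");
  if (componentId < 1 || componentId > 256) throw new Error("component ID must be 1..256");
  return TYPE_PREFERENCE[type] * Math.pow(2, 24) + localPreference * Math.pow(2, 8) + (256 - componentId);
}

// Rank three candidate types for the same component, assuming a single network interface
// (local preference 65535).
const candidateTypes: CandidateType[] = ["relay", "srflx", "host"];
const ranked = candidateTypes
  .map((type) => ({ type, priority: candidatePriority(type, 65535, 1) }))
  .sort((a, b) => b.priority - a.priority);

console.log(ranked.map((c) => `${c.type}: ${c.priority}`));
// → host: 2130706431, srflx: 1694498815, relay: 16777215
```

A host candidate (a local address) always outranks a server-reflexive one (an address learned through STUN), which in turn outranks a relayed one (an address on a TURN server), so the relay is only picked when nothing better works.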

NAT : Network Address Translation

source : sonus.net

To understand how ICE works, we need to know how the STUN protocol and its extension, the TURN protocol, work.

  • STUN (Session Traversal Utilities for NAT):
    An endpoint behind a NAT only has a local address, so it is not reachable by endpoints outside the local network and a connection cannot be established. When this occurs, the endpoint can request its public IP address from a STUN server. This publicly reachable address can then be used by other endpoints to establish a connection. However, this approach fails when endpoints are behind a symmetric NAT, which happens often enough in practice that it cannot be ignored. This is where a TURN server comes into the picture.
  • TURN (Traversal Using Relays around NAT):
    A TURN server, as the name suggests, is used as a relay or intermediate server to exchange data. An endpoint behind a symmetric NAT can contact a TURN server on the public internet to establish a connection; the endpoint is then called a TURN client. The disadvantage of using a TURN server is that it is needed for the whole duration of the session, unlike a STUN server, which is no longer needed once the connection is established. ICE therefore prefers the addresses discovered through STUN and falls back to TURN only when a direct connection cannot be made.
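
To show what "requesting the public IP address from a STUN server" looks like on the wire, here is a minimal sketch based on the STUN message layout from RFC 5389: a 20-byte Binding Request header, and the decoding of the XOR-MAPPED-ADDRESS value a server sends back (IPv4 only). The function names are mine, the sample bytes are hand-built for a documentation address (203.0.113.7:54321), and nothing is actually sent over the network.

```typescript
const MAGIC_COOKIE = 0x2112a442; // fixed value carried by every STUN message

// Build the 20-byte header of a STUN Binding Request with no attributes.
function buildBindingRequest(transactionId: Uint8Array): Uint8Array {
  if (transactionId.length !== 12) throw new Error("transaction ID must be 12 bytes");
  const packet = new Uint8Array(20);
  const view = new DataView(packet.buffer);
  view.setUint16(0, 0x0001); // message type: Binding Request
  view.setUint16(2, 0); // message length: no attributes follow the header
  view.setUint32(4, MAGIC_COOKIE);
  packet.set(transactionId, 8);
  return packet;
}

// Decode the 8-byte value of an IPv4 XOR-MAPPED-ADDRESS attribute.
// The port is XORed with the top 16 bits of the magic cookie, the address with all 32 bits.
function decodeXorMappedAddress(value: Uint8Array): { ip: string; port: number } {
  if (value.length !== 8 || value[1] !== 0x01) throw new Error("expected an 8-byte IPv4 attribute value");
  const view = new DataView(value.buffer, value.byteOffset, value.byteLength);
  const port = view.getUint16(2) ^ (MAGIC_COOKIE >>> 16);
  const address = (view.getUint32(4) ^ MAGIC_COOKIE) >>> 0;
  const ip = [address >>> 24, (address >>> 16) & 0xff, (address >>> 8) & 0xff, address & 0xff].join(".");
  return { ip, port };
}

// A fixed transaction ID keeps the sketch deterministic; real clients use random bytes.
const request = buildBindingRequest(new Uint8Array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]));
console.log(request.length); // → 20

// Hand-built attribute value encoding 203.0.113.7:54321.
const mapped = decodeXorMappedAddress(new Uint8Array([0x00, 0x01, 0xf5, 0x23, 0xea, 0x12, 0xd5, 0x45]));
console.log(`${mapped.ip}:${mapped.port}`); // → 203.0.113.7:54321
```

The address that comes back is what the NAT shows to the outside world, and that is what the endpoint advertises to its peer through signalling.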

RTP/RTCP :

Real-time Transport Protocol / Real-time Transport Control Protocol

RTP is designed to handle real-time traffic (like audio and video) on the Internet, and RTCP is its companion protocol, carrying control information such as reception quality reports. RTP is typically carried over UDP and relies on it for delivery mechanisms such as port numbers and multicasting, which RTP does not provide itself. RTP supports different payload formats, such as MPEG and MJPEG. Real-time media is very sensitive to packet delays and less sensitive to packet loss.
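
For a feel of what travels inside each media packet, here is a minimal sketch that reads the 12-byte fixed RTP header defined in RFC 3550 and counts lost packets from gaps in the sequence numbers. The function names and the sample packet are mine, and the loss counter assumes packets arrive in order, which real receivers cannot rely on.

```typescript
interface RtpHeader {
  version: number;
  padding: boolean;
  extension: boolean;
  csrcCount: number;
  marker: boolean;
  payloadType: number;
  sequenceNumber: number;
  timestamp: number;
  ssrc: number;
}

// Parse the fixed part of an RTP header (the first 12 bytes, big-endian fields).
function parseRtpHeader(packet: Uint8Array): RtpHeader {
  if (packet.length < 12) throw new Error("an RTP packet carries at least a 12-byte header");
  const view = new DataView(packet.buffer, packet.byteOffset, packet.byteLength);
  const first = view.getUint8(0);
  const second = view.getUint8(1);
  return {
    version: first >> 6,
    padding: (first & 0x20) !== 0,
    extension: (first & 0x10) !== 0,
    csrcCount: first & 0x0f,
    marker: (second & 0x80) !== 0,
    payloadType: second & 0x7f,
    sequenceNumber: view.getUint16(2),
    timestamp: view.getUint32(4),
    ssrc: view.getUint32(8),
  };
}

// Count missing packets in a run of sequence numbers, allowing for 16-bit wrap-around.
// Assumes in-order arrival; reordered packets would be miscounted.
function countMissing(sequenceNumbers: number[]): number {
  let missing = 0;
  for (let i = 1; i < sequenceNumbers.length; i++) {
    const step = (sequenceNumbers[i] - sequenceNumbers[i - 1] + 65536) % 65536;
    if (step > 1) missing += step - 1;
  }
  return missing;
}

// Hand-built header: version 2, payload type 111, sequence 42, timestamp 960, SSRC 0x12345678.
const header = parseRtpHeader(
  new Uint8Array([0x80, 0x6f, 0x00, 0x2a, 0x00, 0x00, 0x03, 0xc0, 0x12, 0x34, 0x56, 0x78])
);
console.log(header.version, header.payloadType, header.sequenceNumber, header.timestamp); // → 2 111 42 960
console.log(countMissing([65534, 65535, 1, 2])); // → 1 (sequence number 0 never arrived)
```

The receiver uses the sequence number to notice loss and the timestamp to schedule playback, and it simply plays on past a missing packet rather than waiting for a retransmission.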

SCTP :

Stands for Stream Control Transmission Protocol.

It is a connection-oriented protocol in computer networks which provides a full-duplex association, i.e. it can transmit multiple streams of data at the same time, in both directions, between two endpoints that have established a connection. It is sometimes referred to as next-generation TCP or TCPng. SCTP makes it easier to support telephone conversations over the Internet: a telephone conversation requires transmitting voice along with other data at the same time at both ends, and SCTP makes it easier to establish a reliable connection for that.

SCTP is also intended to make it easier to establish connections over wireless networks and to manage the transmission of multimedia data. It is a standard protocol developed by the Internet Engineering Task Force (IETF), originally published as RFC 2960 and later revised in RFC 4960.

DTLS/SRTP :

source : wcdn.3cx.com

Datagram Transport Layer Security (DTLS) is a communications protocol designed to protect data privacy and prevent eavesdropping and tampering. It is based on the Transport Layer Security (TLS) protocol, which provides security to computer-based communications networks. The main difference between DTLS and TLS is that DTLS runs over UDP while TLS runs over TCP. Between them, the two protocols are used across web browsing, mail, instant messaging and VoIP. DTLS is one of the security protocols used for WebRTC technology, along with SRTP (Secure Real-time Transport Protocol).
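
Because STUN, DTLS and SRTP packets often share the same UDP port in a WebRTC session, the receiver has to tell them apart. A minimal sketch of that step, assuming the first-byte ranges described in RFC 7983 (0 to 3 for STUN, 20 to 63 for DTLS, 128 to 191 for RTP/RTCP); the function name is mine, and real stacks handle a few more ranges than shown here.

```typescript
type PacketKind = "stun" | "dtls" | "rtp-or-rtcp" | "unknown";

// Classify an incoming UDP payload by its first byte, using the ranges from RFC 7983.
function classifyPacket(packet: Uint8Array): PacketKind {
  if (packet.length === 0) return "unknown";
  const first = packet[0];
  if (first <= 3) return "stun";
  if (first >= 20 && first <= 63) return "dtls";
  if (first >= 128 && first <= 191) return "rtp-or-rtcp";
  return "unknown";
}

console.log(classifyPacket(new Uint8Array([0x00, 0x01]))); // → stun (Binding Request starts with 0x00)
console.log(classifyPacket(new Uint8Array([0x16, 0xfe]))); // → dtls (handshake record type 22)
console.log(classifyPacket(new Uint8Array([0x80, 0x6f]))); // → rtp-or-rtcp (RTP version 2)
```

Once a packet is classified, DTLS traffic drives the handshake that produces the keys, and the RTP/RTCP traffic is decrypted as SRTP with those keys.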

Mistakes I made, and hope you don’t make !

Be sure to review these to see if there’s anything you’re doin’ wrong:

  1. Using an outdated signalling server (from GitHub)
  2. Mis-configuring NAT traversal
  3. Testing locally
  4. Not using adapter.js
  5. Forgetting to take care of security
  6. Assuming you can outsource it all
  7. Diving into the code before grokking WebRTC

Summary :

I hope I’ve given you a clear, basic picture of WebRTC and how to start using it. I’ll try to provide a functional prototype and more coding examples in an upcoming post with a video calling project. Thanks for reading this far, and if you have any questions, don’t hesitate to ask.
