Open feature request: CallSession API

A missing piece for WebRTC-based mobile PWAs. Written by @kartoffelmos

These days, we are revamping the mobile version of appear.in and promoting it to a first-class client. In that regard, I’m starting to see some bits and pieces that would be really nice to have browsers implement.

WebRTC has been available in modern browsers for quite a few years now, making it possible to have video conferences directly in the browser. No need to install an application, no need to install a plugin. Just point your browser to a web site and start talking with colleagues, friends and family. This even works in most browsers on Android, and quite soon on iOS. (Fun fact: we even made sure appear.in worked on Firefox OS back when that was still a thing.) So while WebRTC-based web services demonstrably work excellently in mobile browsers today, and have for several years, something is still missing for the experience to properly rival their native app counterparts.

When you are in a call on your phone, be it a “regular” phone call or through an app like Skype, you’ll get some sort of OS-level notification that a call is happening in the background. This is sorely lacking when deploying WebRTC-based web services, so I want to publicly request that we fix this.

What we have today

Below is an image of what you’ll see in Chrome when you have granted appear.in access to your camera and microphone and are in a video conversation.

The situation today

As you can see, it is pretty bare-bones. While better than nothing, it is quite lacking in functionality. The biggest omission, I think, is that there is no indication that the camera stream is being transmitted to others. In fact, this is the same kind of notification you’ll get if you have created a simple camera application.

It is easy to imagine a better way. So I did.

Inspiration: MediaSession

Inspired by the fairly new MediaSession API, where developers can customize the appearance of media playing in the background on the phone — letting users interact with the page even when not viewing the page — I have created a sketch of how this could be solved.

Image of what the MediaSession notification looks like.

The MediaSession notification is a persistent notification that appears if and only if a piece of media is being played on the phone. It grants users control over the media even when the browser is not in the foreground. It has a few hooks that developers can use to customize it, such as setting album art, hooking buttons up to back and forward scrubbing controls, etc. Here’s a great writeup of it in action.
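To make those hooks a bit more concrete, here is a minimal sketch of wiring the back and forward scrubbing buttons. The `seekbackward` and `seekforward` action names and `setActionHandler` come from the real MediaSession API. The `MediaSessionLike` and `PlayerLike` types, the `wireSeekButtons` helper and the 10 second skip are my own assumptions, there only so the sketch stands on its own without DOM typings.

```typescript
// Minimal structural stand-ins (assumptions), so this compiles without DOM typings.
type SeekAction = "seekbackward" | "seekforward";

interface MediaSessionLike {
  setActionHandler(action: SeekAction, handler: (() => void) | null): void;
}

interface PlayerLike {
  currentTime: number; // playback position in seconds
  duration: number;    // total length in seconds
}

// Hypothetical helper: hooks the notification's seek buttons up to a player,
// clamping the new position so it stays within [0, duration].
function wireSeekButtons(
  session: MediaSessionLike,
  player: PlayerLike,
  skipSeconds: number = 10
): void {
  session.setActionHandler("seekbackward", () => {
    player.currentTime = Math.max(player.currentTime - skipSeconds, 0);
  });
  session.setActionHandler("seekforward", () => {
    player.currentTime = Math.min(player.currentTime + skipSeconds, player.duration);
  });
}
```

In a browser that supports the API, this would be called as `wireSeekButtons(navigator.mediaSession, audioElement)`, and the album art would be set separately by assigning a `MediaMetadata` object to `navigator.mediaSession.metadata`.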

CallSession API

Artist’s impression of how the notification could look for appear.in

Introducing CallSession: a customisable persistent notification that is displayed if and only if an active WebRTC PeerConnection is detected.
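The "if and only if" rule could be sketched roughly like this. The six state names are the values `RTCPeerConnection.connectionState` can take in the WebRTC spec. Everything else is my assumption, including the `shouldShowCallNotification` name and the choice to count only `connected` as "active". A browser might well want to treat a briefly `disconnected` connection as still active.

```typescript
// The values RTCPeerConnection.connectionState can take, per the WebRTC spec.
type ConnectionState =
  | "new"
  | "connecting"
  | "connected"
  | "disconnected"
  | "failed"
  | "closed";

// Hypothetical rule: show the notification exactly when at least one of the
// page's peer connections is in the "connected" state (an assumption about
// what "active" means).
function shouldShowCallNotification(states: readonly ConnectionState[]): boolean {
  return states.some((state) => state === "connected");
}
```

With this rule, a page that has only opened the camera, without any connected peer connection, would not get the call notification.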

As for the content of the notification, I have not put too much thought into it: just what I (personally) consider the bare minimum for a video call application, plus some spots to put useful data. What you see here basically mirrors the MediaSession, but with a different set of buttons.

What I’m imagining is a header text (You are talking in…), a subtext (Duration 36:46), an image (the appear.in logo), and a set of developer-defined buttons.

It is also easy to imagine some of these bits being on by default when the browser detects a WebRTC connection. The “Hang up” button is a prime example, giving users an extra way to quickly sever an ongoing call.
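As a rough sketch of the data a page might hand to the browser, here is how the pieces above could fit together. To be clear, none of this exists in any browser: `CallSessionAction`, `CallSessionMetadata`, `formatDuration`, `buildCallNotification`, the `roomName` parameter and the `hangup` id are all hypothetical names I made up for illustration.

```typescript
// Hypothetical shapes: nothing below is a real browser API.
interface CallSessionAction {
  id: string;            // e.g. "mute" or "hangup" (made-up ids)
  label: string;         // text shown on the button
  onSelect: () => void;  // called when the user taps the button
}

interface CallSessionMetadata {
  title: string;         // header text
  subtitle: string;      // subtext, e.g. the call duration
  imageUrl: string;      // logo or avatar
  actions: CallSessionAction[];
}

// Formats elapsed seconds as minutes:seconds, e.g. 2206 becomes "36:46".
function formatDuration(totalSeconds: number): string {
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = Math.floor(totalSeconds % 60);
  return minutes + ":" + (seconds < 10 ? "0" + seconds : String(seconds));
}

// Builds the notification content. The "Hang up" button is always appended
// last, mimicking a browser-provided default the page cannot remove.
function buildCallNotification(
  roomName: string,
  elapsedSeconds: number,
  imageUrl: string,
  developerActions: CallSessionAction[],
  hangUp: () => void
): CallSessionMetadata {
  const hangUpAction: CallSessionAction = {
    id: "hangup",
    label: "Hang up",
    onSelect: hangUp,
  };
  return {
    title: "You are talking in " + roomName,
    subtitle: "Duration " + formatDuration(elapsedSeconds),
    imageUrl: imageUrl,
    actions: developerActions
      .filter((action) => action.id !== "hangup")
      .concat([hangUpAction]),
  };
}
```

Keeping the hang up button outside the developer's control is a design choice in this sketch, not a requirement: the point is that the browser already knows how to close a peer connection, so it can offer that button without the page's help.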

Not only for multi-party video conversations

For a 1-on-1 call, it could be useful to have some kind of avatar image of the person you are conversing with. Or maybe your service does not support video, in which case a button to disable the video stream doesn’t make sense.

It could be styled differently for 1-on-1 audio-only conversations

Security

As mentioned above, there is nothing telling the user that the media stream is being broadcast, only that it is being accessed by a web page. It is probably healthy paranoia to assume that a media stream is always being sent across the wire. But the browser can do users a little extra favor by telling them when this is known for certain. And maybe even put up an easy-to-access “Hang up” button for them to use.


And if, by some chance, a Googler or Mozillian is reading this, can we please have the MediaSession API available on desktop, pretty please? I just really want to control my podcasts from an OS level widget.

Edit 4 August 2017: Added a few words about the persistence of the notification, as I realized it wasn’t clear enough. Basically, just as the MediaSession notification is active if and only if a piece of media is being played, the CallSession notification would be active if and only if a WebRTC peer connection is currently active.