Why we need SIP for Messaging

Rob Yates
IBM watsonx Assistant
Sep 13, 2020

TL;DR Enterprises are struggling to get multiple vendors’ messaging tools (e.g. service desks, chat bots, in-app notifications) to work together. The messaging space needs an equivalent of the SIP standard to foster vendor interoperability.

Background

The SIP Refer specification was released nearly 20 years ago. It allows for many different use cases in the telephony space. Importantly for this article, it allows multiple systems from different software vendors to interoperate when handling a phone call to an enterprise. A typical call flow may look like the following:

  1. The customer calls a toll-free number and the call is answered by a traditional Interactive Voice Response (IVR) system. We all know what these feel like: press 1 to talk to sales, 2 to talk to support, etc.
  2. Once the customer has fought their way through the organization’s IVR they may end up waiting for a call center agent. At this point the IVR has handed the customer off to a different system. These systems are called Automatic Call Distributors (ACDs) and they queue the caller until an appropriate agent is available.
  3. Eventually the customer gets through to an agent and gets help with part of the query. The customer also needs to talk to an agent in a different group to finish what they need to do, so the first agent forwards the call to that group.
  4. The customer now gets another IVR with different options before finally ending up with a different agent attached to a different ACD.

We can certainly debate how good the customer experience is through all this, but the customer was able to stay on one phone call while being passed between multiple IVRs and ACDs, and each of these IVRs and ACDs can be provided by a different vendor. SIP Refer likely powered all these handoffs and is capable of passing the conversation context around (e.g. who is on the call) via a common session id.
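To make the handoff a little more concrete, here is a minimal sketch in Python of what a REFER request might look like when an IVR asks the other end of the call to contact an ACD queue. It only formats the text of the message; it is not a SIP stack and leaves out everything a real deployment needs (dialog state, authentication, the NOTIFY messages that report progress). The function name `build_refer`, the addresses, tags and identifiers are all hypothetical and chosen purely for illustration.

```python
def build_refer(target_uri, refer_to_uri, from_uri, call_id, cseq,
                from_tag, to_tag, via_host, branch):
    """Format a simplified SIP REFER request as text.

    target_uri   -- the party being asked to place the new call
    refer_to_uri -- where that party should call next (e.g. an ACD queue)
    All values are supplied by the caller; nothing here talks to a network.
    """
    lines = [
        f"REFER {target_uri} SIP/2.0",
        f"Via: SIP/2.0/UDP {via_host};branch={branch}",
        "Max-Forwards: 70",
        f"To: <{target_uri}>;tag={to_tag}",
        f"From: <{from_uri}>;tag={from_tag}",
        # The same Call-ID as the existing call ties the transfer to it.
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} REFER",
        # Refer-To names the system that should take over the call.
        f"Refer-To: <{refer_to_uri}>",
        f"Contact: <{from_uri}>",
        "Content-Length: 0",
    ]
    # SIP uses CRLF line endings and a blank line to end the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


# Hypothetical example: vendor A's IVR hands the caller to vendor B's ACD.
refer = build_refer(
    target_uri="sip:caller@carrier.example.com",
    refer_to_uri="sip:queue@acd.example.com",
    from_uri="sip:ivr@ivr.example.com",
    call_id="a84b4c76e66710",
    cseq=101,
    from_tag="ivr-1",
    to_tag="caller-1",
    via_host="ivr.example.com",
    branch="z9hG4bK74bf9",
)
print(refer)
```

The point of the sketch is that nothing in the message is specific to either vendor: the IVR only needs to know the address of the next system, which is what lets an enterprise mix and match.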

SIP and SIP Refer allow an enterprise to choose best-of-breed systems and, more likely, to deal with the heterogeneous environments that it faces due to acquisitions, mergers, department budget autonomy, etc.

Messaging

So now let’s contrast this with messaging, and when I say messaging I mean real-time(ish) chat, the kind that we’re all used to with SMS, Apple Business Chat, Facebook Messenger or WhatsApp. More enterprises are turning to this as a means of supporting their customers, and their customers generally prefer this way of interacting because they aren’t stuck waiting on a phone. They can send in a message and maybe get a response right away, or maybe they need to wait for one, but they can go about their busy daily lives while periodically checking in on the conversation.

The technologies handling these messaging conversations are maturing fast, and there are bot providers, service desk providers, and marketing and sales tool providers all making a play to be the system that handles the “call”. A plethora of chat widgets have sprung up on websites, enterprises are communicating with their customers over WhatsApp and Apple Business Chat, and they’re all having some success. The challenge is that they’re having success in pockets and, even though the technology is new, large enterprises have already reached a point of technology fragmentation. Different departments choose different vendors: one vendor is chosen for the service desk, a different one is chosen for sales, and a chatbot is bought from yet another provider. None of these providers’ solutions interoperate with each other in any kind of standard way. Indeed, many vendors are deliberately creating walled gardens with proprietary approaches to integration, leaving enterprises either rolling their own or having to pick single-vendor solutions.

Messaging Payloads

Making matters somewhat harder is the fact that the different messaging front ends that customers interact with (e.g. WhatsApp, iMessage) all have slightly different APIs. While there is similarity, there’s no interoperability between, for example, Messenger’s API and iMessage’s. The payloads are all slightly different. This problem is compounded further by the proprietary chat widgets that are springing up on websites and in mobile applications’ menus. It’s often the case that the APIs behind these widgets aren’t even documented, let alone made available to 3rd parties.

What Is Needed

So what is needed to help enterprises? I have been talking to vendors in this space for many months (as well as some of the channel providers) and I think a few things are clear.

  • This is not a problem to be solved by the channel providers, e.g. Facebook Messenger or Apple Business Chat. While Facebook has done a great job with its Messenger handover protocol, it’s only solving the problem for Messenger and it’s doing so in a very proprietary way. It doesn’t help vendors and enterprises support other channels and it does nothing for the proprietary web widgets.
  • Standardizing API payloads isn’t the immediate challenge. While it would be nice for some HTML-like standard to emerge that all the messaging providers used and that all the widgets adopted, it’s not the immediate need for interoperability.
  • The parallels with voice and telephony are everywhere, and the community should be looking to both SIP Refer and codec negotiation as inspiration.
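As a rough illustration of what borrowing from codec negotiation could mean for messaging, here is a small Python sketch. In telephony, one side offers the codecs it can use in order of preference and the other side answers with one it also supports. The sketch applies the same idea to message formats. To be clear, this is not a proposal: the `negotiate` function and the format names (`carousel`, `quick_replies`, `text`) are made up for the example and don’t correspond to any existing standard or vendor API.

```python
from typing import Optional, Sequence


def negotiate(offered: Sequence[str], supported: Sequence[str]) -> Optional[str]:
    """Pick the first offered format that the other side also supports.

    `offered` is in the sender's order of preference, richest first.
    Returns None when the two sides have nothing in common.
    """
    supported_set = set(supported)
    for fmt in offered:
        if fmt in supported_set:
            return fmt
    return None


# Hypothetical example: a bot would prefer to send a carousel, but the
# web widget it is talking to only renders quick replies and plain text.
bot_offer = ["carousel", "quick_replies", "text"]
widget_supports = ["text", "quick_replies"]

chosen = negotiate(bot_offer, widget_supports)
print(chosen)  # quick_replies
```

The attraction of this style is that the two systems never need to agree on a single payload format up front. They only need a shared way to say what they can handle, and a plain-text fallback keeps the conversation going when nothing richer matches.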

Open source or a standard has a role to play in solving this problem. Over the next few posts I will outline more detailed technical proposals. I am eager to hear from members of the community who share the belief that standards and/or open source are required in this space to give enterprises the flexibility they need to build a modern messaging solution made up of best-of-breed components from multiple vendors.

Rob Yates
IBM watsonx Assistant

Technical lead on the Watson Assistant Engineering team