Understanding VoIP: A beginner’s guide to internet telephony using an analogy

Alicia Cawley
Published in Weave Lab
Jan 13, 2021

While the concept of VoIP, or Voice over Internet Protocol, is straightforward (i.e., “I can make a call using the internet!”), trying to dig deeper into how it works can lead to a world full of jargon and acronyms. It can be hard to follow and truly understand. This article tries to break it down in layman’s terms and explore why using VoIP can be so powerful for businesses.

How does it actually work?

In the articles I’ve read about VoIP, I’ve yet to find a great analogy to explain all of the ins and outs. Perhaps a great analogy doesn’t exist, but since analogies can be so helpful in explaining difficult topics, I’ve decided to try to create one. It requires a bit of imagination.

The analogy

Photo by Nikola Markelov on Unsplash

You live in a house inside a city. Instead of using phones or other means to communicate with other houses and cities, cars are deployed to drive messages. Cities supply cars to relay the messages. Some cities have cars that are old and slow, while others have speedy sports cars. Instead of one car running the messages back and forth, the city sends several cars that each contain part of the message. (Yes, I know that sounds inefficient. Stick with me.)

Historically, the cars were required to drive a specific path to get to each city, but now each car is equipped with a mapping function that looks at current traffic conditions and directs the car to the fastest route, even if sometimes that means driving more miles.

When someone, let’s call him Sam, wants to communicate with someone in another city, let’s call her Sally, he deploys a car with a request to communicate. If Sally also wants to communicate, she sends a car back.

Sometimes Sally doesn’t want to talk to anyone. She closes the gates to her house so Sam’s car can’t even deliver the original message.

But for this analogy, let’s say that Sally is ready to talk to Sam. She answered the door when the first message was sent. Once communication is established, Sam and Sally need to decide how they want to send their messages to each other. They have three options. First, instruct the cars to drive to the house and leave the message package on the doorstep. Second, have the cars ring the doorbell and make sure the message is received. Third, have the cars ring the doorbell and ask for a security phrase to make sure the message is going to the right person. The first option is the fastest. The second provides assurance that the message was received. The third provides assurance that the message was delivered accurately and securely.

How does that explain VoIP?

In the analogy, a city represents a company and each house is a device. The device could be a desk phone or a softphone, which is a software app for making phone calls.

Establishing the call
The process to start, continue, and eventually end a call is governed by the Session Initiation Protocol, or SIP. It provides the universal standard and rules that calls need to follow in order for them to work. This includes the order of the communication and the format of the messages. It’s similar to other internet protocols, like Hypertext Transfer Protocol (HTTP).

In the analogy all of the cars are part of this protocol, but the first one is different from the ones that maintain the call. It’s trying to initiate the call, so it’s the one that will make a device ring. It also contains information about the sender, including their location, availability, and capabilities.
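To make the "first car" a little more concrete, here is a minimal sketch of what a SIP request that starts a call can look like as text. The header names follow the SIP standard, but the addresses, the `branch` and `tag` values, and the helper names `build_invite` and `parse_start_line` are made up for illustration. A real softphone would also attach a body describing the audio it can handle.

```python
def build_invite(caller: str, callee: str, call_id: str) -> str:
    """Return a minimal SIP INVITE as text (headers only, no body).

    All addresses and identifier values here are illustrative placeholders.
    """
    lines = [
        f"INVITE sip:{callee} SIP/2.0",  # request line: method, target, version
        "Via: SIP/2.0/UDP client.example.com;branch=z9hG4bK776asdhds",
        "Max-Forwards: 70",
        f"To: <sip:{callee}>",
        f"From: <sip:{caller}>;tag=1928301774",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}>",
        "Content-Length: 0",
    ]
    # Like HTTP, headers are separated by CRLF and end with a blank line.
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_start_line(message: str) -> tuple[str, str, str]:
    """Split the first line of a SIP request into method, target, version."""
    method, target, version = message.split("\r\n", 1)[0].split(" ")
    return method, target, version


invite = build_invite("sam@city-a.example", "sally@city-b.example", "a84b4c76e66710")
print(parse_start_line(invite))
```

The resemblance to an HTTP request is deliberate: a start line, then `Name: value` headers, then a blank line.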

That user data is often referred to as a SIP Profile. This is different from the protocol. A SIP Profile references a device and certain information that is attached to that device, like whether or not a device has its “do not disturb” setting turned on. If it does, it’s like closing the gates at the house. The phone won’t ring even if someone is trying to call.
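As a rough sketch of the "closed gates" idea, the snippet below models a device profile with a do-not-disturb flag and decides how to answer an incoming call request. `DeviceProfile` and `respond_to_invite` are hypothetical names. The two status lines are real SIP response codes, but which one a given phone or provider returns for do-not-disturb varies, so treat that choice as an assumption.

```python
from dataclasses import dataclass


@dataclass
class DeviceProfile:
    """Hypothetical stand-in for the settings attached to one device."""
    address: str
    do_not_disturb: bool = False


def respond_to_invite(profile: DeviceProfile) -> str:
    """Pick a SIP response for an incoming call request."""
    if profile.do_not_disturb:
        # Gates closed: the phone never rings.
        return "480 Temporarily Unavailable"
    # Gates open: tell the caller the phone is ringing.
    return "180 Ringing"


print(respond_to_invite(DeviceProfile("sally@city-b.example")))
print(respond_to_invite(DeviceProfile("sally@city-b.example", do_not_disturb=True)))
```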

The first message that initiates the call also includes how the following messages or cars should be delivered. This is the transport used to carry the signaling. There are three options: UDP, TCP, and TLS.

UDP, or User Datagram Protocol, is like the car delivering the package to the door without waiting for a response. It makes the delivery faster, but sometimes leads to some of the packets getting lost or being delivered in the wrong order. To the person on the phone, this might sound like the audio is a little choppy. UDP is sometimes compared to broadcast radio: the sender keeps transmitting data without expecting any kind of response.
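Here is a small sketch of that "leave it on the doorstep" behavior using Python's standard `socket` module on the local machine. The sender never waits for a reply. The two-byte sequence number in front of each chunk is my own addition so the receiver can tell the order, and `send_datagrams` and `demo` are illustrative names.

```python
import socket


def send_datagrams(chunks, address):
    """Fire-and-forget: send each chunk without waiting for any reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        for seq, chunk in enumerate(chunks):
            # Prefix a 2-byte sequence number (an assumption of this sketch).
            sender.sendto(seq.to_bytes(2, "big") + chunk, address)


def demo():
    """Send three chunks to a local receiver and collect whatever arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(1.0)
        send_datagrams([b"hel", b"lo ", b"sam"], receiver.getsockname())
        received = []
        try:
            for _ in range(3):
                data, _addr = receiver.recvfrom(2048)
                received.append((int.from_bytes(data[:2], "big"), data[2:]))
        except socket.timeout:
            pass  # a lost datagram is simply never seen; nobody resends it
        return received


print(demo())
```

On a single machine all three chunks will normally arrive in order. Across a real network, some may be missing or shuffled, and nothing in UDP itself will notice.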

TCP is like needing someone to answer the door so you can make sure the message is delivered. Photo by Andrea Piacquadio from Pexels

TCP, or Transmission Control Protocol, is like having the car ring the doorbell to deliver the message. Each packet is numbered, and the receiver sends back confirmation of what arrived. If a confirmation isn’t received, the packet is sent again. This can make TCP a bit slower, but it ensures the accuracy of what’s being sent. In this way, TCP is more like a two-way radio: data is sent and the sender listens for a response.
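The confirm-or-resend idea can be sketched as a toy simulation. This is not how TCP is implemented (real TCP keeps several packets in flight at once and handles this inside the operating system). It is only a stop-and-wait illustration with a made-up `loss_rate`, a fixed random seed, and a hypothetical function name.

```python
import random


def send_with_acks(packets, loss_rate=0.3, max_tries=20, seed=1):
    """Resend each packet until it is confirmed, then move to the next one.

    Returns the packets in delivery order and the total number of sends.
    """
    rng = random.Random(seed)
    delivered, attempts = [], 0
    for packet in packets:
        for _ in range(max_tries):
            attempts += 1
            if rng.random() >= loss_rate:  # packet and its confirmation got through
                delivered.append(packet)
                break
        else:
            raise TimeoutError("gave up after too many retries")
    return delivered, attempts


delivered, attempts = send_with_acks(["hel", "lo ", "sam"])
print(delivered, attempts)
```

Everything arrives, and in order, but the number of sends can be higher than the number of packets. That extra work is where the slowdown comes from.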

TLS, or Transport Layer Security, is TCP with an extra layer of security on top. This is like having the car ring the doorbell and requiring the person who answers to provide a security phrase before the message is handed over. This is usually preferred since it helps keep phone calls smooth and secure, especially when information protected by HIPAA is being exchanged.
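In code, the "security phrase" roughly corresponds to checking the other side's certificate before any message is shared. The sketch below only builds the TLS settings with Python's standard `ssl` module and does not connect anywhere. The function name and the choice of TLS 1.2 as a minimum are my assumptions. Port 5061 is the conventional port for SIP over TLS.

```python
import ssl

SIP_TLS_PORT = 5061  # conventional port for SIP over TLS


def make_signaling_tls_context() -> ssl.SSLContext:
    """Build client-side TLS settings that insist on a verified certificate."""
    context = ssl.create_default_context()  # verifies certificates and hostnames
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # assumed policy for this sketch
    return context


context = make_signaling_tls_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

A client would then wrap its TCP connection with `context.wrap_socket(sock, server_hostname=...)`, and the connection fails if the certificate check does not pass.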

Communicating during the call
Once a call is established, multiple cars are sent back and forth. Each one contains a packet of data, usually about 10–30 milliseconds of audio, or less than a syllable. That seems small, but a lot happens to each piece: when the audio is captured, it’s converted into a digital format and compressed. When the message gets to its destination, the process is reversed; it’s decompressed and converted back to audio. All of this work is done by a codec. Different codecs provide different sound quality and bandwidth requirements.
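A minimal sketch of the "many small cars" step: slice digitized audio into 20 millisecond frames, number them, and put them back together at the other end even if they arrive shuffled. The compression a codec performs is left out. The defaults (8,000 samples per second, one byte per sample) match the classic G.711 telephone codec, and the function names are illustrative.

```python
def packetize(pcm: bytes, sample_rate=8000, bytes_per_sample=1, frame_ms=20):
    """Split raw audio bytes into numbered frames of frame_ms milliseconds."""
    frame_bytes = sample_rate * bytes_per_sample * frame_ms // 1000
    return [
        (seq, pcm[start:start + frame_bytes])
        for seq, start in enumerate(range(0, len(pcm), frame_bytes))
    ]


def reassemble(packets):
    """Put frames back in sequence order and join them into one stream."""
    return b"".join(payload for _, payload in sorted(packets))


audio = bytes(range(256)) * 5          # 1,280 bytes of stand-in "audio"
frames = packetize(audio)
print(len(frames), len(frames[0][1]))  # 8 frames of 160 bytes each
print(reassemble(list(reversed(frames))) == audio)
```

With these defaults, each 20 ms frame is 160 bytes, and 50 of them are sent every second in each direction.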

Other considerations
Since the calls are happening over the internet, the available bandwidth and speed affect the quality of the call. If a company has a small amount of bandwidth, it’s like driving an old, slow clunker. The messages should get to their destination, but it’s going to take some time, which can make the call feel choppy. Sometimes the cars break down and the call is dropped. But if a company has a lot of bandwidth and speed, it’s like driving a sports car. The audio packets are going to sail smoothly to their destination.
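To put rough numbers on "a small amount of bandwidth," here is a back-of-the-envelope sketch. It assumes a 64 kbit/s codec (G.711), 20 ms packets, and 40 bytes of headers per packet, which is what you get if the audio travels as RTP over UDP over IPv4. Those transport details are my assumptions, not something covered above, and extra overhead from the local network is ignored.

```python
def call_bandwidth_bps(codec_bps=64_000, frame_ms=20, header_bytes=40):
    """Estimate one direction of one call, in bits per second."""
    packets_per_second = 1000 // frame_ms
    payload_bytes = codec_bps * frame_ms // 1000 // 8
    return (payload_bytes + header_bytes) * 8 * packets_per_second


def max_concurrent_calls(link_bps, per_call_bps):
    """How many calls fit if the link carried nothing but voice."""
    return link_bps // per_call_bps


per_call = call_bandwidth_bps()
print(per_call)                                   # 80000
print(max_concurrent_calls(1_000_000, per_call))  # 12
```

So a 1 Mbit/s uplink tops out at about a dozen simultaneous calls under these assumptions, and fewer if anyone is also uploading files.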

Network audits can identify if the internet is the cause of call issues. Another useful tool is looking at the MOS, or mean opinion score, for recent calls. MOS is a rating of call quality between 1.0 and 5.0. For VoIP it is usually estimated by an algorithm that looks at a call’s audio codec, dropped packets, jitter, and latency. A score on the low end, a 1 or 2, would indicate a terrible call where you couldn’t tell what the person was saying. A score on the high end, a 5, is perfect and likely unattainable with today’s technology. Cell phone quality is considered 3.6, but having a score in the 4 range is ideal. If calls are consistently below a 3.0, the company is likely driving a slow clunker.
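For a feel of how such a score can be estimated, here is a rough sketch. The `r_to_mos` conversion follows the formula from the ITU-T E-model, which tops out at 4.5. The `r_factor` step is a widely circulated rule of thumb rather than a standard, it ignores the codec entirely, and it is not necessarily what any particular phone system uses.

```python
def r_factor(latency_ms, jitter_ms, loss_percent):
    """Rule-of-thumb quality rating from network measurements (an approximation)."""
    effective = latency_ms + 2 * jitter_ms + 10  # jitter hurts more than plain delay
    if effective < 160:
        r = 93.2 - effective / 40
    else:
        r = 93.2 - (effective - 120) / 10
    return r - 2.5 * loss_percent  # each percent of lost packets costs 2.5 points


def r_to_mos(r):
    """Convert an R rating to a MOS estimate using the E-model formula."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6


good = r_to_mos(r_factor(latency_ms=20, jitter_ms=2, loss_percent=0))
bad = r_to_mos(r_factor(latency_ms=300, jitter_ms=40, loss_percent=5))
print(round(good, 1), round(bad, 1))
```

With these inputs the healthy network lands at roughly 4.4 and the struggling one at roughly 2.8, which lines up with the "4 range is ideal, below 3.0 is a clunker" guidance.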

Another reason why call quality can sometimes be bad is if the call is being routed through a physical data center. This can slow down the process if the data center is far away from the company, is getting a lot of traffic, or if something happens in the data center that disrupts the flow. When a call is being routed this way, it’s like telling the car that it always has to take the same path, even if it isn’t the fastest. In the analogy I mentioned that all of the cars were being equipped with a way to know about current traffic conditions and the fastest route. This happens when calls are routed dynamically through the cloud; the system identifies the fastest, lowest-latency route and automatically takes it. This removes the dependence on a single physical data center and helps keep call quality as high as the network allows.
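The "mapping function" in the cars can be sketched as a shortest-path search where the cost of each road is its current latency rather than its length. The network below is invented for illustration, and real cloud routing is far more involved. This only shows why a path with more stops can still be the faster one.

```python
import heapq

# Hypothetical network: latency in milliseconds between neighboring points.
links = {
    "office": {"datacenter": 80, "edge_a": 10},
    "edge_a": {"edge_b": 15},
    "edge_b": {"callee": 10},
    "datacenter": {"callee": 60},
}


def fastest_route(graph, start, goal):
    """Dijkstra's algorithm: return (total latency, path), or None if unreachable."""
    queue = [(0, start, [start])]
    seen = set()
    while queue:
        latency, node, path = heapq.heappop(queue)
        if node == goal:
            return latency, path
        if node in seen:
            continue
        seen.add(node)
        for neighbor, cost in graph.get(node, {}).items():
            if neighbor not in seen:
                heapq.heappush(queue, (latency + cost, neighbor, path + [neighbor]))
    return None


print(fastest_route(links, "office", "callee"))
# (35, ['office', 'edge_a', 'edge_b', 'callee'])
```

The fixed route through the data center takes two hops and 140 ms. The dynamic route takes three hops and 35 ms.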

Why is it better than a landline?

Even if your landline is newer than this, VoIP has a lot of benefits. Photo by Eckhard Hoehmann on Unsplash

VoIP’s reliance on the internet might make it feel risky and unappealing. After all, we’ve all experienced internet outages and businesses can’t afford to have their phones not work.

One of the biggest reasons why companies switch to VoIP is the cost savings. Since it’s easier to send data to distant locations over the internet than it is to lay and use copper wiring, VoIP reduces the cost of long-distance communications and is cheaper when you want to add another line.

Another benefit of VoIP is integrating other services and data into your phone system. For Weave customers, we offer the ability to integrate with a number of patient management systems (PMS). This allows Weave to tell the office which patient is calling, when their next appointment is, any money owed, and any other notes the office has added. This helps the office provide a much more personalized experience than they would be able to do otherwise.

Because Weave has the data about the calls, it also helps provide insights like when the office receives the most calls during the day, how many are answered versus missed, how many are from patients, who’s answering, and what phone numbers are getting the most calls. This helps the office make informed staffing decisions so fewer calls are missed.

Even in instances of an internet outage, VoIP allows call forwarding to be easily and quickly turned on. This will divert calls to a landline or to a cell phone so a company won’t miss any calls.

Final thoughts

VoIP is powerful, but it can also be confusing to grasp, especially with all of the jargon and technical phrases. My hope is that breaking down the language with an analogy about cities communicating by sending cars has made the basics easier to understand. Viewing the benefits with this knowledge in hand helps explain how VoIP can provide lower costs, pair data with calls, and help businesses of all sizes be more efficient and personal while communicating with their customers.
