Olav Nedrelid
Jul 27, 2018 · 7 min read

No matter what or where the problem is, to an end user the experience is simply “why doesn’t this work?”.

It is left to the service provider to perform the complex analysis and diagnostics that involve every step of the way from the service app to the content server and back.

Domos Algorithm is the first to fully automate and optimise this daunting task. The key components will optimise the Wi-Fi network, as this is by far the most common reason for end user frustration. But it will also monitor and address issues related to the other service layers — end user app, the consumer device, the fixed access network and the content server.

This document describes the key components of the algorithm and how it addresses the Quality of Experience (QoE) of any technology and internet service consumed in the home. The diagram below shows how they are applied to each of the 5 layers:

Why AI and Machine Learning?

Every home is different.

Every consumer device is different.

Every app is different.

Every network is different.

Everything changes all the time.

Every issue is blamed on the ISP.

Only a machine could possibly make sense of it all.

Meet the machine — Dom.

Algorithm Components

Device Taxonomy

Each end-user device has different needs for service and maintenance. We use various indicators to develop a device model taxonomy — kind of a “fingerprint” — that uniquely identifies the make and model of each device. When we know the device model we can help the manufacturer or service provider create better user experiences for their end customers, and we can identify and correct for device specific misbehaviour on the network.

Strong device taxonomy proves particularly useful for our Rate Steering algorithm, which polices anti-social device behaviour on the Wi-Fi network.
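To make the “fingerprint” idea concrete, here is a minimal sketch of a lookup from observed indicators to a device model. The article does not say which indicators Domos uses, so the three fields below (MAC prefix, DHCP option order, hostname prefix), the `Indicators` type and the table contents are all assumptions chosen for illustration.

```python
from typing import NamedTuple, Optional, Tuple


class Indicators(NamedTuple):
    # Hypothetical indicators; not taken from the article.
    mac_oui: str                    # first three bytes of the MAC address
    dhcp_options: Tuple[int, ...]   # order of DHCP options the device requests
    hostname_prefix: str            # leading part of the advertised hostname


# Made-up table mapping a fingerprint to a device model.
KNOWN_MODELS = {
    Indicators("AA:BB:CC", (1, 3, 6, 15), "console"): "ExampleCo Game Console X",
    Indicators("AA:BB:CC", (1, 3, 6), "tv"): "ExampleCo Smart TV 55",
}


def identify(ind: Indicators) -> Optional[str]:
    """Return the device model for an exact fingerprint match, else None."""
    return KNOWN_MODELS.get(ind)


model = identify(Indicators("AA:BB:CC", (1, 3, 6, 15), "console"))
unknown = identify(Indicators("00:00:00", (1,), "x"))
```

An exact-match table is the simplest possible choice; a production taxonomy would presumably need fuzzy matching or a trained classifier to cope with devices it has not seen before.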

Traffic Identification

A 4k video stream will consume huge volumes of data. A sensor might need to send only a single packet.

At the same time the video stream works fine even if there is some latency, while the sensor packet can be super sensitive to any delay.

It would be best for everyone if the sensor simply could “jump the queue” and be sent ahead of the video stream packets. The video stream wouldn’t even notice.

This is why Traffic Identification has a massive effect on performance. We are already working with a top Nordic ISP on a prototype that separates gaming and video streaming traffic from a standard gaming console in real time.
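The “jump the queue” idea can be sketched with a small priority queue. This is an illustration only, not the Domos implementation: the two traffic classes and the `PriorityScheduler` name are assumptions, and a real router would do this in its packet scheduler rather than in Python.

```python
import heapq
import itertools

LATENCY_SENSITIVE = 0   # served first
BULK = 1                # served when nothing urgent is waiting


class PriorityScheduler:
    def __init__(self):
        self._heap = []
        # Running counter keeps first-in, first-out order within a class.
        self._order = itertools.count()

    def enqueue(self, traffic_class, packet):
        heapq.heappush(self._heap, (traffic_class, next(self._order), packet))

    def dequeue(self):
        return heapq.heappop(self._heap)[2]


sched = PriorityScheduler()
for i in range(3):
    sched.enqueue(BULK, f"video-{i}")
sched.enqueue(LATENCY_SENSITIVE, "sensor-0")   # arrives last

sent = [sched.dequeue() for _ in range(4)]
# sent == ["sensor-0", "video-0", "video-1", "video-2"]
```

The sensor packet arrives last but is sent first, while the video packets keep their original order and are delayed by only one packet.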

In-Home Optimisation

In-Home Optimisation runs 24/7 for all connected gateways and networks to ensure that each one performs optimally in its given environment.

The process is best explained by the following:

The objective quality experienced by the end user is continuously monitored and measured (1). At regular intervals the full data set is collected and sent to the prediction algorithm (2). The algorithm predicts what the impact would be of making adjustments through the available levers (3). If the predictions show significant improvement, one or more of the levers is applied. The result (4) is then measured in detail over 24 hours and the process is repeated (5) daily.
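The decision step in this loop (predict, then apply a lever only if the gain is significant) can be sketched as follows. The lever names, the stand-in prediction functions and the `min_gain` threshold are all assumptions for illustration; the article does not describe the real prediction model or its levers.

```python
from typing import Callable, Dict, Optional


def choose_lever(
    measured_quality: float,
    levers: Dict[str, Callable[[float], float]],
    min_gain: float = 0.05,
) -> Optional[str]:
    """Return the lever with the largest predicted gain, or None if no
    lever is predicted to improve quality by more than min_gain."""
    best_lever, best_gain = None, min_gain
    for name, predict in levers.items():
        gain = predict(measured_quality) - measured_quality
        if gain > best_gain:
            best_lever, best_gain = name, gain
    return best_lever


# Hypothetical levers with stand-in predictions.
levers = {
    "change_channel": lambda q: q + 0.10,
    "lower_tx_power": lambda q: q + 0.02,
}

chosen = choose_lever(0.60, levers)                    # "change_channel"
none_chosen = choose_lever(0.60, levers, min_gain=0.5)  # None
```

In the process described above, the outcome of the chosen lever would then be measured over 24 hours and fed back into the prediction model before the next daily cycle.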

Domos In-Home Optimisation is made possible by two key capabilities:

  1. We can continuously measure the actual quality by monitoring each data transfer rate on a second-by-second basis.
  2. We can uniquely train and improve the prediction model from the continuous feedback loop on all the routers connected to our cloud (currently >100K and growing).

Area optimisation

In multi-dwelling urban environments there is a shortage of radio frequency. We call the time during which we have access to transmit data over the radio “Airtime”. When there is no interference from other transmitters on the same frequency, you can have access to 100% of the airtime. But in apartment buildings there will be many networks sharing the same radio frequency, and each network will typically only access 10%-20% of the airtime in busy hours. This is what our Area Optimisation algorithm addresses.

The algorithm can “magically” create more airtime to be shared by everyone by doing two main things:

  1. Optimising channel planning using advanced genetic algorithms.
  2. Reducing interference by optimising each router’s signal “volume”, also known as transmit power. It is an iterative process that can be described as teaching the routers to use their inside voice while making sure they can still be heard loud and clear by all clients.
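As a rough illustration of the first point, the sketch below uses a toy genetic algorithm to assign channels so that neighbouring routers avoid sharing one. It is not the Domos algorithm: the cost function (count of neighbouring pairs on the same channel), the population size, the crossover and mutation scheme and the example building are all assumptions. The three channels are the usual non-overlapping 2.4 GHz channels.

```python
import random

CHANNELS = (1, 6, 11)  # non-overlapping 2.4 GHz channels


def conflicts(plan, neighbours):
    """Count neighbouring router pairs that share a channel."""
    return sum(1 for a, b in neighbours if plan[a] == plan[b])


def evolve(n_routers, neighbours, generations=50, pop_size=20, seed=0):
    rng = random.Random(seed)
    population = [[rng.choice(CHANNELS) for _ in range(n_routers)]
                  for _ in range(pop_size)]
    history = []  # best cost seen at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda p: conflicts(p, neighbours))
        history.append(conflicts(population[0], neighbours))
        survivors = population[: pop_size // 2]  # best half survives unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            mum, dad = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_routers)    # one-point crossover
            child = mum[:cut] + dad[cut:]
            if rng.random() < 0.2:               # mutation: re-draw one channel
                child[rng.randrange(n_routers)] = rng.choice(CHANNELS)
            children.append(child)
        population = survivors + children
    best = min(population, key=lambda p: conflicts(p, neighbours))
    return best, history


# Hypothetical building: four routers and the pairs that can hear each other.
neighbours = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
best, history = evolve(4, neighbours)
print(best, conflicts(best, neighbours))
```

Because the best half of each generation is carried over unchanged, the best cost can never get worse from one generation to the next. A real deployment would also have to account for routers it cannot control, which this sketch ignores.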

When deployed in real buildings we consistently see improvements of 100% to 200% in available airtime, depending on how many gateways we can control.

In other words, we more than double the Wi-Fi capacity in the entire building.
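As a quick worked example, assume the improvement is relative to a network's busy-hour airtime share (this reading, and the helper name, are assumptions rather than something the article states). A 100% improvement on a 10% share and a 200% improvement on a 20% share then work out as follows.

```python
def airtime_after(share_pct: int, improvement_pct: int) -> int:
    """Airtime share in percent after a relative improvement in percent."""
    return share_pct * (100 + improvement_pct) // 100


low = airtime_after(10, 100)    # 10% share, +100%  -> 20
high = airtime_after(20, 200)   # 20% share, +200%  -> 60
```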

Real improvements to airtime after deploying to a multi-dwelling building with access to less than 50% of the total gateways.

Rate steering

We have an abundance of data showing that the most common QoE issue on Wi-Fi networks is misconfigured rate adaptation algorithms, especially for the traffic transmitted back from the endless variety of end-user devices.

The strong Device Taxonomy capability is therefore also key to understanding and correcting device behaviour that impacts the performance of the entire Wi-Fi network.

To accomplish this we developed a totally new algorithm called “Rate Steering”. Essentially this algorithm will correct anti-social behaviour patterns and ensure civility among the Wi-Fi client population. Kind of like a gentle Wi-Fi police.

And policing works. Early real-life deployments show that Rate Steering consistently increases overall Wi-Fi capacity in a home by more than 100%, i.e. it more than doubles the actual bandwidth or throughput. This means huge improvements to overall QoE, as it radically reduces quality interruptions.
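To see why a client stuck at a low rate hurts everyone, it helps to compare the time on air for the same payload at two rates. The sketch below is a simplification that ignores all protocol overhead (preambles, acknowledgements, retries); the function name and payload size are illustrative, and 6 and 54 Mbit/s are used simply as familiar 802.11a/g rates.

```python
def airtime_seconds(payload_bytes: int, phy_rate_mbps: float) -> float:
    """Time on air for a payload at a given PHY rate, ignoring overhead."""
    return payload_bytes * 8 / (phy_rate_mbps * 1_000_000)


slow = airtime_seconds(1_000_000, 6)    # client stuck at a low rate
fast = airtime_seconds(1_000_000, 54)   # same payload at a higher rate
ratio = slow / fast                     # about 9x more airtime at the low rate
```

Since airtime is shared, every second the slow client spends transmitting is a second no other client on the network can use.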

Rate Steering effect on airtime usage across ~50,000 household gateways, with improvements of 100% to 200% depending on how aggressively unnecessary use of low rates is removed.

Troubleshooting

Troubleshooting is the process of running live, real-time diagnostics across all layers in the stack and returning a live dashboard of each test that is run and its results.

Troubleshoot will (in the next version) include these tests:

  1. Wi-Fi environment and performance (as today)
  2. Check response from the connected devices (i.e. ping test over Wi-Fi)
  3. Check that the router(s) and other in-home networking equipment are OK
  4. Check capacity and response time on the internet line provided by the ISP
  5. Check that the currently used as well as most popular cloud services are responding

The tests are done in real time and benchmarked against historical data. This way the end user will quickly understand what is causing any perceived QoE issue.
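A minimal sketch of such a test run might look like the following: each test is a probe that returns a measurement, and the result is flagged when it is noticeably worse than the historical baseline. The test names, the stub probes, the baseline values and the 1.5x tolerance are assumptions for illustration; real probes would measure the network rather than return constants.

```python
from typing import Callable, Dict, List, Tuple


def run_diagnostics(
    tests: List[Tuple[str, Callable[[], float]]],
    baseline: Dict[str, float],
    tolerance: float = 1.5,
) -> List[dict]:
    """Run each test in order and flag values above tolerance x baseline."""
    dashboard = []
    for name, probe in tests:
        value = probe()  # e.g. a response time in milliseconds
        ok = value <= tolerance * baseline[name]
        dashboard.append({"test": name, "value": value, "ok": ok})
    return dashboard


# Stub probes standing in for real measurements.
tests = [
    ("wifi", lambda: 12.0),
    ("device_ping", lambda: 8.0),
    ("isp_line", lambda: 95.0),
]
baseline = {"wifi": 10.0, "device_ping": 10.0, "isp_line": 30.0}

dashboard = run_diagnostics(tests, baseline)
failing = [row["test"] for row in dashboard if not row["ok"]]
# failing == ["isp_line"]
```

Running the layers in a fixed order and reporting each row separately is what lets the end user see at a glance which layer is responsible.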

Proactive monitoring

Proactive monitoring raises alarms when QoE issues arise in layers that are not directly controlled by the algorithm.

For the Fixed WAN network the algorithm can trigger tests on demand or on event (i.e. when an issue is suspected). It will notify the ISP appropriately as configured in scope and severity levels. As QoE in general is monitored on the router every second, WAN side performance issue alarms can be raised at the appropriate level, with full insight, in a matter of seconds.

Through the Traffic Identification algorithm we have insight into what services are consumed by the user at every moment. Every time the user initiates a new service session, say by starting to stream a Netflix movie, the algorithm can be configured to run a performance test against the Netflix content servers and compare the result with historical data. This way the end user can also be notified that there may be issues on the server side.

The proactive monitoring also covers the router itself, as well as key indicators on the Wi-Fi network.
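One simple way to turn per-second measurements into alarms with severity levels is to compare the latest sample with the historical mean and spread. The sketch below does exactly that; the z-score approach, the thresholds and the sample latency values are assumptions, not a description of how Domos raises alarms.

```python
from statistics import mean, stdev


def severity(history, latest, warn=2.0, critical=4.0):
    """Classify the latest sample by how many standard deviations it
    sits above the historical mean."""
    mu = mean(history)
    sigma = stdev(history) or 1e-9   # avoid division by zero on flat history
    z = (latest - mu) / sigma
    if z >= critical:
        return "critical"
    if z >= warn:
        return "warning"
    return "ok"


latency_ms = [20, 22, 19, 21, 20, 18]   # made-up historical WAN latency
levels = [severity(latency_ms, x) for x in (21, 24, 40)]
# levels == ["ok", "warning", "critical"]
```

The thresholds correspond to the configurable scope and severity levels mentioned above; an ISP could map "warning" and "critical" to different notification channels.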

Network slicing

Network slicing is the new buzzword in the mobile and 5G industry. The promise is to virtualise network resources from end to end, so that both fixed and mobile network providers can sell differentiated connectivity services over the IP layer as well.

A fully virtualised end-to-end network ready for slicing is still years away. Meanwhile, our Traffic Identification capabilities let us offer a taste of what the future will bring: we can dynamically prioritise latency-sensitive application traffic over greedy throughput gobblers like streaming and video services. The streaming services will not notice any reduced QoE, while the latency-sensitive traffic will see huge improvements.

We are already doing a prototype of this with one of the top Nordic ISPs, for their dedicated gaming offering. We separate gaming traffic from streaming to the same device, and by that we are saving lives. Because in gaming, latency kills.

System Architecture

The algorithm is available through open southbound and northbound APIs. Our cloud is by default hosted on Microsoft Azure, but can be ported to other cloud services or be hosted on-premise.

We offer OEMs and ISPs reference code and designs that implement our APIs from both southbound gateways and northbound business applications.

Domos - Creating the Home IT Assistant

Stories on how we work to develop the world’s first home IT assistant, named Dom

Written by Olav Nedrelid

Techie, data scientist and business exec at Domos. Did cybernetics in the 90s, now desperately trying to keep up. Love building product & platforms.
