“One Button” Network Validation for Your IoT Device Installations

Published in

Machines talk, we tech.

Jan 17, 2022

IoT traffic in private networks

The Internet of Things (IoT) is everywhere and keeps expanding, and on many occasions it is deployed in the private networks of various corporations.
Deploying and installing an IoT device, especially in the industrial domain where Augury operates, isn’t an easy task and involves many variables.

In this post, I’ll share with you how Augury, a company that produces a Machine Health IoT solution, approaches the process of making sure our IoT devices communicate from within our customers’ private networks. I will lay out the difficulties, challenges, and how we iteratively improved this process — eventually saving time and money for both sides.

Our IoT devices communicate with the cloud, and while we mostly deploy with a cellular router as a bundle, often the requirement is to use an existing WiFi / Ethernet connection.
These networks aren’t as simple as our home network, as they mostly contain extra security methods and equipment.
I am talking about MAC address access lists, Dot1X implementations, multiple layers of firewalls filled with rules and policies, and more.

It doesn’t matter if you have the most innovative and productive IoT device in the world installed on a customer site if it fails to connect to the cloud due to connectivity issues.
There is nothing more frustrating than having endless calls with the IT/Security teams, spending hours debugging the packets’ flow, revising ACLs, VLANs, and going through the entire stack.

So before we dive into our journey of improving this process, I’ll share some of the challenges we’ve had to overcome in order to set up our devices.

Network Validation Challenges

People & time

You have to schedule a call with the relevant individuals (IT/Security/Field techs), taking into account the different time zones.
This alone may delay the whole integration, especially if there are open issues to resolve, dependencies on external teams, or a need for another session.

Equipment

Even if you plan the deployment ahead and ship a test unit of your IoT device to the customer, you can still expect a few days’ delay. That’s without taking into account shipping regulations, which sometimes make it a nightmare.

Setup and prerequisites

Once the device is there, we spend some time setting up the user for the account, downloading and installing the mobile application used to control it, and finally programming the device with the customer.

Network validation

Everything is in place and configured; now you just need to make sure the device communicates well.
If something is wrong, you have to bring in the heavy guns, reach out to IT/Security teams, and go through the appropriate channels.

All the above sums up to whole days spent by us and our customers, just to get the IoT equipment connected to the cloud.

We started to see this pattern over and over as we scaled and did business with new customers, and we asked ourselves: how can we do better?

Solution Research

Discovery

At first, we reached out to some of our customers and asked about the current process, to hear whether they felt the same way and whether our goals were aligned.

Then we asked about multiple ideas we had in mind, to understand what is acceptable and what isn’t.

When we compiled the results, we were certain that it was going to be a software solution that behaves very similarly to the software running on our IoT devices.
This way we could ship an executable and cut the materials and shipping time.

Goals & Expectations

Self-operated
If our users can use this without our assistance and get feedback that they can act on, we hit the nail on the head.
Easy & self-explanatory
The operation should be very simple, much like a speed test app with a single button.
Reliable
We wanted the solution to give accurate results as if our IoT device was there.
Actionable
The output of the tool should be “This service is inaccessible”, and when possible even be more specific — “Port X is inaccessible, but port Y is available”.

Choosing the right platform

We needed to choose the platform where our customers would run those tests.

At first, we thought about a desktop application.
But most of our customers pushed back on security grounds, and there were use cases where the team at the facility wanted to stay mobile while testing different areas. So, it was crossed out.

Then we had an idea — why not create a web app that does the entire thing like a speed test app!
Such a cool idea, but we learned it wasn’t all peachy: a web app couldn’t exercise some of the protocols we use, so we couldn’t actually test everything. There goes another idea in the bin.

We then said, why not use our mobile app that’s already used to configure that IoT device in the field, to run those tests.
It’s an Android app, already used by our customers, with a familiar interface, so let’s add that functionality there.
And now comes the best part, our humble one-click test application.

Small Steps. Big Impact.

First MVP

We started with a native Android application that just poked the relevant service with a simple TCP session.
As seen in the attached image, the interface contains one button, and the results are colored green and red to indicate success or failure.

While this application worked, it didn’t provide 100% coverage, as an open port doesn’t necessarily imply our service will communicate well.
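To make the MVP concrete, here is a minimal sketch of such a TCP "poke". It is written in Python for brevity rather than as the Android code we shipped, and the function name and default timeout are illustrative assumptions.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper for illustration; the timeout value is an assumption.
    """
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

A call such as `tcp_port_open("broker.example.com", 8883)` (a placeholder hostname) only tells you that something accepted the connection on that port, which is exactly the coverage gap described above.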

Take-Two: Improving Coverage & Reliability

We replaced the tests that kept failing with a full-service test.
This means that if one of our services requires MQTT protocol over port 8883, we will initiate a client connection to the service broker and verify they communicate well.
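As a rough illustration of what a service-level check can look like, the sketch below opens a TLS connection, sends a minimal MQTT 3.1.1 CONNECT packet, and waits for the broker's CONNACK. It is not our production code: it assumes the broker speaks MQTT over TLS on port 8883 (the conventional port for that), the function names and client ID are made up, and a real test would normally use a full MQTT client library with the device's own credentials.

```python
import socket
import ssl


def build_connect_packet(client_id: str, keepalive: int = 30) -> bytes:
    """Build a minimal MQTT 3.1.1 CONNECT packet (clean session, no auth)."""
    cid = client_id.encode("utf-8")
    variable_header = (
        b"\x00\x04MQTT"                  # length-prefixed protocol name
        + b"\x04"                        # protocol level 4 = MQTT 3.1.1
        + b"\x02"                        # connect flags: clean session only
        + keepalive.to_bytes(2, "big")   # keep-alive in seconds
    )
    payload = len(cid).to_bytes(2, "big") + cid
    remaining = len(variable_header) + len(payload)
    if remaining > 127:
        # This sketch only encodes a single-byte "remaining length".
        raise ValueError("client_id too long for this sketch")
    return bytes([0x10, remaining]) + variable_header + payload


def connack_accepted(data: bytes) -> bool:
    """True if data is a CONNACK whose return code is 0 (accepted)."""
    return len(data) >= 4 and data[0] == 0x20 and data[1] == 0x02 and data[3] == 0


def _recv_exact(sock, n: int) -> bytes:
    """Read up to n bytes, stopping early only if the peer closes."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            break
        buf += chunk
    return buf


def mqtt_service_reachable(host: str, port: int = 8883, timeout: float = 5.0) -> bool:
    """Attempt a TLS + MQTT CONNECT exchange; False on any failure."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                tls.sendall(build_connect_packet("net-validation-probe"))
                return connack_accepted(_recv_exact(tls, 4))
    except OSError:
        # ssl.SSLError and socket timeouts are both subclasses of OSError.
        return False
```

Note that a broker requiring authentication will answer with a non-zero return code; that still proves the network path works, so a real tool would likely report "reachable but rejected" separately from "unreachable".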

Not only that, but we also overcame some issues with network hiccups, by running each test multiple times and specifying a success threshold.
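The repeat-and-threshold idea can be sketched as follows. The attempt count, threshold, and function name are illustrative assumptions, not the values we use in the app.

```python
from typing import Callable


def passes_threshold(
    check: Callable[[], bool],
    attempts: int = 5,
    min_successes: int = 3,
) -> bool:
    """Run `check` up to `attempts` times; pass once enough runs succeed."""
    successes = 0
    for _ in range(attempts):
        if check():
            successes += 1
        if successes >= min_successes:
            # Threshold reached, no need to run the remaining attempts.
            return True
    return False
```

With these example defaults, a single dropped packet or momentary WiFi glitch no longer turns a healthy connection red, while a link that fails most of the time still does.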

Insights & Action Items

Per our requirements, we wanted our solution to give a clear idea of what needs to be done to fix the problem.
So we tested the alternative port and host combinations that our services accept and gave recommendations accordingly.
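One way to turn those probes into an actionable message is sketched below. The endpoint list, the message wording, and the `recommend` helper are hypothetical; the probe is passed in as a function so any connectivity check can be plugged in.

```python
from typing import Callable, Sequence, Tuple

Endpoint = Tuple[str, int]  # (host, port)


def recommend(
    service: str,
    endpoints: Sequence[Endpoint],
    probe: Callable[[str, int], bool],
) -> str:
    """Probe the preferred endpoint first, then the accepted alternatives.

    endpoints[0] is the preferred endpoint; the rest are fallbacks.
    """
    if not endpoints:
        raise ValueError("at least one endpoint is required")
    primary_host, primary_port = endpoints[0]
    if probe(primary_host, primary_port):
        return f"{service}: {primary_host}:{primary_port} is accessible."
    for host, port in endpoints[1:]:
        if probe(host, port):
            return (
                f"{service}: {primary_host}:{primary_port} is inaccessible, "
                f"but {host}:{port} is available."
            )
    return f"{service}: this service is inaccessible."


# Example with a fake probe that only lets port 443 through.
def only_443(host: str, port: int) -> bool:
    return port == 443


message = recommend(
    "MQTT", [("mqtt.example.com", 8883), ("mqtt.example.com", 443)], only_443
)
```

Here `message` tells the IT team that port 8883 is blocked while 443 works, which is a far shorter conversation than "it doesn't connect".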

We also reported the results back to ourselves, so we can learn and improve the service we give.
This is what the final solution looks like today:

Closing thoughts

While we didn’t reinvent the wheel here, we did give tremendous value to our customers, saving time and money by replacing a very manual, tedious process with the click of a button.
A year after the deployment of this feature, we can see how widely it’s used by our customers and integrated into our processes.
The above process takes 1–2 minutes to output a clear report, and replaces our traditional process, which took at least a week.
Moreover, we see our customers using it a few times a week on planned installation sites, without our guidance or intervention.

This feature was born from a pain we identified in our processes, and was pushed from the development team up to the product.

It was a pleasure to work with the customers, understand their needs and talk with IT / Security teams to validate our assumptions and produce the final product!

My small takeaway for you, whatever your role, is to look around you. Look at and learn your company’s and your customers’ processes, ask questions, and find where you can bring value and impact. It’s out there, so go grab it!