Load testing Azure IoT solutions with Pulumi

Christian Eder
5 min read · Aug 12, 2019

This article shows how you can leverage Pulumi (a tool to implement cloud infrastructure as code) and Azure Functions to set up a very simple and cheap, yet flexible load testing infrastructure for IoT solutions based on the Azure IoT platform.

We’ll start with some background on building Azure IoT solutions and the reasons why you should consider load testing them. If you are only interested in the how and not the why, feel free to skip the first chapter.

Designing Azure IoT solutions

If you choose to develop your IoT solution based on the Microsoft Azure IoT platform, you are likely to have a set of requirements unique to the needs of your company. If you have a somewhat simpler sensor-to-dashboard scenario at hand, you may choose another offering such as Azure IoT Central, following a Software-as-a-Service (SaaS) approach with less flexibility, but more out-of-the-box features compared to the Platform-as-a-Service (PaaS) approach that the Azure IoT platform offers.

These requirements will continue to drive your architectural design and technology choices. Azure gives you great flexibility in terms of how you can connect different services to build an IoT solution. This flexibility forces you to make a lot of decisions in areas like:

  • Compute — highly complex business requirements and / or a very high expected load might make you favour some of the more sophisticated compute services that Azure offers to implement and deploy your message processing logic, such as the Azure Kubernetes Service or Azure Service Fabric. Solutions with less demanding workloads might be better off using a simpler compute service such as Azure Functions.
  • Storage — since the various storage services Azure offers differ hugely in terms of cost, scalability and queryability, it is usually a good idea to think about using different storage services for different types of data.
  • Data Analytics — depending on your solution’s needs for “cold path” batch analytics you might either end up implementing a complex custom data analytics solution involving services such as Data Lake Analytics, Data Factory and SQL Data Warehouse, or a simpler approach involving only “hot path” analytics using Stream Analytics to get the job done.

You will end up with an IoT solution that is tailored to your needs, but that also has its own particular scalability and reliability characteristics. Load and stress testing allows you to verify whether those characteristics satisfy your requirements.

Creating a basic Azure IoT solution using Pulumi

We start by creating a very basic IoT solution, which we are then going to load test:

  • An Azure IoT Hub as the cloud-side gateway to which IoT devices send telemetry messages
  • An Azure Function reading those messages from the IoT Hub and performing minor data transformations before persisting them into…
  • An Azure Storage Table for long term storage of raw telemetry data
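To make the "minor data transformations" step more concrete, here is a minimal sketch of how a telemetry message might be turned into a Table Storage entity. The message shape (`deviceId`, `temperature`), the rounding and the choice of keys are assumptions made for illustration, not the code deployed in this article; the only hard requirement reflected here is that every table entity needs a `PartitionKey` and a `RowKey`.

```typescript
// Hypothetical shape of a telemetry message sent by a device (an assumption).
interface TelemetryMessage {
  deviceId: string;
  temperature: number;
}

// Table Storage entities are addressed by PartitionKey and RowKey.
interface TelemetryEntity {
  PartitionKey: string;
  RowKey: string;
  temperature: number;
  enqueuedTime: string;
}

// Maps one message to one entity. Partitioning by device and using the
// enqueued time (epoch milliseconds) as row key is an illustrative choice;
// it assumes a device never has two messages enqueued in the same millisecond.
function toTableEntity(message: TelemetryMessage, enqueuedTimeUtc: string): TelemetryEntity {
  const enqueuedMs = Date.parse(enqueuedTimeUtc);
  if (isNaN(enqueuedMs)) {
    throw new Error("Invalid enqueued time: " + enqueuedTimeUtc);
  }
  return {
    PartitionKey: message.deviceId,
    RowKey: String(enqueuedMs),
    // The "minor transformation": round to one decimal place.
    temperature: Math.round(message.temperature * 10) / 10,
    enqueuedTime: enqueuedTimeUtc,
  };
}
```

Keeping the mapping in a pure function like this makes it easy to unit test independently of the Function runtime and of the storage binding.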

Including the simulated devices that we create later in this article, the solution's architecture looks like this:

Using Pulumi, the code we implement to create all the required Azure resources and deploy the code running in the Azure Function to process messages looks like this:

Building the load test using Pulumi

Now that we have set up the simplest of IoT solutions, we want to send large amounts of data to it to see how our implementation reacts to that load. We are going to set up an Azure Function App that is able to simulate hundreds of devices using Microsoft's IoT Hub device client package. We will implement this by extending the Pulumi program above to:

  • create some simulated devices in the IoT Hub
  • get the device credentials for each device
  • set up a timer triggered Function per simulated device
  • deploy that Function with code to send simulated telemetry data to the IoT Hub
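The first two steps can be pictured with a small sketch that does not talk to Azure at all: it only generates device ids and shows the general shape of a device connection string (`HostName=...;DeviceId=...;SharedAccessKey=...`). The helper names, the `sim-device` prefix and the example host name and key are hypothetical; in the real program the devices and their keys come from the IoT Hub itself.

```typescript
// Builds zero-padded ids such as "sim-device-001".
// The naming scheme is an assumption made for this sketch.
function simulatedDeviceIds(count: number, prefix: string = "sim-device"): string[] {
  const width = String(count).length;
  const ids: string[] = [];
  for (let i = 1; i <= count; i++) {
    let suffix = String(i);
    while (suffix.length < width) {
      suffix = "0" + suffix;
    }
    ids.push(prefix + "-" + suffix);
  }
  return ids;
}

// Assembles a device connection string from its three parts.
// The key would normally be the device's primary key as issued by the IoT Hub.
function deviceConnectionString(hubHostName: string, deviceId: string, primaryKey: string): string {
  return `HostName=${hubHostName};DeviceId=${deviceId};SharedAccessKey=${primaryKey}`;
}
```

For example, `simulatedDeviceIds(100)` yields `sim-device-001` through `sim-device-100`, and each id can then be registered in the hub to obtain the key needed for its connection string.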

The following code snippet uses Microsoft’s Azure IoT Hub management package to create devices and retrieve their connection credentials:

We will use the code above to create a set of simulated devices, and deploy the Azure Functions performing the simulation right after the IoT Hub itself has been deployed, by using the following code snippet:

Some details about this code are worth noting since they are unique to the way Pulumi allows you to write infrastructure code using TypeScript.

  • The call to pulumi.all in line 7 shows how you can wait for the first parts of your infrastructure to be created, and use properties of this infrastructure after that
  • The call to our custom getDevices function in line 9 shows how you can await calls to any API required and use the results to continue creating infrastructure resources
  • The call to devices.map in line 14 shows how you can take advantage of standard TypeScript language features in order to create and customize your cloud infrastructure — in our case, create & deploy a timer triggered Azure Function per device connection string available in the devices array
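The `devices.map` idea from the last point can be sketched in plain TypeScript, without any Pulumi types: each device connection string is turned into a description of one timer-triggered Function. The `SimulatedFunctionSpec` type, the `simulate-` name prefix and the helper names are hypothetical, and the schedule is passed in as an opaque string (Azure Functions timer triggers use six-field NCRONTAB expressions that include seconds).

```typescript
// Hypothetical description of one timer-triggered Function to deploy.
interface SimulatedFunctionSpec {
  name: string;
  schedule: string;
  connectionString: string;
}

// Extracts the DeviceId part from a device connection string.
// Only the first "=" of each part is treated as the separator, because
// base64-encoded keys may themselves end in "=" characters.
function deviceIdOf(connectionString: string): string {
  for (const part of connectionString.split(";")) {
    const separator = part.indexOf("=");
    if (separator > 0 && part.substring(0, separator) === "DeviceId") {
      return part.substring(separator + 1);
    }
  }
  throw new Error("Connection string has no DeviceId");
}

// One Function spec per device connection string, via an ordinary Array.map.
function toFunctionSpecs(connectionStrings: string[], schedule: string): SimulatedFunctionSpec[] {
  return connectionStrings.map((connectionString) => ({
    name: "simulate-" + deviceIdOf(connectionString),
    schedule,
    connectionString,
  }));
}
```

In a Pulumi program, the body of the `map` callback would create the actual Function resource instead of returning a plain object; the looping itself stays ordinary TypeScript.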

After creating and deploying the tested IoT solution itself, the resulting Pulumi program will create 100 Azure Functions, each simulating a single device, for a combined load of ~3,000 messages per minute. Since we are using a Function App to simulate load, we can start and stop the load test at any time by starting or stopping the Function App.
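The load figure is simple arithmetic: 3,000 messages per minute spread over 100 devices is 30 messages per device per minute, or one message every 2 seconds. The 2-second interval is inferred from those two numbers rather than taken from the deployed code, and the helper names below are hypothetical.

```typescript
// Combined load produced by `deviceCount` devices that each send one
// message every `sendIntervalSeconds` seconds.
function messagesPerMinute(deviceCount: number, sendIntervalSeconds: number): number {
  if (sendIntervalSeconds <= 0) {
    throw new Error("sendIntervalSeconds must be positive");
  }
  return deviceCount * (60 / sendIntervalSeconds);
}

// The inverse question: how often must each device send to reach a target load?
function requiredIntervalSeconds(deviceCount: number, targetMessagesPerMinute: number): number {
  if (targetMessagesPerMinute <= 0) {
    throw new Error("targetMessagesPerMinute must be positive");
  }
  return (60 * deviceCount) / targetMessagesPerMinute;
}

console.log(messagesPerMinute(100, 2));          // 3000
console.log(requiredIntervalSeconds(100, 3000)); // 2
```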

Using data captured by the IoT Hub and possibly Application Insights, we can build a dashboard visualizing the load test results:

Taking it one step further

In order to increase the load further, you can easily take advantage of the fact that Pulumi is using a real programming language to describe cloud infrastructure:

  • Increase the number of device-simulating Functions per Function App
  • Refactor the code to simulate more than a single device in parallel per Function
  • Refactor the code to create more than just one Function App, each simulating hundreds of devices
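Before refactoring along any of these three axes, it can help to estimate what a given combination would produce. The following sketch multiplies the three scaling factors with a per-device send rate; the `LoadPlan` type and the example numbers are assumptions, with the baseline chosen to match the 100 devices and ~3,000 messages per minute described above.

```typescript
// The three scaling axes from the list above, plus the per-device send rate.
interface LoadPlan {
  functionApps: number;
  functionsPerApp: number;
  devicesPerFunction: number;
  messagesPerDevicePerMinute: number;
}

function totalDevices(plan: LoadPlan): number {
  return plan.functionApps * plan.functionsPerApp * plan.devicesPerFunction;
}

function totalMessagesPerMinute(plan: LoadPlan): number {
  return totalDevices(plan) * plan.messagesPerDevicePerMinute;
}

// Baseline from this article: one app, 100 Functions, one device each.
const baseline: LoadPlan = {
  functionApps: 1,
  functionsPerApp: 100,
  devicesPerFunction: 1,
  messagesPerDevicePerMinute: 30,
};

// A hypothetical scaled-up plan: 5 apps, 10 devices per Function.
const scaled: LoadPlan = { ...baseline, functionApps: 5, devicesPerFunction: 10 };

console.log(totalMessagesPerMinute(baseline)); // 3000
console.log(totalMessagesPerMinute(scaled));   // 150000
```

These are upper bounds on paper; whether a Function App can actually sustain the planned rate is itself something the load test has to show.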

In order to evaluate the results of a load test, we can take advantage of a range of Azure features:

  • Azure IoT Hub adds the enqueued time as a system property to all messages before routing them to other services such as our Azure Function processing the messages
  • Azure Table Storage automatically adds a timestamp property whenever it stores an entity
  • The Azure IoT Hub feature implementing distributed tracing is currently in preview
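The first two timestamps can be combined into a rough processing latency per message: the time between the IoT Hub enqueuing a message and Table Storage persisting the resulting entity. The sketch below assumes both values have already been read as ISO 8601 strings; the `StoredMessage` type and its field names are hypothetical, and the percentile uses the simple nearest-rank method.

```typescript
// Hypothetical record pairing the two timestamps for one message.
interface StoredMessage {
  enqueuedTime: string;    // set by the IoT Hub when the message arrived
  tableTimestamp: string;  // set by Table Storage when the entity was written
}

// Milliseconds between enqueuing in the hub and persisting in the table.
function latencyMs(message: StoredMessage): number {
  return Date.parse(message.tableTimestamp) - Date.parse(message.enqueuedTime);
}

// Nearest-rank percentile: the smallest value such that at least
// p percent of all values are less than or equal to it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) {
    throw new Error("percentile of an empty list is undefined");
  }
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  const index = Math.min(Math.max(rank, 1), sorted.length) - 1;
  return sorted[index];
}
```

Comparing, say, the 50th and 95th percentile of these latencies across runs with different device counts gives a first indication of where the solution starts to fall behind. Note that the two timestamps come from different services, so small clock differences between them are to be expected.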

All of the code shown above is available here — you can clone and run it (assuming you have Git, Node.js, the Azure CLI and Pulumi installed and set up) via

Christian Eder

Software architect @zuehlke_group, passionate about automation, infrastructure & architecture as code