Stressing applications with Chaos Platform for Azure

Bugra Derre
Published in Proofdock
Mar 3, 2021

Would you wear a pacemaker that you built yourself? You would probably put it to the acid test before wearing it.

Photo by Nick Fewings on Unsplash

Pacemakers have to guarantee a certain degree of reliability. Reliability plays a major role in user acceptance. The user acceptance and criticality of many software applications are good reasons to improve reliability in software too. Smart home systems, intelligent building automation, electric vehicle charging stations, and many others experience adverse situations every day.

We at Proofdock think that it is better to simulate those adverse situations on your own terms within a testing environment than to be surprised in production on December 24th.

Learn to simulate adverse situations and see how they affect your application’s robustness, safety, capacity, and interoperability. Put applications to the acid test with the Chaos Platform.

Stress CPU

Stress the CPU up to 100%. The duration parameter defines how long the stress test generates high CPU usage.

This type of action is good for verifying that alerting works and that deployed mitigation strategies behave as expected, e.g. that scaling rules spawn further instances to handle an increased workload.

Available for Azure virtual machines, availability sets and scale sets.
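
To make the idea concrete, here is a minimal sketch of what a CPU stress amounts to: one busy loop per core that runs for the given duration. It is an illustration only and not how the Chaos Platform implements the action; the names burn_cpu and stress_cpu are hypothetical.

```python
import multiprocessing
import time
from typing import Optional


def burn_cpu(duration_s: float) -> int:
    """Spin in a tight loop until duration_s seconds have passed."""
    deadline = time.monotonic() + duration_s
    iterations = 0
    while time.monotonic() < deadline:
        iterations += 1  # pointless work that keeps one core busy
    return iterations


def stress_cpu(duration_s: float, workers: Optional[int] = None) -> float:
    """Run one busy loop per core and return the elapsed wall-clock time."""
    workers = workers or multiprocessing.cpu_count()
    started = time.monotonic()
    processes = [
        multiprocessing.Process(target=burn_cpu, args=(duration_s,))
        for _ in range(workers)
    ]
    for process in processes:
        process.start()
    for process in processes:
        process.join()
    return time.monotonic() - started


if __name__ == "__main__":
    # Assumed duration of 5 seconds; watch your CPU alert while this runs.
    print(f"stressed all cores for {stress_cpu(5.0):.1f} s")
```

Separate processes are used instead of threads because Python threads share one interpreter lock and would typically keep only a single core busy.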

Network latency

Increase the response time. Configurable parameters are:

  • duration: defines how long the latency lasts.
  • delay: the delay added to the response time, in milliseconds.
  • jitter: the variance of +/- jitter applied to the delay, in milliseconds.
  • network interface: the interface to which the network latency is applied, e.g. eth0.

This type of action is good for verifying that alerting works and that deployed mitigation strategies behave as expected, e.g. that consuming services stay resilient even though the network is degraded or lost.

Available for Linux-based Azure virtual machines, availability sets and scale sets.
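
On Linux, delay and jitter of this kind are commonly produced with the tc tool and its netem queueing discipline. The sketch below only builds the matching commands from the parameters listed above; whether the Chaos Platform uses tc internally is an assumption, and the function names are hypothetical.

```python
from typing import List


def netem_add_command(interface: str, delay_ms: int, jitter_ms: int = 0) -> List[str]:
    """Build the tc command that adds delay (and optional jitter) to an interface."""
    if delay_ms <= 0:
        raise ValueError("delay_ms must be positive")
    if jitter_ms < 0:
        raise ValueError("jitter_ms must not be negative")
    command = ["tc", "qdisc", "add", "dev", interface, "root", "netem",
               "delay", f"{delay_ms}ms"]
    if jitter_ms:
        command.append(f"{jitter_ms}ms")  # netem reads this as +/- jitter
    return command


def netem_remove_command(interface: str) -> List[str]:
    """Build the tc command that removes the netem rule again."""
    return ["tc", "qdisc", "del", "dev", interface, "root", "netem"]


if __name__ == "__main__":
    print(" ".join(netem_add_command("eth0", delay_ms=200, jitter_ms=50)))
    # prints: tc qdisc add dev eth0 root netem delay 200ms 50ms
```

Running the generated commands requires root privileges, and the rule stays active until it is removed, so a real experiment would run the remove command once the duration has elapsed.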

Fill disk

Fill the disk with random data. Configurable parameters are:

  • duration: defines the lifetime of the file that is created.
  • size: size of the file created on the disk.
  • absolute path: defines where the fill file is written.

This type of action is good for verifying that alerting works and that deployed mitigation strategies behave as expected, e.g. that a scripted runbook empties a specific directory.

Available for Azure virtual machines, availability sets and scale sets.
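
The three parameters map naturally onto a small routine: write a file of the given size, keep it for the duration, then delete it. The following is a sketch under that assumption and not the Chaos Platform's own implementation; write_fill_file and fill_disk are hypothetical names.

```python
import os
import time

CHUNK_SIZE = 1024 * 1024  # write in 1 MiB chunks to keep memory use flat


def write_fill_file(path: str, size_bytes: int) -> int:
    """Write size_bytes of random data to path and return the bytes written."""
    written = 0
    with open(path, "wb") as fill_file:
        while written < size_bytes:
            chunk = os.urandom(min(CHUNK_SIZE, size_bytes - written))
            fill_file.write(chunk)
            written += len(chunk)
    return written


def fill_disk(path: str, size_bytes: int, duration_s: float) -> None:
    """Create the fill file, keep it for duration_s seconds, then delete it."""
    try:
        write_fill_file(path, size_bytes)
        time.sleep(duration_s)
    finally:
        # Clean up even if the disk ran full or the run was interrupted.
        if os.path.exists(path):
            os.remove(path)
```

Random data is used here so that compressing or deduplicating file systems cannot shrink the file behind the scenes.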

Burn IO

Simulate heavy disk I/O operations. Configurable parameters are:

  • duration: defines how long the burn lasts.
  • absolute path: defines where the stress file is written.

This type of action is good for verifying that alerting works and that deployed mitigation strategies behave as expected, e.g. that your load balancer redirects traffic to another instance.

Available for Azure virtual machines, availability sets and scale sets.
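
One way to picture heavy disk I/O is a loop that keeps rewriting a scratch file and forces every write to the device. This is a simplified sketch, not the Chaos Platform's implementation; the name burn_io and the default block size are assumptions.

```python
import os
import time


def burn_io(path: str, duration_s: float, block_size: int = 4 * 1024 * 1024) -> int:
    """Rewrite and fsync a scratch file until duration_s has passed.

    Returns the total number of bytes written.
    """
    block = os.urandom(block_size)
    written = 0
    deadline = time.monotonic() + duration_s
    try:
        with open(path, "wb") as stress_file:
            while time.monotonic() < deadline:
                stress_file.seek(0)  # overwrite in place so the file stays small
                stress_file.write(block)
                stress_file.flush()
                os.fsync(stress_file.fileno())  # force the data to the device
                written += block_size
    finally:
        # Remove the scratch file once the burn is over.
        if os.path.exists(path):
            os.remove(path)
    return written
```

Unlike the fill disk action, the file never grows beyond one block; the pressure comes from the constant stream of synchronous writes.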

Stop

Stop a computing instance.

This type of action is good for verifying that alerting works and that the high availability option works as expected, e.g. that your traffic is redirected to an alternative Azure region.

Available for Azure virtual machines, availability sets, scale sets and Azure web apps.
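
To check from the outside that traffic really is redirected, you can poll your public endpoint while the instance is stopped and measure how long it takes to answer again. The helper below is a hypothetical sketch that uses only the Python standard library; the URL, timeout, and function names are assumptions and not part of the Chaos Platform.

```python
import http.client
import time
import urllib.request
from typing import Callable, Optional


def http_probe(url: str) -> bool:
    """Return True if the endpoint answers with HTTP 200 within two seconds."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        return False


def measure_downtime(probe: Callable[[], bool], timeout_s: float,
                     interval_s: float = 1.0) -> Optional[float]:
    """Poll probe until it succeeds; return seconds waited, or None on timeout."""
    started = time.monotonic()
    while True:
        if probe():
            return time.monotonic() - started
        if time.monotonic() - started >= timeout_s:
            return None
        time.sleep(interval_s)


if __name__ == "__main__":
    # Placeholder URL: replace it with the endpoint of your own application.
    waited = measure_downtime(lambda: http_probe("https://example.com/health"), 120)
    print("no recovery within timeout" if waited is None else f"recovered after {waited:.1f} s")
```

Start the measurement right after triggering the stop action; a result of None means the failover did not complete within the timeout.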

Restart

Restart a computing instance.

This type of action is good for verifying that alerting works and that the high availability option works as expected, e.g. that your traffic is redirected to an alternative Azure region.

Available for Azure virtual machines, availability sets, scale sets and Azure web apps.

Delete

Delete a computing instance. Take care: this is an invasive action. You cannot recover the computing instance once it is deleted.

This type of action is good for verifying that alerting works and that the scaling rule works as expected, e.g. that a new instance is spawned when the number of instances has fallen below a threshold.

Available for Azure virtual machines, availability sets, scale sets and Azure web apps.

Now that you know about Chaos Platform, jump in and test your application.

What’s next?

In our next post, we will cover application-level failure injection. Learn how to safely inject failures without bringing a whole computing instance down. Be curious. Stay safe.

Who we are

We are Proofdock, a software tech company located in Germany that helps engineers build reliable and robust software. Check out the Chaos Platform for Microsoft Azure DevOps. Visit us on our homepage.
