Smoke, stress, spike, soak, and recovery: 5 essential load test profiles

Published in locust.cloud · 6 min read · Aug 14, 2024

Welcome back to load testing 101! Last week we covered the goals of load testing and the role of a performance tester. Part 2 is about how you run your load test: the amount of load to apply and how users arrive at your system. As always, if you have any questions, feel free to leave a comment!

This series of articles is based on my nearly 20 years of experience with performance testing in gaming, e-commerce, network infrastructure and finance. I maintain the open source load testing tool Locust, and recently started Locust Cloud so you don’t have to do the heavy lifting of setting up and maintaining load testing infrastructure.

What is a load profile?

A load profile specifies an amount of load, typically expressed in number of concurrent users or requests/user arrivals per second. It can be constant, or vary over time.
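To make that concrete, here is a minimal sketch of a load profile as plain data: a list of stages, each with a duration and a target number of concurrent users. The names `Stage` and `users_at` are made up for this illustration and are not part of Locust or any other tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    duration_s: float  # how long this stage lasts, in seconds
    end_users: int     # concurrent users targeted at the end of the stage


def users_at(stages, t, start_users=0):
    """Target concurrent users t seconds into the test.

    Load changes linearly within each stage. Returns None once
    every stage has finished, meaning the test is over.
    """
    elapsed = 0.0
    current = start_users
    for stage in stages:
        if t < elapsed + stage.duration_s:
            fraction = (t - elapsed) / stage.duration_s
            return round(current + (stage.end_users - current) * fraction)
        elapsed += stage.duration_s
        current = stage.end_users
    return None


# A 10 second ramp-up to 50 users, then constant load for 5 minutes.
ramp_then_hold = [Stage(10, 50), Stage(300, 50)]

for second in (0, 5, 10, 200):
    print(second, users_at(ramp_then_hold, second))
```

A constant profile is just a stage whose target equals the previous one, and every profile described below is a different list of stages.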

Different load profiles and when to use them

Depending on your system architecture, the expected behavior of your users and the likelihood of unexpected traffic spikes, different load profiles will be more or less important to test. The cost of an outage (lost sales, damage to your company’s reputation, legal costs) will also influence your risk tolerance and whether you need to run all of them.

Setting the bar at 6.25m is overkill for most people, as is doing every type of load profile every time. Photo by Frankie Fouganthin, license CC BY-SA 4.0

And even running every type of load profile doesn’t mean you have covered everything. A future post will go in-depth on how to build load scenarios with different user behaviors and how to make sure you set the bar appropriately there as well.

Smoke tests: Does your application work at all?

Smoke tests help you to establish baseline performance for your application when it’s under minimal load, enabling you to validate the system on a functional level and take measurements of your “best case” response times.

There’s some debate about where the term “smoke testing” originates from, but it can help to think of turning on a new piece of hardware and being relieved when it doesn’t immediately catch fire.

Apart from creating less noise, many load testing tools allow you to run a single user with more detailed logging or even in a debugger, so starting this way is helpful when validating/fixing your test scripts.

Stress tests: Does your application work with heavy traffic?

At the other end of the scale, stress testing involves subjecting your system to maximum load to understand how it functions under peak traffic. The first part of a stress test is usually a gradual increase of load, or ramp-up, to avoid hitting a cold system with heavy load out of nowhere. This gives your system a chance to warm up caches, establish network connections between systems, and so on.

Stress tests are often used to discover your system’s limits (sometimes called capacity or breakpoint testing), gradually ramping up the load to the point where your system is “saturated”: throughput flatlines or declines while response times keep climbing. See part 1 for more on finding your performance limits.
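One rough way to spot saturation in ramp-up results is to look for the first step where adding users no longer buys you meaningfully more throughput. The sketch below assumes you have recorded throughput at a few user counts; the function name, the 5% threshold and the sample numbers are all invented for illustration.

```python
def find_saturation_point(samples, min_gain=0.05):
    """Find where throughput stops growing during a ramp-up.

    samples: list of (users, requests_per_s), ordered by increasing users.
    Returns the user count just before the first step whose throughput
    improved by less than min_gain (relative), or None if throughput
    kept growing through the whole ramp-up.
    """
    for (prev_users, prev_rps), (_, rps) in zip(samples, samples[1:]):
        if rps < prev_rps * (1 + min_gain):
            return prev_users
    return None


# Hypothetical measurements from a stepwise ramp-up.
ramp_up_results = [
    (10, 9.8),
    (20, 19.5),
    (40, 38.0),
    (80, 68.0),
    (100, 70.0),   # barely any gain over 80 users
    (120, 66.0),   # throughput is now declining
]

print(find_saturation_point(ramp_up_results))  # 80
```

Real measurements are noisier than this, so treat the result as a hint about where to look rather than a precise limit.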

Stress test with a peak concurrency of 100 users and no sleep times, reaching ~70 requests/s
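Numbers like these can be sanity-checked with Little’s law, which ties together concurrency, throughput and response time. A minimal sketch, assuming a steady-state test where each simulated user has at most one request in flight (the function name is made up for this example):

```python
def avg_response_time_s(concurrent_users, requests_per_s, think_time_s=0.0):
    """Little's law for a closed system: users = throughput * (response + think).

    Rearranged to give the average response time in seconds.
    """
    return concurrent_users / requests_per_s - think_time_s


# 100 users with no sleep time producing about 70 requests/s
print(f"{avg_response_time_s(100, 70):.2f} s")  # 1.43 s
```

So with 100 users and no sleep time, roughly 70 requests/s implies an average response time of about 1.4 seconds. If your measured response times are far from that, something in the test setup deserves a second look.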

After finding the peak throughput, the most important test is of course to validate ‘high but reasonably expected’ load, based on your requirements. Determining this requirement depends very much on your acceptable level of risk, and there are ways to guesstimate requirements if you don’t know where to start (we’ll explore these in an upcoming post).

Spike tests: Does your application work with sudden heavy traffic?

Spike tests are similar to stress tests, except spike tests have a very short ramp-up time to simulate a sudden spike or surge in traffic (like the Glastonbury Festival or Taylor Swift examples from part 1). Spike tests can show you how your system performs in the event of a rush of traffic. You might not be in the business of selling highly coveted concert tickets, so spike testing may not be the most important for you. Still, as one individual learned, anyone can be subject to a DDoS attack, so it’s never a bad idea to be prepared for unprecedented traffic.

A test rapidly going from 1 to 100 concurrent users and back again, in two 60 second bursts, implemented using Locust’s custom load shapes. JMeter-plugins offers a similar feature called Ultimate Thread Group.
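The logic behind a shape like this is small. Here is a sketch of it as a plain Python function; the exact burst windows and total test length are my assumptions, since only the 1 to 100 user range and the two 60 second bursts are given above.

```python
BASE_USERS = 1
SPIKE_USERS = 100
# (start_s, end_s) of each 60 second burst; the timing is an assumption
SPIKES = [(60, 120), (180, 240)]
TEST_LENGTH_S = 300


def spike_users(run_time_s):
    """Target user count at run_time_s, or None when the test should stop."""
    if run_time_s >= TEST_LENGTH_S:
        return None
    for start, end in SPIKES:
        if start <= run_time_s < end:
            return SPIKE_USERS
    return BASE_USERS


# In a locustfile, a function like this could back a custom load shape,
# roughly along these lines:
#
#   from locust import LoadTestShape
#
#   class SpikeShape(LoadTestShape):
#       def tick(self):
#           users = spike_users(self.get_run_time())
#           # tick() returns (user_count, spawn_rate), or None to stop
#           return None if users is None else (users, SPIKE_USERS)

for second in (0, 60, 120, 200, 300):
    print(second, spike_users(second))
```

Setting the spawn rate as high as the peak user count is what makes this a spike rather than a ramp-up: Locust will try to start all the users within about a second.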

Spike tests are extra important if your system uses some form of auto-scaling, to ensure the delay in scaling up is not too big.

Soak tests: Does your application work with extended heavy traffic?

Soak testing or endurance testing validates whether your website or app functions when subjected to its load limit for an extended time. Soak testing is critical for applications that have to keep a lot of data in memory: if your application for some reason needs to keep track of what happened for the last two hours, running a load test for 15 minutes won’t tell you whether it will function properly with two hours of data in memory. Similarly, if the system under test has some form of queuing, it will need to keep queue data in memory reliably, even with a longer period of heavy traffic, making soak testing important.
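A simple thing to watch during a soak test is whether memory usage trends upward over time. The sketch below fits a straight line through periodic memory samples and reports the growth per hour. The function name and the sample values are invented for illustration, and how you collect the samples depends entirely on your monitoring setup.

```python
def growth_per_hour(samples):
    """Least-squares slope of memory usage, in MB per hour.

    samples: list of (elapsed_s, memory_mb) taken during the test.
    """
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    covariance = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    return covariance / variance * 3600


# Hypothetical samples taken every 30 minutes over a two hour soak test.
memory_samples = [(0, 500), (1800, 550), (3600, 600), (5400, 650), (7200, 700)]

print(f"{growth_per_hour(memory_samples):.0f} MB/hour")  # 100 MB/hour
```

Steady growth like this would never show up in a 15 minute test, but it tells you roughly how long the system can run under load before memory becomes a problem.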

Some other common issues exposed by longer tests are disk space, database checkpoint times, etc. Be prepared though: occasionally you’ll end up inadvertently testing things you don’t actually want to test, like overflowing your disks in a test environment. Even for extended tests, it’s important to keep an eye on the running test, because if there is an error, you might end up also causing side effects (like massive amounts of error logging leading to overflowing disks etc.).

Not everyone will benefit from soak testing: If you have a webshop for example, it’s not necessarily going to suffer worse performance just because you had a spike in transactions five minutes earlier. There’s also likely to be a seasonality to your customer behavior which you should take into account when considering doing soak testing. If your traffic is focused around business hours, or if you’re a betting site, and you know that people will be placing their bets just before the match (rather than consistently doing so over 24 hours), soak testing with high load for a whole day shouldn’t be your first priority.

Recovery testing: Does your application recover from extreme traffic or a temporary glitch?

This type of testing helps you establish how your system recovers from crashes or failures. You apply more load than the system can handle, resulting in errors and potentially building up queues in places, then lower the load to a level that is known to work. This way you can observe if the system recovers rather than getting stuck in a broken state.
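Whether the system “recovered” is easier to judge if you decide on a criterion up front. A minimal sketch, assuming you record the error rate per measurement interval; the function name, the 1% baseline and the sample data are all hypothetical.

```python
def recovered(error_rates, overload_end_index, baseline=0.01, settle_samples=3):
    """Check whether errors went back to normal after the overload phase.

    error_rates: error rate (0.0 to 1.0) per measurement interval.
    overload_end_index: first interval after the load was lowered.
    Returns True if the last settle_samples intervals are all at or
    below baseline.
    """
    after_overload = error_rates[overload_end_index:]
    if len(after_overload) < settle_samples:
        return False  # not enough data to tell
    return all(rate <= baseline for rate in after_overload[-settle_samples:])


# Hypothetical runs: load is lowered starting at interval 4.
healthy = [0.0, 0.0, 0.4, 0.6, 0.2, 0.05, 0.0, 0.0, 0.0]
stuck = [0.0, 0.0, 0.4, 0.6, 0.5, 0.5, 0.5, 0.5, 0.5]

print(recovered(healthy, overload_end_index=4))  # True
print(recovered(stuck, overload_end_index=4))    # False
```

The same check works for response times or queue lengths: what matters is that they return to their pre-overload levels once the load does.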

The best type of load test

These are the main load profiles you typically need to run, but the focus you give the different types depends on the specifics of your application. Just remember, the best type of load test is the one that actually gets done. Running at least a rudimentary stress test is a huge improvement on not running any load tests at all.

One way of ensuring that load testing gets done is by making it easier to get started. Locust Cloud gives you access to hosted, easily scalable, and distributed load generation, as well as advanced reporting — all while preserving the flexible “it’s just Python” approach to load test scripting that Locust provides.


Lars Holmberg @ locust.cloud / LinkedIn