How to write fast software
Premature optimisation is the root of all evil
TL;DR
Fast software means software that is fast enough to solve the problem it was meant to solve. As a result:
Only optimise a piece of software to the point where it is fast enough for the business.
To reach that level of performance, apply the following:
- Write your code in a way that prioritises clarity
- Is it fast enough for the business?
- If it is fast enough, good, stop.
- If it is not fast enough, find the bottleneck
- Weigh up your options, then fix the bottleneck
- Go back to “Is it fast enough for the business?”
The key to this approach is not trying to predict where performance bottlenecks will occur and fix them pre-emptively. Instead:
Deploy your software, then find the real bottleneck.
What does “fast” mean?
“Fast” describes the amount of time it takes for software to perform some operation.
Importantly, fast is a relative term: there needs to be something else that the code is fast in comparison to.
Saying that a piece of software is fast in isolation does not make sense. It might execute in a set period of time, say 3 seconds. Is that fast? I don’t know; what is the context?
Fast for a webserver response? Not really.
Fast for a batch job? Yeah, probably.
Defining fast is then a case of deciding what the software is being compared against. Most of the time, this means comparing the time the software takes with what the business expects.
Fast for a human-facing webserver response might be less than 100ms.
Fast for a batch job might be finishing in less than 5 minutes.
Fast for a complex business process with human steps might be less than 48 hours.
The critical thing here is that “fast” means “fast enough to solve the business problem”.
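One way to make this concrete is to write the expectation down as a number. The sketch below does that for the three examples above. The names `TARGET_SECONDS` and `is_fast_enough` are made up for illustration, and the targets are just the example figures from this section, not recommendations.

```python
# Illustrative targets taken from the examples above, in seconds.
# The names are hypothetical; use whatever your business actually expects.
TARGET_SECONDS = {
    "web_response": 0.1,               # human-facing webserver response: 100 ms
    "batch_job": 5 * 60,               # batch job: 5 minutes
    "business_process": 48 * 60 * 60,  # process with human steps: 48 hours
}


def is_fast_enough(context: str, elapsed_seconds: float) -> bool:
    """Return True if the elapsed time is under the target for this context."""
    return elapsed_seconds < TARGET_SECONDS[context]


# The same 3 seconds gets a different answer depending on the context.
print(is_fast_enough("web_response", 3.0))  # False
print(is_fast_enough("batch_job", 3.0))     # True
```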
The optimisation loop
Once there is a definition of fast, the following process can be used to achieve the goal of “fast software”.
- Write your code in a way that prioritises clarity
- Is it fast enough for the business?
- If it is fast enough, good, stop.
- If it is not fast enough, find the bottleneck
- Weigh up your options, then fix the bottleneck
- Go back to “Is it fast enough for the business?”
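The loop above can be sketched in a few lines of Python. This is only a sketch under stated assumptions: `measure_seconds` and `fix_largest_bottleneck` are hypothetical callables standing in for your real monitoring and your real engineering work, and `max_rounds` is a safety valve I have added so the example cannot loop forever.

```python
from typing import Callable


def optimisation_loop(
    measure_seconds: Callable[[], float],
    fix_largest_bottleneck: Callable[[], None],
    target_seconds: float,
    max_rounds: int = 10,
) -> int:
    """Fix one bottleneck at a time until the measurement meets the target.

    Returns the number of fixes that were needed.
    """
    fixes_applied = 0
    # "Is it fast enough for the business?"
    while measure_seconds() > target_seconds:
        if fixes_applied >= max_rounds:
            # Safety valve (an assumption of this sketch): give up rather than loop forever.
            raise RuntimeError("still not fast enough; revisit the target or the design")
        # "Find the bottleneck, weigh up your options, then fix it."
        fix_largest_bottleneck()
        fixes_applied += 1
    # Fast enough: good, stop.
    return fixes_applied


# Toy stand-in for a real system: seconds spent in each part of a request.
time_spent = {"database": 2.0, "network": 0.5, "cpu": 0.2}


def measure() -> float:
    return sum(time_spent.values())


def fix_largest() -> None:
    worst = max(time_spent, key=time_spent.get)
    time_spent[worst] /= 10  # pretend the fix made that part ten times faster
```

With a target of one second, `optimisation_loop(measure, fix_largest, target_seconds=1.0)` applies a single fix (to the database) and then stops, leaving the network and CPU parts untouched because nobody needed them to be faster.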
That’s the basic process, so let’s look at each point in a bit more detail.
Write your code in a way that prioritises clarity
I’m a fan of writing code in as simple and clear a way as possible. On the first pass of a piece of code, I tend to prioritise clarity over speed. This comes from my experience of software usually being more than fast enough, but rarely clear or simple enough for the business.
This is probably the largest step; it covers implementing the entire feature. Everything that comes after it is about analysing and optimising the code written here.
Is it fast enough for the business?
This is the critical question. Not “is it fast enough for you?”, nor “is it fast enough for Google / Netflix / Amazon etc.?”. The software needs to be fast enough to solve the problem it is trying to solve. Beyond that, time spent making it faster is time that could have been spent doing something more useful.
Most of the time the business will not tell you how fast something needs to be. Asking “how fast should this be?” is also a dangerous game, because the answer is likely to be “as fast as possible”. Making something as fast as possible is pulling on a thread that will lead you down a very deep hole.
So, given that you’re unlikely to get a very clear instruction of how fast something needs to be for the business, how do you know if something is fast enough for the business?
It depends. The easiest way to find out is to release the software: if nobody complains, it is probably fine. This is not an ideal situation, but it is the way a lot of software is evaluated. Put it out, it solves the problem, nobody complains about the speed, it’s fast enough, job done.
It is possible to be more proactive in finding out if something is fast enough. If you are writing a website, users often will not tell you whether the page loads fast enough; they will just leave. This is where something like Google’s PageSpeed Insights can be helpful. It will give you an idea of how fast the page needs to be.
When trying to work out how fast you need to be, it is worth considering the number of users and the time sensitivity of their tasks. As the number of users increases, the speed of the software typically matters more because it affects more people.
- Is it a script being written for one user that doesn’t mind waiting 30 seconds?
- Is it a webserver that is serving a few hundred users?
- Are those users doing something that is time sensitive, like talking to a customer on the phone or trying to bid against other people in real time?
- Is it a batch process for a large government agency that doesn’t mind if it takes a week?
- Is it a hedge fund where every nanosecond of latency costs money?
Finding out how fast something needs to be can be quite difficult. Most of the time people won’t bother; they will release and see if it is fast enough. If that is the approach, then at least have some form of monitoring that shows how fast the software actually is.
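As a rough idea of what the bare minimum of monitoring could look like, here is a sketch that times every call to a function and summarises the results as a percentile. It is not a replacement for a proper monitoring tool. `timed`, `percentile`, `request_durations` and `handle_request` are all names invented for this example, and in a real service the samples would go to a metrics system rather than a list in memory.

```python
import functools
import math
import time
from typing import Callable


def timed(samples: list[float]) -> Callable:
    """Decorator factory: append each call's duration in seconds to `samples`."""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the duration even if the call raised.
                samples.append(time.perf_counter() - start)
        return wrapper
    return decorator


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples recorded yet")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


request_durations: list[float] = []


@timed(request_durations)
def handle_request(payload: str) -> str:
    # Hypothetical handler standing in for real work.
    return payload.upper()
```

After some traffic, `percentile(request_durations, 95)` gives the time that 95% of calls finished within, which can then be compared against whatever “fast enough” means for the business.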
If it is fast enough, good, stop
The goal of this process is to write software that is fast enough, but no faster than necessary. Once the software is fast enough, stop, and move on to writing something else that is going to provide value.
I’m not saying that we slow software down deliberately so that it is just barely fast enough, only that if the software ever gets to the point where we can stop working on it, we take that opportunity.
Optimising software for performance is a process that can eat as much time as you want to dedicate to it. The goal is to not dedicate more time to it than is needed, because that is time that could have been spent working on something else. Engineering time is expensive; you are more valuable when you are working on something that brings the business more value.
If it is not fast enough, find the bottleneck
If the software that has been written is not fast enough to solve the problem it is intended for, then we are in the realm of performance optimisation.
Any piece of software will have a bottleneck somewhere; if it didn’t, it would finish in the same instant that it started. The goal here is to find the current largest bottleneck in the piece of software.
Finding bottlenecks can be quite tricky, so I’d recommend getting some help: use tools that are designed for debugging performance issues. In Java, there are a bunch of tools that hook into a running Java process and give all kinds of useful information. If you are writing some form of webserver, there are tools like New Relic (other tools are available) that can provide breakdowns of where you are spending the most time in your code.
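As a small illustration of the kind of breakdown these tools give, here is a minimal sketch using Python’s built-in `cProfile` and `pstats` modules, rather than the Java or webserver tools mentioned above. `profile_call` and `slow_sum` are names made up for this example; the report format is whatever `pstats` prints.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top: int = 5, **kwargs):
    """Run func under the profiler; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


def slow_sum(n: int) -> int:
    # Hypothetical workload standing in for the code you suspect is slow.
    total = 0
    for i in range(n):
        total += i * i
    return total


result, report = profile_call(slow_sum, 100_000)
print(report)
```

The report lists functions by the time spent in them, which is the evidence you want before deciding what to fix. It only covers CPU time inside one Python process, so database, network and configuration bottlenecks need other tools.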
There are a lot of different places a bottleneck can hide, for example:
- Database — is there a particularly hefty query? Does it have appropriate indexing?
- IO — are you trying to read enormous files from the file system? Is it an SSD or a spinny disc?
- Network — are you contacting a server a long way away? Sending large amounts of data across the network? Making a lot of round trips to some external service?
- Application configuration — how many incoming / outgoing requests can your threadpool handle concurrently? Is the app using recommended production settings? Are you using the values that came as default?
- CPU — are you consistently topping out your CPU at 100%? Does your application have the capability to utilise all the CPUs that have been provided to it?
- Memory — are you running out of memory? Are you filling up one of the memory generations in particular (for garbage collected languages)?
These are just some of the very broad range of possible bottlenecks. There are so many possibilities, and they can be so complex, that trying to predetermine which of them is going to be the bottleneck is an almost impossible task.
Only try to identify the bottleneck once you know you need to find one.
This is partly why premature optimisation is the root of all evil: trying to predict and fix bottlenecks makes the code more confusing, and the bottleneck that gets fixed is probably not the largest one, rendering the premature fix mostly useless.
Weigh up your options and fix the bottleneck
So the largest bottleneck has been identified. It could be a database issue, networking, IO; it doesn’t matter. The next step is deciding how to fix it.
Once the problem is clearly identified as a bottleneck, there are probably going to be a few different options on how to fix it.
It could involve adding some caching or indexing, changing the infrastructure, or updating the code to make fewer requests.
The key here is that the conversation about how to fix the bottleneck only occurs once it has been identified as the largest bottleneck. That conversation happens in the wider context of the piece of software, where things like changing the infrastructure are possibilities.
This is in contrast to the majority of performance improvement discussions, which happen in reviews and concern perceived bottlenecks that get solved in a very local way.
Once a solution has been decided on, make the fix, release it, monitor the impact and verify that it actually resolved the bottleneck.
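To show what “make the fix, then verify it” can look like, here is a sketch of one of the options mentioned above: caching. It assumes the bottleneck turned out to be repeated identical lookups against a slow backend, and that serving a slightly stale value is acceptable. `fetch_price`, `fetch_price_cached` and `count_backend_calls` are hypothetical names, and the “backend” is just a counter.

```python
import functools

backend_calls = 0


def fetch_price(product_id: int) -> float:
    """Hypothetical slow lookup, standing in for a database query or network call."""
    global backend_calls
    backend_calls += 1
    return product_id * 1.5


@functools.lru_cache(maxsize=1024)
def fetch_price_cached(product_id: int) -> float:
    # Assumption: prices change rarely enough that a cached value is acceptable.
    return fetch_price(product_id)


def count_backend_calls(lookup, product_ids) -> int:
    """Run the lookups and report how many of them reached the backend."""
    global backend_calls
    backend_calls = 0
    for product_id in product_ids:
        lookup(product_id)
    return backend_calls
```

For the ids `[1, 2, 1, 1, 2, 3]`, `count_backend_calls(fetch_price, ...)` reports 6 backend calls and `count_backend_calls(fetch_price_cached, ...)` reports 3. In production the equivalent check is comparing your monitoring before and after the release.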
Back to “Is it fast enough for the business?”
With one bottleneck down, the next step is not to find the next bottleneck. First, we need to ask whether fixing the most recent bottleneck made the application fast enough for the business.
If it did not, we go back round the loop, fixing bottlenecks until the software is fast enough, and then we stop.
Optimising people
Not every business problem needs to be solved with technology. Suppose there is some process that involves human steps and takes 48 hours, where waiting for people makes up 46 of those hours and technology makes up the remaining 2. The technology part of the process is probably not the bottleneck that needs solving.
However, the same process can be applied (roughly speaking) to people and processes. If you don’t know what the process is, then a good first step in speeding it up is probably understanding it and then standardising it.
Once the process is defined, ask if it is fast enough, and go round the loop of finding bottlenecks and fixing them until it is fast enough. Fixing bottlenecks in human processes might involve automation, or it might just involve giving someone a second screen or better training.
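The arithmetic behind the 48-hour example is worth spelling out. The sketch below uses the figures from that example; `total_after_speedup` is a helper invented for illustration, and “twice as fast” is just an arbitrary improvement to compare like with like.

```python
def total_after_speedup(hours_by_part: dict[str, float], part: str, factor: float) -> float:
    """Total duration if one part of the process becomes `factor` times faster."""
    improved = dict(hours_by_part)
    improved[part] = improved[part] / factor
    return sum(improved.values())


process = {"waiting_for_people": 46.0, "technology": 2.0}

# Making the technology twice as fast saves one hour out of 48.
print(total_after_speedup(process, "technology", 2))          # 47.0
# Halving the time spent waiting for people saves 23 hours.
print(total_after_speedup(process, "waiting_for_people", 2))  # 25.0
```

Even making the technology infinitely fast could never get the process below 46 hours, which is why the largest bottleneck is the one worth fixing.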
Summary
The key thing I’d like to get across from this story is:
Only optimise a piece of software to the point where it is fast enough for the business.
Anything beyond that point is a waste of your expensive engineering time.
There are always bottlenecks, but the bottleneck in your software is probably not something you can predict, so do not try to predict it.
Deploy your software, then find the real bottleneck.
About the author
Hi, I’m Doogal. I’m a Tech Lead who has spent a bunch of years learning software engineering from several very talented people. When I started mentoring engineers I found there was a lot of knowledge that I was taking for granted that I think should be passed on. These stories are my way of trying to pay it forward.