How to Benchmark a Database

Page against the machine
4 min read · Dec 6, 2021


This article will lead you to a couple of examples of correctly benchmarking database speed. Before that, I’d like to explain why, if you have to ask how to measure database speed, you probably shouldn’t try because you would be wasting your time.

When I was much younger, we used to buy and play a trading card game called Top Trumps. Each card in the deck would describe something, notionally in the same category, along with half a dozen metrics. For example, the “Fast cars” deck might have Top Speed, 0–60 time, Horsepower, Torque, Price and number of Seats.

Each round, one player chooses a metric, and whoever holds the highest value wins both cards, as that card is deemed “better”.

I’m going to use some hypothetical cards from this deck to illustrate database benchmarks. I will let you, dear reader, decide which car best describes which database, but when I mention a car, think of a database.

Suppose we want to benchmark one or more of the following “fast cars”.

We have a range of options for our fast car, all with impressive published specifications. So we come to our two questions: how fast is any given car in reality, and how do two or more compare with each other?

We are also asking because we are considering actually driving one of them, not out of idle curiosity or to leave it sitting in the garage.

Here is the first problem you face picking the fastest car. You are not a professional racing driver so the question you are answering is not how fast will it go, but how fast can you drive it. In a few environments the F1 car will be fastest but a normal driver will not get an F1 car to move at all, never mind quickly. For databases, the fact you want to know how to benchmark one suggests you are not a professional database benchmarker either.

The next problem is the environment. If we compare our cars on a straight-line, quarter-mile drag strip, we might have a clear winner. Driving round a racing circuit, we might get a slightly different answer. If our route is over a wide variety of terrain, and let’s be honest, many databases have to do a bunch of different things, then those designed for smooth tarmac may not make it at any speed, where the Impreza will still be fast. Resilience matters here too: Imprezas have been known to win rallies with one wheel missing!

Let’s be realistic, though: unless your use case is very specialised (they seldom are), you just want to use your car like everyone else, i.e. drive on normal roads with typical traffic. So is the question better phrased as: which of these cars will be fastest for your daily commute?

The answer is that every one of those cars that is road legal will be plenty fast enough, far faster than you will likely ever drive it. Other factors are what you need to consider, not top speed: how well does it handle bad conditions, can it carry enough people, and is it easy and ideally fun to drive?

To summarise, measuring how fast a car drives is meaningful only as long as you know the route you plan to drive and your ability to drive it. For a normal drive, speed will be about the same whichever car you pick, in which case comfort, safety and even fun should be considered.

The same applies to databases: speed depends very much on your use case. A drag race is probably not what you want to be testing. The performance you attain also depends on your skill and on how easy the database is to ‘drive’, so think really hard about what you are measuring, why, and whether it would be easier to just watch one being driven by someone experienced.
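If you do decide to measure something yourself, the point about knowing your route can be sketched as a tiny harness that times a weighted mix of operations rather than a single drag race. This is only an illustration under stated assumptions: the function name run_workload, its parameters, and the in-memory dictionary standing in for a database are all hypothetical, and a real test would call your own driver with your own queries and data volumes.

```python
import random
import statistics
import time
from typing import Callable, Dict, List


def run_workload(ops: Dict[str, Callable[[], None]],
                 weights: Dict[str, float],
                 iterations: int = 1000,
                 seed: int = 42) -> Dict[str, Dict[str, float]]:
    """Run a weighted mix of operations and summarise latency per operation.

    `ops` maps an operation name to a zero-argument callable (hypothetical
    stand-ins for your real queries); `weights` gives the share of each
    operation in the mix, i.e. the "route" you actually plan to drive.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs use the same mix
    names = list(ops)
    mix = [weights[name] for name in names]
    samples: Dict[str, List[float]] = {name: [] for name in names}

    for _ in range(iterations):
        name = rng.choices(names, weights=mix)[0]
        start = time.perf_counter()
        ops[name]()
        samples[name].append(time.perf_counter() - start)

    summary: Dict[str, Dict[str, float]] = {}
    for name, values in samples.items():
        if not values:
            continue  # this operation was never picked
        values.sort()
        summary[name] = {
            "count": len(values),
            "median_ms": statistics.median(values) * 1000,
            # simple nearest-rank estimate of the 95th percentile
            "p95_ms": values[int(0.95 * (len(values) - 1))] * 1000,
        }
    return summary


# Stand-in "database": a plain dict, used here only so the sketch runs.
fake_db: Dict[str, int] = {}
demo_ops = {
    "read": lambda: fake_db.get("key"),
    "write": lambda: fake_db.update(key=1),
}
demo_result = run_workload(demo_ops, {"read": 0.9, "write": 0.1}, iterations=500)
```

Reporting a median and a tail percentile per operation, rather than one headline number, is a deliberate choice: it keeps the commute in view instead of the top speed. Numbers from a harness like this say something only about the mix you fed it, on the machine you ran it on.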

Now, that’s not to say you can’t benchmark databases. Mark Callaghan over at http://smalldatum.blogspot.com/ does a fantastic job of database benchmarking. He’s an expert in all things database and he benchmarks MySQL against previous versions of MySQL because, as a wise person once said, “The value of a benchmark is inversely proportional to the number of different databases tested.”

If you want to compare the performance of two MongoDB configurations, perhaps to compare a self-managed deployment with a MongoDB managed service like Atlas, then I know of nothing better than POCDriver.

Whatever database you choose will almost certainly be fast enough for your use case as long as you use and size it correctly. If you have a big data use case, it probably needs to support horizontal scaling too. Instead of worrying about measuring speed, look at developer productivity, resilience, ease of use and, yes, fun, if you want to measure what really matters.


John Page is a document database veteran who, after 18 years building full-stack document database technologies for the Intelligence community, joined MongoDB.