When Linear Scaling is Too Slow — Build for fast hardware.

Paige Roberts
3 min read · Feb 9, 2024


Agenda slide: 1. What is the worst strategy to get performance at scale? 2. Useful strategies for achieving high performance at extreme scale. 3. A practical example of these strategies in use. 4. Takeaways, next steps, and Q&A.
Agenda for the Data Day Texas presentation on strategies for performance at extreme scale.

(Skip the first paragraph if you’ve already read the earlier posts in this series. Jump down to “Start”)

This is part 5 of my series on strategies for high performance data processing at extreme scale, based on a talk I gave at Data Day Texas 2024. The first post covered the one strategy that should be your last resort, but is usually the first, and sometimes the only, strategy in software designed for high scale data processing. The second post launched into the first good strategy, workload isolation. The third post focused on building a foundation that starts with true linear scaling: a shared nothing architecture. The fourth post covered changing the way you store data to go beyond linear scaling to reverse linear scaling with aggressive data compression.

The main question the talk sought to answer was: What strategies do cutting edge database technologies use to get eye-popping performance at petabyte scale?

Start

All of the strategies I've covered in previous posts have been focused on spinning disk, mainly strategies used by Vertica, an analytical database built for commodity hardware. Normal query response times for Vertica are in the hundreds-of-milliseconds-to-seconds range, even at high scale. That's good for a standard analytical database, but for many requirements, it's far too slow.

That's especially true for the high-volume transactional requirements that systems like Aerospike routinely meet. For many of those use cases, anything that takes longer than a microsecond is too slow. The strategies I discussed earlier optimize spinning disk I/O, but in the end, a spinning disk can only go so fast.

A few years back, the idea that solid-state drives (SSDs), aka flash drives, could be used in a whole data center was still pretty radical. But Aerospike needed a way to take transactional databases to the next level, and flash drives held the key. So, they built an entire database on top of this new technology.

Build your software to run on the fastest available hardware.

They didn't just design Aerospike to run well on that hardware; they focused on squeezing out every ounce of speed the hardware could give them. By jumping on a hardware breakthrough early, and by working closely with the SSD data center vendors, they were able to exploit every advantage the hardware provided.

Back in the day, SSD systems were pretty pricey, but over time, the price has come down to the point where it's competitive. SSD data centers have gone from a crazy concept to a common occurrence. You can spin up SSD-based instances on public cloud providers now. My laptop has an SSD.

To squeeze out even higher levels of performance, Aerospike bypassed a lot of the standard software layers that exist mostly for the convenience of humans. Their software goes straight to the hardware, right past the operating system's file system machinery.

Graphic comparing Aerospike's habit of going straight to the hardware layer with most databases' habit of going through the operating system's file system, page cache, and block interface before getting to the hardware.
Talk directly to the hardware, versus talking to the operating system, file system, page cache, and block interface before you get to the hardware.
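Aerospike's actual storage engine is proprietary, so take this as an illustration only: a minimal Python sketch of the core idea, opening a file with Linux's O_DIRECT flag so a write bypasses the kernel's page cache instead of going through the usual buffered path. The function name, fallback behavior, and block size here are my own assumptions, not Aerospike's code.

```python
import mmap
import os
import tempfile

# O_DIRECT requires the buffer address and I/O size to be aligned to the
# device's block size (commonly 512 bytes or 4 KiB). An anonymous mmap
# gives us page-aligned, zero-filled memory, which satisfies that.
BLOCK = 4096

def write_direct(path, payload: bytes) -> int:
    """Write one block with the page cache bypassed, if the filesystem allows it."""
    buf = mmap.mmap(-1, BLOCK)            # page-aligned buffer
    buf[:len(payload)] = payload
    flags = os.O_WRONLY | os.O_CREAT
    direct = getattr(os, "O_DIRECT", 0)   # O_DIRECT is Linux-only
    try:
        fd = os.open(path, flags | direct)
    except OSError:                       # some filesystems (e.g. tmpfs) refuse O_DIRECT
        fd = os.open(path, flags)
    try:
        return os.write(fd, buf)          # one aligned, block-sized write
    finally:
        os.close(fd)
        buf.close()

path = os.path.join(tempfile.mkdtemp(), "record.bin")
written = write_direct(path, b"key=42")
print(written)  # 4096: a full block, headed straight for the device
```

The alignment bookkeeping is exactly the kind of human-inconvenient detail the page cache normally hides, which is why bypassing it is something you design a whole storage engine around rather than bolt on later.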

It’s another major strategy worthy of its own post:

Integrate tightly with the hardware with nothing between your application and bare metal.

By building the software to take full advantage of faster hardware, Aerospike made unparalleled speed not just possible, but commonplace. There's a lot more to it, of course; I'll dive deeper into some of their other smart optimizations in future posts.

What did all this hardware optimization give them? A sub-microsecond SLA for both writes and queries is normal for an Aerospike database, even at petabyte scale.

But as most folks know, there's a hierarchy of speed in hardware, with spinning disk at the slow bottom. SSD is orders of magnitude faster. Even faster than that is RAM. So, what about doing everything in-memory as a strategy?

Read on to the next post, where we dive into the advantages and disadvantages of in-memory processing, and how they get balanced in practice.


Paige Roberts

27 yrs in data mgmt: engineer, trainer, PM, PMM, consultant. Co-author of O'Reilly's "Accelerate Machine Learning" and "97 Things Every Data Engineer Should Know."