Scale: A Horror Story
You’ve seen it before: a single application consuming an insane amount of memory or CPU. You’ve probably also heard some variation of the following for years, often from software vendors with poorly optimized code: hardware is a cheaper fix than software.
It sounds good, until you find yourself looking at a server with a JVM process running with 100+ GB of RAM and consuming huge amounts of CPU. Of course, this needs to be clustered, so what you actually end up with is multiple servers, each with hundreds of gigabytes of RAM, all with high CPU requirements.
Commodore’s marketing once suggested that 64 KB of memory was all anyone would ever need. While that idea was shortsighted, the scaling happening today in the enterprise software space is out of control. On top of all the fraught practices around scaling, consider whether disaster recovery is needed. If so, you not only have an ecosystem that is costly and out of control, but one that is replicated as well.
The horror of scale is felt by many at any given organization. It’s felt by the engineers who must be added to support the growing software, and it’s felt by the company’s IT infrastructure (and budget). Simply put, not enough software vendors and IT companies are looking at their solutions and considering the Total Cost of Ownership (TCO).
What companies should be asking themselves is, “What is this software really costing us, when considering hardware, licenses, support, training, and time?”
Don’t Containers Address the Horror?
At this point, many professionals will say, “This is why we need Docker! Docker will fix the problem.”
The reliance on containerization as a cure-all is a real problem with software vendors today. Docker became popular, so vendors started sticking software in Docker, adding more and more layers of tech without actually building software that matched the technology stack being used. Sure, Docker is amazing, but it’s not a fairy godmother that will make birds sing and magically transform software into something beautiful. To reap the benefits promised by containers, software must be engineered and architected for them.
A few years ago, a Dilbert cartoon depicted a boss asking Dilbert why moving to the cloud and using containers didn’t solve ALL of the software’s problems. Dilbert explained that you can’t just stick software in a container without architecting it to be cloud- or container-native. The boss was under the impression that he could just say techy words like “Docker” and “Kubernetes” and solve problems. He was wrong.
“Adding technology doesn’t magically solve architecture problems.”
— Jason Tesser
When discussing scale, there are two types we talk about: vertical scale and horizontal scale. Scale is essentially how software adds resources to address demands like user load, complex logic, larger data sets, and performance requirements. We can all agree that CPU, memory, and I/O need to be added; how we add those resources is what drives the TCO and the maintainability of the software. TCO and maintainability should be at the core of every IT shop across the globe.
Vertical scale is when resources are added to a server to help the application perform better or hold more data. Horizontal scale is when you add more nodes or installs to the application without adding memory, CPU, or I/O to the individual installs.
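The contrast shows up directly in deployment configuration. Here is a minimal, hypothetical Docker Compose sketch (service and image names are illustrative, not from any real system) of the two approaches side by side:

```yaml
services:
  # Vertical scaling: one big install, fed ever more resources.
  app-vertical:
    image: example/app:latest   # hypothetical image name
    deploy:
      resources:
        limits:
          cpus: "16.0"          # throw more CPU at a single node
          memory: 128G          # and more RAM

  # Horizontal scaling: many small installs instead of one big one.
  app-horizontal:
    image: example/app:latest
    deploy:
      replicas: 4               # add nodes...
      resources:
        limits:
          cpus: "2.0"           # ...each with modest CPU
          memory: 4G            # ...and modest memory
```

The horizontal variant only works, of course, if the application was architected so that instances can share load without each one needing the full memory footprint.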
Take one such legacy system: while it was possible to scale part of it, like the index, there were just too many issues because the system was built on 30-year-old monolithic thinking. So while more nodes could be added to the cluster to scale parts of the system, at the end of the day every install still required huge amounts of memory and CPU to run. The more nodes added, the worse the scaling horrors became. And the complexity was only the start of the issue. The real horror? The cost of running this type of system.
Ending the Horror
The solution is to understand the software being used and to engage with vendors who are focused on the TCO associated with running large-scale software. While containers aren’t always the answer, understanding the real cost of scaling software is.
For example, if part of the system encounters heavy traffic or load, IT should be able to add more cloud nodes or servers to that discrete part without having to throw more memory and CPU at the problem, blindly praying it will just go away. Don’t just throw the software in Docker. Adding technology doesn’t magically solve architecture problems.
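In a container-native architecture, that kind of targeted scaling is just a configuration change on the one hot component. A hypothetical Kubernetes Deployment fragment (the `search` name and image are illustrative assumptions) might look like:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: search                # hypothetical: the one overloaded component
spec:
  replicas: 6                 # scale out only this service...
  selector:
    matchLabels:
      app: search
  template:
    metadata:
      labels:
        app: search
    spec:
      containers:
        - name: search
          image: example/search:latest   # illustrative image name
          resources:
            requests:
              cpu: "1"        # ...while each node stays small
              memory: 2Gi
```

The rest of the system keeps its existing footprint; only the discrete part under load gets more nodes, which is exactly the property a monolith cannot offer.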
By Jason Tesser