Save the Planet with Better Software — Vertica Analytics for ESG

Paige Roberts
5 min read · Apr 21, 2023
[Image: blue butterfly in a glass globe held on fingertips in a forest]

Climate change fears and concern for the environment have escalated in recent years, making Earth Day an even more important holiday. I write posts about endangered butterflies, live on 40 acres in Texas, and am doing my best to encourage wildlife resurgence in my corner of the world. That’s one of the reasons I have been very proud to speak for Vertica™ by OpenText. Whether you call it Green IT or ESG (Environmental, Social, Governance), every responsible company in every industry is striving to change how it operates and do its part to save this planet. The compute industry has been focusing on this concern for a long time. Co-location data centers have strived for years to go from giant energy hogs to zero carbon impact.

A lot of people don’t think that software like Vertica, now a part of OpenText Analytics and AI, has any contribution to make. But they’d be wrong. I’m going to make a wild claim. And then I’m going to back it up with some facts.

If every company that does big data analytics now switched to Vertica, we could cut energy usage for analytics globally in half.

How do Vertica customers help the environment?

First, a lot of climate problems can be helped with analytics, which is Vertica’s specialty. Our customers use Vertica with smart grids to conserve energy in cities, smart agriculture to optimize crop yields, route optimization to conserve fuel, and a wide variety of analytical projects that improve the health of our modern world. But this is how our customers use Vertica in their businesses. Vertica powers the analytics, but the companies are the ones putting that power to work to help the planet.

Customers have also been asking Vertica to support ARM processors and data centers that use less energy than their competitors. The current version of Vertica has OEM support for ARM, and general support for ARM in beta. This is already having an impact with the many companies who embed Vertica analytics in their applications, and will undoubtedly have a broader impact in the next major version release.

How does Vertica help the environment?

Supporting more efficient compute hardware is certainly a step in the right direction, but the biggest impact Vertica has on the global environment is that it’s designed to efficiently use any compute power it is given. Since it’s a cluster-based application, Vertica folks talk a lot about operating on data at scale. The thing is, analyzing data at scale isn’t the real trick. Doing it rapidly isn’t even the hardest part, even though you constantly hear about performance.

Analyzing data at scale rapidly and efficiently is the trick.

Efficiency means doing more with less.

This should be the motto of the climate conservation movement. As a society, we waste so much power and resources. In the analytics industry, that same waste is widespread and costs the world every day.

Examples of Vertica ESG in action

The Vertica team did a proof of concept for a company that used Spark to do all their data analysis. I’ve had more than one person tell me that Spark is a standard in the industry, and that’s what they’re using, so they don’t want to even discuss any other options. This company had been of the same mind for many years.

They had 278 Spark EC2 instances on AWS doing the analytics work, and were having trouble meeting their performance requirements. Adding more nodes didn’t help. So, the Vertica team offered to use all their data and do the last six months of their actual analytic workload to prove out the software.

[Image: 240 little boxes that sorta look like a computer]
About 278 nodes (It’s only 240, but I got tired.)

Vertica did the proof of concept and was able to do the same workload with a significant improvement in performance. On only 9 similarly-sized EC2 instances.

9 nodes

Now, the obvious difference that a corporation might immediately think about is that the same job on Spark cost 25 times as much to run as on Vertica. But the big difference from an environmental perspective is energy: with 278 instances instead of 9 similarly sized ones, roughly 31 times the hardware, Spark took over 30 times as much energy to run.

Efficient software is essential to any push for reducing energy consumption in business.

A Vertica customer that embeds the database in its stack to do 5G real-time telecommunications data analytics for several telcos previously used a classic on-premises Hadoop stack with Spark as the processor. Each of their many small customers had between 20 and 50 servers to do their analytics. Larger customers had clusters of hundreds of nodes. By using Vertica, they could eliminate multiple applications, vastly simplifying their stack and cutting node usage in half, for every single customer. That’s a lot of energy not getting burned this year.

EOITek is another Vertica customer who used to use Spark. They also used about 50 servers for each bank, airline, or credit card company to provide in-database machine learning, AIOps, and business intelligence analytics. With Vertica, they’re now able to do the same thing with 10 servers per customer. That is 80% fewer servers, and with comparable hardware, an energy savings of roughly 80%.

Now, in case you think Spark is the only problem, have a look at the results from another recent Vertica POC against Snowflake. The customer company, a well-known and popular web application, was running into problems with … well … with being a well-known and popular web application. They had as many as 500 people using their application at once, and that was overloading the concurrency capabilities of Google BigQuery. So, they tried out Snowflake and Vertica. In addition to being 6–12 times faster, and having no problem with the concurrency, Vertica needed less infrastructure to do the job. Snowflake used 24 cloud instances. Vertica used 6. Vertica did the same work as Snowflake on one quarter the infrastructure.
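For anyone who wants to check the arithmetic, here is a minimal sketch that turns the node counts quoted above into ratios and percentage reductions. The node counts come from the examples in this post. The function name `node_reduction` is just for illustration, and treating node count as a stand-in for energy use is a simplifying assumption (it holds only if the instances are similarly sized and similarly loaded), not a measurement.

```python
# Illustrative arithmetic only. Node counts are the ones quoted in this post.
# Assumption: energy use scales roughly linearly with node count, which is
# only reasonable when the nodes are similarly sized.

def node_reduction(before: int, after: int) -> tuple[float, float]:
    """Return (before/after ratio, fraction of nodes eliminated)."""
    return before / after, 1 - after / before

# (nodes before, nodes after) for each example with explicit counts
cases = {
    "Spark POC on AWS": (278, 9),
    "EOITek, per customer": (50, 10),
    "Snowflake POC": (24, 6),
}

for name, (before, after) in cases.items():
    ratio, saved = node_reduction(before, after)
    print(f"{name}: {ratio:.1f}x the nodes before, {saved:.0%} fewer nodes after")
```

The telco example is left out because the post gives only "half" rather than exact before-and-after counts, but even that weakest case is a 50% reduction.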

It adds up.

If every company adopted Vertica, we could cut global analytics energy usage in half, at least. (Not such a wild claim now, right?)

And that is exactly what this planet needs.

That and this gratuitous butterfly picture. ;-)

[Image: Checkered skipper light blue and brown spotted butterfly with wings spread on straw]
Gratuitous Earth Day butterfly who believes in Vertica for ESG analytics

Paige Roberts

27 yrs in data mgmt: engineer, trainer, PM, PMM, consultant. Co-Author of O’Reilly’s “Accelerate Machine Learning” and “97 Things Every Data Engineer Should Know”