dbt Cloud’s New Pricing Model: The Sinister Phase II

Why dbt Cloud Customers Should Betray Other dbt Cloud Customers in the Game Theory of SQL Bloat

Lauren Balik
11 min read · Aug 18, 2023

One of my all-time favorite classic articles from The Onion is about the US populace suddenly and collectively realizing that Starbucks is up to no good and will soon be entering a mysterious and dark Phase Two. As it turns out, there was an ulterior motive all along on the “Starbucks on every corner” expansion and bloat. Phase One was expansion. Phase Two is something evil.

“A Starbucks barista near the Indiana-Ohio border engages in reconnaissance of an undetermined nature.” — via The Onion

dbt Labs’ decision to tack on “model”-based pricing on top of seat-based pricing for its dbt Cloud product is both hysterical and expected. This is a company valued at over $4B that pretty obviously makes well under $100M in annualized product revenue at its current clip, already has significant churn on its Cloud product, and is chasing a TAM that just isn’t that big.

Of course they were always going to increase prices on dbt Cloud again in some way.

If you’re not exactly familiar with what charging on “models” means: dbt Cloud is now charging on the number of SQL transformations run through dbt Cloud in a given month, on top of also charging for seats. They are now going to start charging customers for the bloat that they themselves have promoted.

If you’ve been following along with my blog over time, you know my opinion for over a year has been that all of these companies that sit in and around the cloud data warehouse are going to raise prices, especially the ones with $1B+ valuations, as they are just too far off any valuation that doesn’t put their last round or two of investors underwater. This is a risk for you, the customer, when you buy SaaS from companies way out over their skis.

If you’ve been following along, you’ll also have been keeping track of the growing “SQL sprawl” narrative coming out of dbt Labs as they and the cloud vendors get squeezed on TAM penetration and on NRR.

If you’ve been following along, you’ll have seen that dbt Labs is now pitching the idea of doing financial operations and accounting in dbt, using SQL-on-SQL-on-SQL to patch together accounting rules.

If you’ve been following along, you know I see Andreessen Horowitz portfolio companies like Fivetran and dbt Labs as inherently rent-seeking to the extreme. Their tactics around recruiting, pricing models, and strategically withheld roadmap items, among other things, are one-sided, with significant surplus value accruing to the vendors and early VCs, and in many cases the customer is very quickly underwater on the deal.

The more you build dependencies on VC mega-backed point solutions, the more power each of these point solutions holds over you for something as simple as moving data from System A to System B.

The more you build layers and layers and layers of SQL stacked up on more SQL in the cloud, the more screwed you are.

So here’s what we’re going to do in this article.

First, we are going to go over the game theory of being a dbt Cloud customer (this also applies to a lot of software) and why it is in your best interests as a dbt Cloud customer to betray your fellow dbt Cloud customers by defecting.

In the second part, we are going to go over a number of ways in which you can defect from dbt Cloud or limit your exposure, and get yourself a better outcome.

Why Defecting from dbt Cloud and Letting Other dbt Cloud Customers Hold the Bag is in Your Best Interests

When more defections happen in earlier turns, non-defectors bear even higher costs in later turns.

Here’s a simple game. There are 10 fixed customers all using 1 vendor. There are no new customers entering the game. The vendor’s goal is to reach $200k or more in a single turn by turn 3 in order to sell the company.

In turn 1, each customer pays $10k to the vendor, and the vendor is at $100k. Now, in order to get to the $200k, the vendor doubles prices after turn 1.

This is where things get interesting.

If no customers churn, then the vendor is at $200k and can sell the company. But, doubling prices will often lead to some amount of churn.

On the left in this example, 2 customers churn. They found a better solution or decided they wanted no solution at all. For whatever reason, the price increase was not worth it to them.

Now, the vendor takes this feedback. He figures that if only 2 churned last time, he should plan on 6 customers remaining after another price increase.

So to get to the $200k, the vendor sets the price at $33,333.34, 2 more customers churn, and the remaining 6 get the vendor to the $200k needed to sell the company.

On the right in the example, 4 customers churn after the turn-1 price change. The vendor has gone from $100k a turn to only $120k a turn. Because more people churned and fewer customers remain in the game, the vendor now has to get even more aggressive after turn 2 and raise prices even more. He can get to his goal with 1 customer at $200k, 2 at $100k, 3 at $66,666.67, or 4 at $50k.
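Here is a minimal Python sketch of the “left” branch of this toy game, assuming only that the vendor always re-prices so the customers it expects to keep would cover the $200k target; the customer counts and prices are the ones from the example above.

```python
# Toy version of the game above, following the "left" branch turn by turn.
# Assumption (mine): the vendor re-prices so that the customers it expects
# to keep would cover the $200k target exactly.

TARGET = 200_000

def reprice(expected_customers: int) -> float:
    """Per-customer price needed to hit the target with the expected survivors."""
    return TARGET / expected_customers

# Turn 1: 10 customers at $10k each.
customers, price = 10, 10_000.0
print(f"turn 1: {customers} x ${price:,.2f} = ${customers * price:,.2f}")

# Turn 2: the vendor doubles prices; 2 customers churn.
price *= 2
customers -= 2
print(f"turn 2: {customers} x ${price:,.2f} = ${customers * price:,.2f}")

# Turn 3: the vendor plans for 6 survivors and re-prices; 2 more churn.
price = reprice(expected_customers=6)  # ~$33,333.33 (the article rounds up a cent)
customers -= 2
print(f"turn 3: {customers} x ${price:,.2f} = ${customers * price:,.2f}")
```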

Obviously the world is more complex than this, but a few things ring true:

  1. If many defections happen earlier, a smaller number of non-defectors will pay more extreme cost increases in later turns.
  2. If few defections happen earlier, the vendor will keep raising costs to do price discovery on customers.

This is why you need to defect now. If you don’t get to a lifeboat early and a bunch of other customers are heading to lifeboats, you are risking paying usurious fees for the lifeboat on the next turn.

dbt Labs can’t even figure out proper messaging to their customers about how this pricing change is going to occur.

Using the Wayback Machine, you can see that on August 10th dbt Labs announced their new plans with 20k model runs (one run of a SQL model or transformation in dbt) a month included on the Team plan, and 5k included on the Developer plan.

Then, after a community feedback pow-wow about the new pricing model, dbt Labs changed this on August 11th to 15k model runs included on Team and 3k on Developer.

They are just winging it. They have the legalese and price changes in a public PR.

One week it’s one unannounced pricing change. The next week it’s a different unannounced pricing change.

Meanwhile, dbt Labs is now continuously pushing its customers to do their financial operations, revenue recognition, and accounting in dbt Cloud. They just want the workloads. They want to get sticky in a line of business where change management is difficult.

Do not waste your time, and do not take the risk of running your accounting logic through a SQL compiler whose vendor can’t even get its own pricing announcement right. I cannot stress this enough.

dbt’s Race to $150M through Batch-Gating

So, to make this more practical, let’s cut to the chase: dbt Labs, their CEO Tristan Handy, and their VCs and board have a number.

I think dbt Labs is gunning for something in the range of $150M in annualized revenue from their product (not counting what they make from professional services or from their conference). Maybe more, maybe less; none of this is concrete even for them. But they need to get into a realistic range, and they likely need to hold revenue in that range for a year. Maybe it’s $125M. Maybe it’s $175M.

There is absolutely a number.

This pricing change is just the latest stage in Modern Data Stack price gating of batch.

Here is a quick, back-of-envelope chart showing what charging $0.01 per model run adds in monthly and annual costs. Obviously the world can be more complex, and some larger customers will negotiate unit costs, but this is how daily builds of dbt models can play out.

Basic cost estimates for building SQL models at the cost of $.01 per model build.
From dbt Labs’ “The Next Step Forward in Analytics Engineering” showing aggregate dbt model sprawl across the market.
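Since the chart itself doesn’t travel well as text, here is a rough Python sketch of the same back-of-envelope arithmetic, assuming a flat $0.01 per model run and the 15k-run Team allowance mentioned above; real contracts and negotiated unit costs will differ.

```python
# Back-of-envelope: incremental dbt Cloud cost at an assumed flat $0.01 per
# model run, net of an assumed 15,000-run monthly allowance (the Team plan
# figure cited above). Illustrative only, not quoted pricing.

PRICE_PER_RUN = 0.01
INCLUDED_RUNS_PER_MONTH = 15_000

def monthly_model_run_cost(models: int, runs_per_day: int, days: int = 30) -> float:
    """Cost of model runs beyond the included allowance for one month."""
    runs = models * runs_per_day * days
    return max(0, runs - INCLUDED_RUNS_PER_MONTH) * PRICE_PER_RUN

for models in (100, 250, 500, 1_000):
    for runs_per_day in (1, 4, 24):
        cost = monthly_model_run_cost(models, runs_per_day)
        print(f"{models:>5} models x {runs_per_day:>2} builds/day -> "
              f"${cost:,.2f}/month, ${cost * 12:,.2f}/year")
```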

Now, what is really going on here is two things:

  1. The continued promotion of SQL sprawl by dbt Labs (which dbt Cloud customers will now be charged for, but which also burns cloud credits regardless of whether you run dbt Cloud or dbt Core).
  2. Gating on batch processing.

The first is obvious. The second may be less obvious, but if you look at the back-of-envelope chart, there are two obvious ways to optimize within dbt Cloud now.

Just by having things update every hour, or 24 times a day, you are letting dbt Cloud charge you for latency. If you don’t want to pay the upsell, you can move your hourly updates down to twice-daily updates, or move your 4-times-per-day cadence down to once a day.

You are now paying a premium on dbt Cloud as well as on Snowflake or BigQuery or similar just to have data refresh more often than once or twice a day.
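To make the latency gating concrete, here is the same arithmetic applied to cadence alone, for a hypothetical 300-model project at the same assumed $0.01 per run; only the build frequency changes.

```python
# How build cadence alone multiplies monthly model runs, and therefore the
# per-run bill, with the number of models held fixed. Hypothetical numbers.

MODELS = 300          # hypothetical project size
PRICE_PER_RUN = 0.01  # assumed flat per-run rate, before any included allowance

for label, runs_per_day in [("hourly", 24), ("every 6 hours", 4),
                            ("twice daily", 2), ("daily", 1)]:
    monthly_runs = MODELS * runs_per_day * 30
    print(f"{label:>13}: {monthly_runs:>7,} runs/month -> "
          f"${monthly_runs * PRICE_PER_RUN:,.2f}/month")
```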

Defecting from dbt Cloud is the Only Thing that Makes Sense

So if dbt Labs needs to significantly increase their revenue in the next year or so, and they have a number in mind, and they are going to charge you for building SQL models now, and you are just one of a few thousand customers, and other customers are also now defecting or planning to defect…the only thing that makes logical sense for you is to also defect.

They will treat the next few months as price discovery on you and several thousand other dbt Cloud users, and if many people churn, anyone remaining will pay even higher unit costs.

The longer you have hundreds to thousands of dbt models compiling every day, often multiple times a day, the more exposed you are to more and more price increases. You’ll be paying multiples more for writing tons of SQL.

How to Defect

E(t)L(t): Shift Stuff Left to Improve Cost and In Most Cases, Latency

Many companies find themselves in a dbt model soup because they query everything post-production, once it’s been shoved into the data warehouse. Not once in the last 10 years have I thought this should be the be-all, end-all. In the data cloud world it’s clearly, heavily to the benefit of Snowflake, because you use their compute to transform; to the benefit of Fivetran, because they promote EL; and to the benefit of dbt, because you roll all the Fivetran tables back up using dbt, which gives both Fivetran and dbt Labs cloud credit attribution.

There are many great vendors out there that can help you remove dependencies on dbt models. By shaping and defining your data before you land it in the destination where it will be queried, you can eliminate many dbt models, possibly most or all of them. In alphabetical order, here are some options:

Use Coalesce.io Instead of dbt Cloud

Out of the box, and without fanfare, Coalesce offers things like column-level lineage (table stakes if you have many tables and objects), along with the ability to do what dbt does with Git and versioning, plus a GUI if a non-technical user wishes to make transformations.

Many organizations use Coalesce to accomplish in weeks what dbt Cloud can in months. It is, and has been, a much more complete product than dbt Labs’ products.

Host dbt Yourself

There are many ways!
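As one sketch among many: run open-source dbt Core from whatever scheduler you already have (cron, Airflow, CI) instead of paying for dbt Cloud’s orchestration. The paths below are hypothetical placeholders.

```python
# Minimal self-hosted sketch: invoke open-source dbt Core from your own
# scheduler instead of dbt Cloud's. Project and profiles paths are hypothetical.
import subprocess
import sys

def run_dbt(command: str = "build") -> int:
    """Run a dbt Core command against a self-managed project and profiles dir."""
    result = subprocess.run(
        [
            "dbt", command,
            "--project-dir", "/opt/analytics/my_dbt_project",  # hypothetical path
            "--profiles-dir", "/opt/analytics/.dbt",           # hypothetical path
        ],
        capture_output=True,
        text=True,
    )
    # Pass dbt's own logs through so cron/Airflow/CI captures them.
    print(result.stdout)
    print(result.stderr, file=sys.stderr)
    return result.returncode

if __name__ == "__main__":
    sys.exit(run_dbt("build"))
```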

Model Data with an Activity Schema

This is a good way to limit models, limit cloud spend, and treat everything consistently. I believe many digital businesses using cloud data warehousing will be on the Activity Schema as proposed by Narrator, or similar entity-modeling styles, within two years, as the reduction in sprawl and the efficiencies gained make this very attractive.
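For flavor only, here is a rough Python sketch of the core idea: one long, append-only activity stream instead of hundreds of bespoke models. The column names approximate the publicly documented Activity Schema spec and are not a faithful copy of it.

```python
# Rough sketch of the Activity Schema idea: a single append-only activity
# stream keyed on a customer/entity, instead of hundreds of bespoke wide
# tables. Column names approximate the public spec; they are not exact.
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class ActivityRow:
    activity_id: str                  # unique id for this occurrence
    ts: datetime                      # when the activity happened
    customer: Optional[str]           # the entity the stream is keyed on
    activity: str                     # e.g. "signed_up", "invoiced", "churned"
    feature_json: dict = field(default_factory=dict)  # activity-specific attributes
    revenue_impact: Optional[float] = None
    link: Optional[str] = None        # pointer back to the source record

# Every downstream question is asked of this one stream, rather than by
# spinning up yet another dbt model per question.
example = ActivityRow(
    activity_id="a-123",
    ts=datetime(2023, 8, 18, 12, 0),
    customer="acct_42",
    activity="invoiced",
    feature_json={"amount": 500.0, "plan": "team"},
    revenue_impact=500.0,
)
print(example)
```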

Partially Defect

This is perhaps the most dangerous option. Partial defection means things like consolidating dbt models or moving your update cadence from every hour to every 6 hours to avoid getting dinged by the per-model charges.

You are basically just playing defense against dbt Cloud, which, if enough defections occur (and they will), will merely raise prices again, because too many organizations are too locked into having business logic spread across 5 or 6 different layers of dbt. This is just an Advil, and it is temporary. dbt Labs is so far off from where it needs to be (over $100–150M) that after they run this most recent price discovery experiment, they will absolutely raise prices again.

Why not defect fully while you still have the time?

Final Thoughts

We are entering the funny phase of the cloud data warehouse holding so much power. dbt Labs has been a major driver of cloud data warehouse revenue growth the past 4 years, at least at tech companies and downmarket, and much of this is due to sprawl of SQL and then throwing more compute at the problem of SQL sprawl when it becomes untenable for latency reasons.

One of the funnier things for me is that dbt Cloud’s pricing changes will actually tick off all of the cloud data warehouses and databases. This pricing change is going to force many customers to move workloads out of the Snowflake and BigQuery “SQL maximalism” mentality.

The more people move away from dbt Cloud and reduce their dependency on dbt Cloud or Core, the more likely it is that anyone remaining tied to dbt Labs products is going to get more rugs pulled out from under them, as there is no way to commercialize dbt Labs and exit it for even the high hundreds of millions without continuing to price-gouge customers through levers like increasing seat prices, charging more for models, or even changing the dbt Core license.

Do you want to focus on solving actual business problems (I thought that was the whole point of analytics and BI), or do you want to have to continuously play defense against an extremely overvalued entity that will shove the word “community” down your throat ad nauseam while picking your pocket over and over again?

Get out now. Many others are defecting.

If you have a good story about how you’ve churned off dbt Cloud, feel free to email or message me and share it; I’m happy to help spread the good work you and your team are doing.


Lauren Balik

Owner, Upright Analytics. Data wrangler, advisor, investor. lauren [at] uprightanalytics [dot] com