Gambling

Sam Stone
Structured Ramblings
10 min read · Jul 7, 2021


I’m not very good at poker. Or blackjack. Or any game you’ll find in a casino. But I’ve come to believe that a gambling mindset — making bets, learning from their outcomes, and making new bets — is the key to making good forward-looking decisions. I’ve learned to apply a gambling mindset to product development, but the same lessons are widely applicable — to investing, marketing, sales, hiring and beyond.

I grew up in the Midwest, with parents who were both lawyers. It was an extremely risk-averse household. Gambling was a dirty word. Risk was something to be avoided, not managed, much less sought out. There was a knowable way the world worked, where the right inputs led to the right outputs. No one said the rules were obvious — to understand how things worked you had to study a lot and work hard — but we believed there were rules, we just had to learn them.

To borrow from author Annie Duke, I grew up thinking the world functioned a lot like chess, not like poker [1]:

Chess…contains very little luck… If you lose at a game of chess, it must be because there were better moves that you didn’t make or didn’t see. You can theoretically go back and figure out exactly where you made mistakes… Poker, in contrast, is a game of incomplete information…There is also an element of luck in any outcome. You could make the best possible decision at every point and still lose the hand…Once the game is finished, you try to learn from the results, but separating the quality of your decisions from the influence of luck is difficult.

Over the years, I’ve come to believe that most of business is more like poker than chess. Both luck and skill play a role, in ways that are hard to separate, even in hindsight. Nonetheless, parsing outcomes for luck versus skill is not only worthwhile, but essential, for any meaningful learning.

1. Denominate your bets in the same currency

Imagine playing poker where everyone at the table was betting in different currencies, and you didn’t know the exchange rates. It would be a nightmare! You wouldn’t understand who was taking what risk and when, and both your decisions and your outcomes would suffer.

This is the situation that confronts product managers who have a variety of possible ways to serve users better. How do you compare different bets when the types of outcomes they generate are different? In my job at Opendoor, working on the pricing algorithms we use to buy and sell homes, we can help users in very different ways. We can:

  • Expand our services to new users (launch new cities, buy new types of homes)
  • Increase the accuracy of our home purchase offers
  • Decrease the user friction required to receive a home purchase offer

It’s relatively easy to compare two projects focused on expansion, since there is a natural yardstick: by how much will this project grow our user base, if executed successfully? It’s harder to compare — and thus prioritize — an expansion versus an accuracy project, because the units of benefit from the two projects are different.

So we had to learn to estimate the benefits of different project types in the same units. We realized that the benefits of all three project types mentioned above can be estimated in terms of user growth:

We have to make intermediate assumptions, but it’s been surprisingly easy for teammates with shared context to agree on them. And once we have those assumptions, we can visualize and compare nearly all our major proposed projects on a shared yardstick, which looks something like this:

This chart is not a panacea when it comes to planning. Not every potential project can be plotted on it [2]. It focuses on potential benefits, not costs. It’s one tool amongst many that we use in decision-making. But it has helped us avoid some bad decisions. For example, before developing this “yardstick”, our group focused heavily on accuracy and paid little attention to reducing user friction. But denominating both accuracy and friction-reduction projects in the same units led us to realize that there were many friction-reduction projects with high potential benefit, and low expected cost [3]. If we hadn’t been able to make an apples-to-apples comparison, we probably wouldn’t have pursued these projects.
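
To make the shared-yardstick idea concrete, here’s a minimal sketch of how projects of different types might be denominated in expected user growth. Every project name, conversion factor, and number below is a hypothetical illustration, not Opendoor’s actual model.

```python
# A minimal sketch of putting different project types on one yardstick:
# expected new users. Every number and conversion factor below is a
# hypothetical illustration, not an actual model.

# Assumed "exchange rates" the team has agreed on (hypothetical):
USERS_PER_ACCURACY_POINT = 4_000    # users gained per +1pt offer accuracy
USERS_PER_CONVERSION_POINT = 3_000  # users gained per +1pt offer-flow conversion

def expected_user_growth(kind: str, raw_estimate: float) -> float:
    """Convert a project's native-unit estimate into expected new users."""
    if kind == "expansion":   # already denominated in users
        return raw_estimate
    if kind == "accuracy":    # raw estimate is in accuracy points
        return raw_estimate * USERS_PER_ACCURACY_POINT
    if kind == "friction":    # raw estimate is in conversion points
        return raw_estimate * USERS_PER_CONVERSION_POINT
    raise ValueError(f"unknown project type: {kind}")

projects = [
    ("Launch two new cities",  "expansion", 12_000),  # new addressable users
    ("Improve offer accuracy", "accuracy",  2.0),     # +2pt accuracy
    ("Remove two form fields", "friction",  3.0),     # +3pt conversion
]

for name, kind, raw in projects:
    print(f"{name:24s} -> ~{expected_user_growth(kind, raw):,.0f} expected new users")
```

The hard part isn’t the arithmetic; it’s getting teammates to agree on the “exchange rates” (the conversion factors), which is where shared context matters.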

2. Understand the payoff structure

After a few quarters of using the “yardstick” approach, I noticed a pattern in how our projects would turn out versus our projections. When the dust had settled, the portfolio typically looked something like this:

We’d strike out on a big chunk, we’d come in near our forecast on a big chunk, and on a small minority, we’d deliver results far in excess of expectations.

I mentioned this pattern to another PM, who identified it as a power law distribution, a pattern well-known in many industries. Power law distributions describe venture capital fund returns, personal income levels in a market economy, and the magnitude of earthquakes, just to name a few things. I found it unusual because, like most people, I’d been conditioned to expect a normal distribution:

Gambling in a power law environment leads to an asymmetry of outcomes [4]. In product development, it’s always possible a project delivers zero user benefit — even the most well-researched, well-scoped project can go off the rails in execution. So the “left tail” of the distribution of outcomes always includes zero. However, the right tail — or upside potential — of projects can be vastly different. Some projects have capped upside while others do not.
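
To see the difference between these two worlds in aggregate, here’s a rough simulation contrasting a normally distributed portfolio of project outcomes with a heavy-tailed one. The distributions and parameters are arbitrary stand-ins chosen for illustration, not a model of any real portfolio.

```python
# A rough illustration of normally distributed vs. heavy-tailed project
# outcomes. Distributions and parameters are arbitrary stand-ins.
import random

random.seed(7)
N = 1_000  # simulated projects, each forecast to deliver 1.0 "unit" of benefit

# Normal world: outcomes cluster around the forecast.
normal_world = [max(0.0, random.gauss(1.0, 0.3)) for _ in range(N)]

# Power-law-ish world: many zeros, many near-forecast results,
# and a small minority of outsized wins (modeled with a Pareto tail).
def heavy_tailed_outcome() -> float:
    r = random.random()
    if r < 0.4:
        return 0.0                       # struck out
    if r < 0.9:
        return random.uniform(0.5, 1.5)  # roughly on forecast
    return random.paretovariate(1.2)     # rare outsized win

power_world = [heavy_tailed_outcome() for _ in range(N)]

for label, world in [("normal", normal_world), ("heavy-tailed", power_world)]:
    top_decile = sum(sorted(world)[-N // 10:])
    print(f"{label:12s} total={sum(world):7.1f}  "
          f"share from top 10% of projects={top_decile / sum(world):.0%}")
```

In the heavy-tailed world, a small minority of projects accounts for most of the total benefit, which is exactly the pattern described above.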

Here’s where it gets interesting: it’s often possible to identify before projects are launched whether they have a capped upside. For example, if you’re considering adding support for a new language to a product, new user growth cannot exceed the population of people who speak that language. Conversely, adding an API for third-party developers may have nearly uncapped upside (e.g. if the developer ecosystem really takes off, it could 10x or 100x user growth).

My goal is not to minimize the number of projects that deliver zero benefit, even though such projects are always painful. It’s not to maximize the number of projects that hit their target [5]. My goal is to maximize overall user growth. And given it’s a power law environment, that means having many projects with high upside potential.

Since high upside often comes with high downside, that means taking on a lot of risk. It’s also important that a good portion of the high upside / high downside projects be large in absolute terms. 2x outperformance on a big project can more than make up for a string of zeros on small projects, but a small project realizing 2x cannot offset a big project coming in at zero.
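
As a quick back-of-the-envelope check of that claim, with made-up forecast sizes:

```python
# Back-of-the-envelope check, with made-up forecast sizes.
big, small, n_small = 100, 10, 10       # hypothetical benefit forecasts

forecast_total = big + n_small * small  # 100 + 100 = 200

# Big project delivers 2x, all ten small projects deliver zero:
case_a = 2 * big                            # 200: on forecast overall
# Big project delivers zero, one small project delivers 2x:
case_b = 2 * small + (n_small - 1) * small  # 110: far short of forecast

print(forecast_total, case_a, case_b)  # 200 200 110
```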

3. Don’t learn fake lessons

Good poker players know that sometimes they play better than their opponents and they still lose. They recognize that luck is a real element in outcomes, and that luck makes it hard to evaluate the quality of their decisions even once the outcomes are known. And great poker players recognize that sometimes they will never even know whether they got lucky or not. On some hands, they’ll fold before the turn and never see their opponents’ cards. They’ll lose, but even with the subsequent benefit of hindsight, they still can’t definitively evaluate the “correctness” of their play.

There’s a strong tendency in most organizations to believe that every success contains a lesson to embrace and every failure contains a lesson to avoid. This is what the typical organizational learning matrix looks like.

Learnings are not truly based on the quality of the decision (the input), even though it’s inputs that matter (because inputs are what we can change in future decision-making). Instead, because the quality of a decision is never directly observable, we take the seductive mental shortcut of classifying learnings based on what is directly observable: the outcome.

In an ideal organization, the learning matrix would look like the chart below. We’d recognize that we can’t perfectly assess all outcomes (think back to the poker player who folds before the turn). There will be situations where the outcome was bad, but even with the benefit of hindsight, our decision still seems good (and vice versa). That’s ok; we can acknowledge “we don’t know.” It’s not being lazy, it’s being realistic.

It’s hard to change organizational behavior so that it stops drawing lessons from every success and failure (even when the discarded “learnings” are actually false)! For example, at Opendoor, our operations team analyzes every home we resell for a significant loss. In our early days, we required that every retro attribute a meaningful reason for the loss; there was no “I don’t know” option. But real estate transactions are idiosyncratic; often a well-reasoned analysis doesn’t yield a clear mistake. Our process was forcing analysts to select reasons that were their best guesses, but still ill-fitting. To make these retros more useful and reduce the risk of misleading “learnings” from these forced attributions, we had to add an “I don’t know” option.

4. Balance exploration and exploitation

There’s a sub-field of data science that draws its name from casino gambling: the multi-armed bandit problem [6]:

A person must choose between multiple actions (i.e. slot machines, the “one-armed bandits”), each with an unknown payout. The goal is to determine the best or most profitable outcome through a series of choices. At the beginning of the experiment, when odds and payouts are unknown, the gambler must determine which machine to pull, in which order and how many times. This is the “multi-armed bandit problem.”

The multi-armed bandit problem illustrates the explore-exploit tradeoff: if the first slot machine you pull has a good payout, do you keep pulling it (exploitation) or do you try another machine (exploration), which may pay out even more but also may pay out nothing? When we’re in exploration mode, we’re doing something to learn. We have low confidence that the outcome will match our expectations, and we’re ok with that. When we’re in exploit mode, we have high confidence the outcome will match our expectations, and we’re less likely to focus on learning, or even be receptive to it.
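
For readers who want to see the tradeoff in code, here’s a minimal epsilon-greedy sketch of the bandit problem. The payout rates are invented, and epsilon-greedy is just one simple strategy among many (Thompson sampling and UCB are common alternatives).

```python
# A minimal epsilon-greedy sketch of the multi-armed bandit problem.
# The payout rates are invented; epsilon-greedy is one simple strategy
# among many (Thompson sampling and UCB are common alternatives).
import random

random.seed(42)
true_payout_rates = [0.3, 0.5, 0.7]  # unknown to the gambler
pulls = [0, 0, 0]                    # times each machine was pulled
wins = [0.0, 0.0, 0.0]               # observed wins per machine
EPSILON = 0.1                        # fraction of pulls spent exploring

def estimated_rate(arm: int) -> float:
    return wins[arm] / pulls[arm] if pulls[arm] else 0.0

total_reward = 0
for _ in range(10_000):
    if random.random() < EPSILON:  # explore: try a random machine
        arm = random.randrange(len(true_payout_rates))
    else:                          # exploit: pull the best machine so far
        arm = max(range(len(true_payout_rates)), key=estimated_rate)
    reward = 1 if random.random() < true_payout_rates[arm] else 0
    pulls[arm] += 1
    wins[arm] += reward
    total_reward += reward

print("pulls per machine:", pulls)  # most pulls should go to the best machine
print("total reward:", total_reward)
```

With a small epsilon, the gambler spends most pulls exploiting the machine that looks best so far, while reserving a fraction for exploration so a better machine isn’t overlooked forever.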

There’s a strong tendency to skew towards exploit-oriented thinking when the stakes are high. For example, when hiring decisions are being made, or M&A offers are being submitted, it’s rare to hear decision-makers acknowledge “we don’t have high confidence one way or another, so let’s make a decision that optimizes for learning.” But just because a decision is high-stakes does not mean it should be exploitation-oriented. Every manager has to make a first hire; every company (that makes acquisitions) has to make its first acquisition. We might as well acknowledge that sometimes we are making high-stakes decisions AND we don’t have high confidence — and thus we should treat these as exploration, not exploitation.

There’s a great lesson to be learned from high-stakes decisions in medicine. When a new, potentially life-saving therapeutic is developed, the medical community doesn’t just release it immediately to all patients. Instead, they run randomized controlled trials, the purest version of exploration there is. It’s well understood in medicine that the higher the stakes, the more critical it is to do proper exploration before moving into exploitation mode.

I’ve seen the “gambling mindset” work well for product development at mid-size and large companies. I wonder how widely applicable it is. At early-stage startups, where the team may have the resources for only one bet at a time, and only one or two failures could mean the business’s demise, it may be harder to use effectively. That said, launching a new venture is a bet in and of itself. So perhaps the gambling mindset is essential, but most useful prior to committing to the new venture’s launch.

I suspect there are many other dimensions along which organizations differ, which make the gambling mindset more or less useful. It’s not a one-size-fits-all approach. That said, I’ve found that the more my peers and I recognize we don’t know all the rules or all the facts — and that we can’t wait to learn them before making decisions — the better the decisions we make tend to be.

Footnotes

[1] This post was heavily influenced by Annie Duke’s book, Thinking in Bets.

[2] For example, infrastructure improvement (aka “tech debt”) projects are often so hard to denominate in terms of user growth that it’s not worth the mental gymnastics. But that does not mean they are not worth doing.

[3] It doesn’t make sense to select projects based on expected benefit alone; expected cost also has to be considered. For product development, I’ve found that cost estimates suffer less from the unit-comparison problem, because cost is usually driven primarily by person-weeks from engineers, designers, and data scientists, which allows for relatively easy cross-project comparison. This doesn’t mean that time estimates are accurate — but at least we are making apples-to-apples comparisons.

[4] Interestingly, gambling in a casino (for one visit) does not follow a power law distribution; it is normally distributed. A small share of gamblers will win big, a small share will lose big, and most will lose just a bit.

[5] It’s quite common for product managers to focus on the number of successful projects divided by the number of attempted projects, aka “batting average”. Most organizations bias against failure (even if they say they don’t), and having a high “failure count” is perceived as career-limiting. This is misguided when the cost base for a portfolio is relatively fixed (e.g. a product manager with a fixed team size). In this case, what matters is total throughput (the number of successful projects × the average impact per successful project), not the batting average. However, if the cost base of the portfolio is highly variable, then a batting average approach can be more appropriate.

[6] Definition quoted from the Optimizely website.
