Game Balance in Secret Hitler
Secret Hitler is a hidden-identity game I helped create, and you can get it right now on Kickstarter. For the last week, to give people a sense of what it’s like to play, we’ve been streaming it live on Twitch.TV.
If you’ve been watching the Twitch streams, you know that games of Secret Hitler can be… intense. Games end in cheers and table-flips. Immediately, everyone on the losing team desperately searches for some other scapegoat to blame; everyone on the winning team takes laps around the room, or begins weaving Bardic tales about the littlest shit.
This is what I love about the game: each time you play, it feels epic and important. But those kinds of dramatic stakes come with a heavy cost in a game. If the game isn’t balanced on a razor’s edge, one team is guaranteed to have a miserable time.
Secret Hitler is a game with asymmetrical teams: the fascists have more information, the liberals have more players. (I wrote about the game design behind this asymmetry in further detail here).
If one team’s advantage vastly outweighs the other team’s, the game won’t be fun for players who end up on the shitty team, and if half the table is guaranteed to have a bad time every time, no one will play. Balance was, to say the least, something we had to get right.
In the design process of Secret Hitler, we playtested the game hundreds of times, and anecdotally, we found the game to be balanced with all numbers of players. But our playtests had a relatively small sample size… we could only play so many games a week, and we were constantly tweaking the rules in other ways, so they weren’t all comparable. When we released a Print and Play version of Secret Hitler, most players told us that the game felt balanced and exciting, but some players approached us to say that the game seemed to be lopsided in favor of the liberals. There weren’t very many of them, but two things worried me:
- The few that did voice that concern seemed to be experienced social deduction players. If experienced players found the game unbalanced, eventually everyone would find the game unbalanced as they gained more experience.
- No one was complaining that the game was too easy for the fascists.
Testing Game Balance
We asked players to fill out a brief survey detailing their experiences learning the game as well as the outcomes of (up to) their first five games. We received almost 200 responses that reported outcomes for a total of 580 games. (A fun fact is that almost everybody who played one game of Secret Hitler played again for a second, third, or fourth time).
A really simple look at the data in aggregate tells us that Secret Hitler is balanced at a level I’m prepared to call “spooky.”
How spooky is it?
It’s pretty spooky.
Let’s start high level. Here are the 580 games we had reported:
We asked players to report results for up to five games, in order. When we break the data out according to which game the group played, the liberals appear to have a slight advantage only in the very first game.
After the first game, fascists seem to have a slight edge. This data set includes games where fascists killed two liberals and seized power, which can only happen in games with an odd number of players; in games with an even number of players, that advantage disappears.
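For anyone who wants to check whether a win split like the ones above is distinguishable from a coin flip, one common tool is an exact two-sided binomial test against 50/50. This is only a minimal sketch of that idea, not the analysis behind the numbers in this post: the function name is hypothetical, and the counts in the example are made up for illustration, not survey results.

```python
from math import comb


def two_sided_p_value(wins: int, games: int) -> float:
    """Exact two-sided binomial test against a fair 50/50 split.

    Returns the probability of seeing a split at least this lopsided
    (in either direction) if each game really were a coin flip.
    """
    if not 0 <= wins <= games or games == 0:
        raise ValueError("need 0 <= wins <= games and games > 0")
    # With p = 0.5 the distribution is symmetric, so doubling the
    # smaller tail gives the two-sided p-value (capped at 1).
    smaller_side = min(wins, games - wins)
    tail = sum(comb(games, k) for k in range(smaller_side + 1))
    return min(1.0, 2 * tail / 2**games)


# Hypothetical example: 300 liberal wins out of 580 games.
# These counts are invented for illustration only.
example_p = two_sided_p_value(300, 580)
print(f"p-value for a made-up 300/580 split: {example_p:.2f}")
```

A large p-value only says the split is consistent with an even game; it doesn’t prove the game is even. The same function could be run on each slice of the data (first game, second game, and so on), keeping in mind that small slices will rarely show anything.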
How do teams win?
Liberals tend to pass five liberal policies; fascists tend to get Hitler elected. Again, a striking balance:
Teams won with their “dominant” strategy about three times as often as with their “alternate” win condition:
Here is the only significant variation between teams in the data: Secret Hitler gets assassinated a ton in the first game. It’s the only time one team or the other seems to have an advantage. I think that’s because first-time fascists try to nominate Hitler when it’s not opportune: “I nominate Jessica.” Wait, why? “Uh…” Right, so Jessica is Hitler, that was easy.
How should this affect my play?
One of the key things we learned from this data is that the Hitler mechanic is not just part of the game’s theme, it is the core of a successful fascist strategy.
If you or your group is having trouble gaining purchase as the fascists, consider focusing on making Hitler trustworthy, rather than just passing fascist policy. The risk of nominating Hitler is critical to disrupting liberal bonds of trust; otherwise, it’s too easy for liberals to find each other. Even fascists who win by passing six policies typically do so by preventing otherwise trustworthy liberals from being elected.
From a design perspective, this is immensely gratifying. The WWII theme was part of the game from the beginning, and most of the mechanics were designed around modeling real-world systems. We had no way to guarantee that the Hitler mechanic would be this important, so I’m extremely pleased that it has organically become the most important part of playing as a fascist.
Why should I care about this if I’m not a nerd?
Through all this, playtests are still king. I only ever want to use data to contextualize or deepen what we hear from players. It’s never a substitute. That’s why it’s actually most important that the game feel balanced: as long as players are having fun, who gives a shit? This is the principle that casinos are built on: as long as players feel like they have a chance, it doesn’t matter whether they actually have a chance.
Still, there is no casino here, no House that always wins. And there shouldn’t be one: if the game is imbalanced, then over the long run players will find dominant strategies and exploit them. I know it. You know it. If there’s a fascist strategy that works 70% of the time, eventually it’ll stop being fun to be a liberal. You’ll look into your envelope, see an old lady instead of a lizard, and know you’re in for a boring, awful hour. Eventually we’ll have to bolt some fucking wizard mechanic onto it so players can find a dopamine vein again.
Frank Lantz, the director of NYU’s Game Center, gave an incredible talk called “Hearts and Minds” about this. The whole thing is worth watching if you’re interested in this, but here’s my favorite bit:
“The dilemma of quantitative, data-driven game design… So here’s an analogy: Imagine you have a friend who has trouble forming relationships with women, and he tells you, ‘I don’t know what I’m doing wrong. I go on a date, and I bring a thermometer so I can measure their skin temperature. I bring calipers so I can measure their pupil, to see when it’s expanding and contracting.’ The point is, it doesn’t even matter if these are the correct things to measure to predict someone’s sexual arousal. If you bring a thermometer and calipers with you on a date, you’re not going to be having sex.”
Maybe a dominant strategy will emerge later down the line and we’ll have to count on Secret Stalin to bail us out or something. This data is very preliminary, and I’ll be interested to see how the game evolves. But I know that in our play group, the game still feels fresh. Here’s part one of the most intense game ever, which happened just a couple days ago:
I still love playing Secret Hitler.
Addendum: what next?
There are two questions I’d like to try to answer next. For one I lack the data, for the other I’m not sure what measure to use.
First, I’d like to know how group size affects outcome. Since we didn’t ask for group size, I can’t say. I’d bet games with an odd number of players are easier for fascists than games with an even number, but it’s hard to know for sure how big the effect is. (If you want to help me gather more granular data, you can use this form for games beyond your fifth game: Secret Hitler Single Game Report)
Second, I’d like to know how much variation there is within a single group’s games. At one extreme, it could be that in some groups, fascists always win, and in others, liberals always win. That would mean we gave pre-existing social dynamics too much control. At the other extreme, maybe every group leans 50/50; that might be too even a split, as if the players themselves don’t matter, as if we made flipping a coin more stressful.
Obviously, based on the tremendous emotional response we’ve seen to the game, I don’t think we’re even close to either of those extremes; a rudimentary pass suggests that some groups play better as fascists and some as liberals, but overall there’s balance. I think that’s a good thing! Groups should have to figure out how to make one team or the other work. I’m interested in creating a measure that helps me understand this better.
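One candidate for that measure, offered as a sketch rather than a settled choice, is a dispersion ratio: compare how much each group’s fascist win count strays from the pooled win rate against how much it would stray if every group were flipping the same coin. A ratio near 1 looks like the coin-flip extreme, a ratio well above 1 looks like the “some groups always win as fascists” extreme, and a ratio well below 1 means groups split more evenly than chance alone would produce. The function name and the input shape (one pair of fascist wins and games played per group) are assumptions made for this example.

```python
def dispersion_ratio(groups: list[tuple[int, int]]) -> float:
    """Pearson chi-square dispersion of per-group fascist win counts.

    groups: one (fascist_wins, games_played) pair per play group.
    Returns chi-square / (number of groups - 1).
    """
    if len(groups) < 2:
        raise ValueError("need at least two groups to compare")
    total_wins = sum(wins for wins, _ in groups)
    total_games = sum(games for _, games in groups)
    pooled = total_wins / total_games  # overall fascist win rate
    if pooled in (0.0, 1.0):
        raise ValueError("one team won every game; ratio is undefined")
    # Each term is a group's squared deviation from the pooled rate,
    # scaled by the variance a shared coin would give that group.
    chi_square = sum(
        (wins - games * pooled) ** 2 / (games * pooled * (1 - pooled))
        for wins, games in groups
    )
    return chi_square / (len(groups) - 1)


# Made-up inputs for illustration only, not survey data.
lopsided_groups = [(5, 5), (0, 5), (5, 5), (0, 5)]  # all-or-nothing groups
even_groups = [(2, 4), (2, 4), (2, 4), (2, 4)]      # every group splits evenly
print(dispersion_ratio(lopsided_groups))
print(dispersion_ratio(even_groups))
```

With at most five games per group, any one group’s record is noisy, so a ratio like this is only meaningful across many groups at once.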