Collective Decision-Making with AHP

How the NYT Identity team tried out the Analytic Hierarchy Process to select a user ID format.

The NYT Open Team
8 min read · Oct 13, 2022
Illustration by David Huang

By David E. Wheeler

When in the course of Engineering events it becomes necessary to make a decision between alternatives, how does a team go about it? Often a single engineer decides, based on their expertise and experience, when the scope is narrow enough not to need the input of the whole team. But what about decisions that impact many teams, processes, future technology choices, and organization-wide use cases? Who decides then, and how?

The Identity Team at The Times, responsible for building and maintaining identity and authentication services for all of our users, has embarked on an ambitious project to build a centralized identity platform. We’re going to make a lot of decisions, such as what languages we should use, what database, how we can best protect personal information, what the API should look like, and so much more. Just thinking about the discussions and consensus-building required for a project of this scope daunts even the most experienced decision-makers among us. Fortuitously, a presentation at StaffPlus NYC by Comcast Fellow John Riviello introduced a super fascinating approach to collective decision-making, the Analytic Hierarchy Process (AHP).

AHP participants compare pairs of evaluation criteria to establish their prioritization. Then they compare each pair of alternatives for each criterion. For example, Wikipedia’s leader example compares four criteria (six pairwise comparisons) for three candidates (three pairwise comparisons for each of the four criteria).

We thought AHP might be a good tool to ensure a careful evaluation and a choice we could be well-aligned on. But getting a sizable group of people together to run such a session on a slew of criteria and options will take some doing: they have to decide on the criteria and diligently evaluate each option in advance. And the number of pairwise comparisons multiplies quickly as options and criteria are added. A more detailed example for choosing a family car evaluates six models against 10 criteria for a total of 120 pairwise comparisons. It’s a lot.
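To get a feel for the arithmetic, here’s a quick sketch of the comparison count, assuming every pair of criteria and every pair of options under each criterion gets compared:

```python
from math import comb

def ahp_comparison_count(num_criteria: int, num_options: int) -> int:
    """Pairwise comparisons: one set for the criteria themselves,
    plus one set of option-vs-option comparisons per criterion."""
    criteria_pairs = comb(num_criteria, 2)
    option_pairs = num_criteria * comb(num_options, 2)
    return criteria_pairs + option_pairs

# Wikipedia's leader example: 4 criteria, 3 candidates -> 6 + 12 = 18
print(ahp_comparison_count(4, 3))  # 18

# Our ID-format trial below: 5 criteria, 4 options -> 10 + 30 = 40
print(ahp_comparison_count(5, 4))  # 40
```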

An Experiment

The team decided to trial the process without making a huge time commitment. We just needed a more focused problem space to run through AHP. Fortunately, in designing the user schema for identity — which defines the types and structure of personal information about Times users — we have recognized the need for a new, canonical ID for New York Times readers. (Privacy is essential when handling identifiers. We won’t cover privacy here, but have written about how we think about it and how we manage it in the past.)

We decided to trial AHP by evaluating a short list of ID formats against an equally short list of pertinent criteria. Over several weeks of research and diligence, we added and removed criteria and options, and settled on the hierarchy in this AHP-style diagram:

Once we felt the options and criteria were sufficient and well-documented, we got the whole team together on a Friday to run through the AHP process. Here’s how it went. (Warning: lots of data and tables ahead; if you’re not interested in the details of how an AHP session runs, skip down to Discussion for outcomes and conclusions. No judgment, this is an annoyingly long post!)

Weighting Criteria

The evaluation criteria we settled on were:

  • Database Support: Do database engines typically support it natively, either as an explicit data type or via a binary representation?
  • Developer UX: How nice is the ID for use by developers? Is the format easy to work with, to copy and paste, and supported by libraries? Length is less of an issue, though shorter is still nicer than longer.
  • Distributed Uniqueness: The ability to generate a unique identifier in a distributed fashion, among a number of distributed nodes, without collision.
  • Ordering: The IDs are ordered and can be sorted, e.g., by time. Useful also for range partitioning, but not hash partitioning.
  • Randomness: Is the format sufficiently non-sequential or random to minimize the chances of ID discovery by cycling through IDs? Sequences, for example, are problematic: if a user’s ID is 93823, they can probably find that 93824 and 93825 are valid IDs, too. Even ordered but not strictly sequential IDs are somewhat easier to discover than fully random IDs.

For each pair of criteria, we discussed the relative importance of one over the other until we settled on a consensus score. The result is a matrix of pairs, with 1 assigned to the less important of each pair and a number between 1 and 9 representing the intensity of importance of the other. This scale (from Wikipedia’s leader example, following Saaty’s fundamental scale) defines the intensity of importance for each score:

  • 1: Equal importance
  • 3: Moderate importance of one over the other
  • 5: Strong importance
  • 7: Very strong importance
  • 9: Extreme importance
  • 2, 4, 6, 8: Intermediate values between the adjacent judgments

For the criteria, our final scoring was:

Here’s how to read it:

  • Developer UX is moderately more important than database support
  • Distributed Uniqueness is strongly more important than database support
  • Database support is slightly more important than ordering
  • Randomness is slightly more important than database support

And so on. Plugging these numbers into Comcast’s AHP webapp, we get this nice diagram illustrating the relative weighting of each criterion based on the AHP calculations:

Distributed uniqueness was far and away the most important criterion. Developer UX was a distant second, with the others declining in importance from there. Here’s a sorted table of the weightings, which sum to 1:
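If you’re curious what the webapp computes behind the scenes: AHP classically derives the weights as the principal eigenvector of the reciprocal comparison matrix, normalized to sum to 1 (tools may use that method or an approximation such as the geometric mean of the rows). Here’s a minimal sketch; the matrix encodes the four judgments listed above, while the remaining entries are illustrative guesses rather than our actual scores:

```python
import numpy as np

# Hypothetical reciprocal comparison matrix, for illustration only.
# Rows/columns: DB support, Dev UX, Distributed uniqueness, Ordering, Randomness.
# A[i][j] > 1 means criterion i outranks criterion j; A[j][i] is its reciprocal.
A = np.array([
    [1,   1/3, 1/5, 2,   1/2],
    [3,   1,   1/3, 4,   2  ],
    [5,   3,   1,   6,   4  ],
    [1/2, 1/4, 1/6, 1,   1/3],
    [2,   1/2, 1/4, 3,   1  ],
])

# The priority weights are the principal eigenvector, normalized to sum to 1.
eigenvalues, eigenvectors = np.linalg.eig(A)
principal = np.real(eigenvectors[:, np.argmax(np.real(eigenvalues))])
weights = principal / principal.sum()
print(weights.round(3))  # one weight per criterion, summing to 1
```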

Scoring Options

Next, for each criterion we compared each pair of options, using the same 1-to-9 scoring scale. The options we settled on were:

  • UUID
  • Nano ID
  • Snowflake ID
  • XID

Final Score

Plugging those numbers into Comcast’s AHP webapp gives us this final scoring:

UUID is the big winner here, with Nano ID in second place and Snowflake ID and XID trailing behind. Here’s a sorted table of the scores, which sum to 1:
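The synthesis step that produces this final scoring is just a weighted sum: each option’s overall score is its per-criterion priority multiplied by that criterion’s weight, summed across criteria. A sketch with made-up numbers (not our actual results) shows the mechanics:

```python
import numpy as np

# Criterion weights from the previous step (hypothetical values, sum to 1).
weights = {"db": 0.08, "ux": 0.24, "unique": 0.48, "order": 0.06, "random": 0.14}

# Local priorities per criterion (each list sums to 1), also hypothetical.
options = ["UUID", "Nano ID", "Snowflake ID", "XID"]
local = {
    "db":     [0.55, 0.10, 0.20, 0.15],
    "ux":     [0.40, 0.25, 0.20, 0.15],
    "unique": [0.35, 0.35, 0.15, 0.15],
    "order":  [0.10, 0.10, 0.45, 0.35],
    "random": [0.35, 0.35, 0.15, 0.15],
}

# Final score: weighted sum of local priorities across criteria.
scores = np.zeros(len(options))
for criterion, weight in weights.items():
    scores += weight * np.array(local[criterion])

for name, score in sorted(zip(options, scores), key=lambda pair: -pair[1]):
    print(f"{name:12s} {score:.3f}")  # scores also sum to 1
```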

Discussion

There were a couple of surprises in the outcome of this process.

We had expected distributed uniqueness to be the most important criterion, given our plan to adopt a distributed database. We had not anticipated that developer experience would rank so highly. In discussing what “developer experience” meant, we expanded it (from the original notes about ease of double-clicking for copy and paste) to include library support across a variety of languages. This led to weighting UX higher than randomness, ordering, or database support.

This expanded interpretation of developer experience also unexpectedly allowed UUID, whose awful default format is not double-clickable (try it on this: 3c893566-c125-4741-a68a-33e91410b7e2; it won’t select the whole thing), to win out over Snowflake, which has an aesthetically pleasing string representation but does not have the broad industry and library support of UUID.

As for the final analysis, we had expected UUID and Snowflake to end up the top two choices, and could not have said which one would win. Finding UUID to be the clear leader was unsurprising, but Nano ID handily defeating Snowflake was a real curveball. The explanation for this result almost certainly lies in Snowflake’s poorer showing on the distributed uniqueness criterion. Yes, it very much supports uniqueness in distributed systems, but the need to configure a unique node ID for each instance generating Snowflakes, together with its expiration date (Snowflake IDs cannot be created beyond a certain date, generally 69 or 174 years out; the classic 41-bit millisecond timestamp overflows after about 69 years, while variants with coarser-grained timestamps stretch further), led to UUID and Nano ID freezing it out: neither expires nor requires configuration.

Personally, I’m glad Nano ID lost to UUID, because Nano ID has quite poor database support and is much larger than UUID (21 bytes vs. 16 bytes). I was rooting for Snowflake due to its small size and efficient algorithm, and because I’m not fond of UUID’s default string format. But I take consolation in the fact that we don’t have to use that format. If we Base62 encode a UUID, rather than a typical hideous string like 3c893566-c125-4741-a68a-33e91410b7e2, we can have ifWIoI9ZU00gOqkgNrmE5B. Much more compact (22 vs. 36 chars) and double-clickable (no dashes; try it!).
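For the curious, here’s a minimal sketch of that Base62 encoding. Note that the output depends on the alphabet ordering you pick (digits-first below is just one common convention), so your result may differ from the example string above:

```python
import string
import uuid

# 0-9, A-Z, a-z: one common Base62 alphabet; other orderings are equally valid.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase

def uuid_to_base62(u: uuid.UUID) -> str:
    """Encode a UUID's 128-bit integer value in Base62, padded to 22 chars
    (62**22 > 2**128, so 22 characters always suffice)."""
    n = u.int
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out)).rjust(22, ALPHABET[0])

print(uuid_to_base62(uuid.UUID("3c893566-c125-4741-a68a-33e91410b7e2")))
```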

One way or the other, Identity user IDs will be UUIDs. And this exercise made that choice uncontroversial among members of the team.

Experimental Results

We consider the AHP experiment a great success. The team enjoyed the process, as it gave us a chance to closely examine the criteria and options, to better understand the features and trade-offs of each option, and to see our fine-grained analysis nicely captured by the algorithm. As we made our way through, it became pretty clear what the winner would be, and that’s a good thing: we gave all of the options a proper airing and felt that the objective result reinforced our findings.

We plan to use AHP next to select a distributed database for the platform. We’ll likely have more criteria, options, and stakeholders, so perhaps we’ll need to take a full day to run through it (as opposed to the 90 minutes or so for this one). But the diligence it requires and consensus it builds will allow us to be confident in our conclusions, satisfied by the result, and agreed on the choice.

Acknowledgements

I’d like to thank the members of the identity team for gamely running through the AHP, including Brigitte Lamarch, Pete Saia, An Yu, Alex Detrick and Jethro Chu. Deep appreciation to Penina Kessler, Danielle Roades Ha, Shilpa Kumar, Shawn Bower, Tanya Forsheit and Robin Berjon for the close reading of drafts of this post that made it so much better.

David E. Wheeler is a staff engineer for The New York Times where he works on designing and building the next generation identity platform. He is a lifetime student of systems design, privacy, team enablement, culture and society, as well as fermentation.


The NYT Open Team

We’re New York Times employees writing about workplace culture, and how we design and build digital products for journalism.