Sorting product “baseball cards”: bridging behavioral interviews and prototypes

Andy Matuschak
Khan Academy Early Product Development
Aug 24, 2018


We’re excited to share a user research method that helped us bridge the gap between wide-ranging behavioral interviews and detailed validation of specific value propositions. We made quick progress by asking teachers to talk through how they’d rank “baseball cards” describing hypothetical products we’d synthesized out of insights from a first round of interviews.

For many months, we’ve been exploring: how might we help students build deep understanding through writing activities on online learning platforms? We’re initially focusing on AP-level history, though we’re trying to build something general. Answering that big question has meant finding good intersecting answers to four separate questions:

  1. Pedagogy. How might we define scaffolds and feedback mechanisms around open-ended activities to best support students in building understanding?
  2. Interface design. How might we design interactions in these online writing activities to support, generate, and execute our pedagogy’s scaffolding and feedback?
  3. Value proposition. How might these kinds of writing activities solve burning problems for teachers and naturally integrate into their practices?
  4. Content creation. How might these writing activities best fit into our broader curricula, and how might we frame their structure so they can be practicably authored at scale?

In this post, we’ll focus on the value proposition question. We were confident that if we succeeded with the pedagogy and interface design questions, we’d be creating something that could really help students. But we felt we could have the most impact reaching students through the rich environment of a classroom — a context which layers on its own significant opportunities and challenges.

We knew that we’d need to deeply understand how these activities might be used in real classrooms, as part of a teacher’s broader mental models, classroom practices, and feedback mechanisms. With that understanding, we could shape the activity’s surfaces so that what happens outside the activity resonates with what happens inside the activity, amplifying both sides. And critically, that understanding would help us frame these activities in a way teachers would be thrilled to use.

We knew that we’d have to talk with plenty of teachers to build the understanding we’d need, so we began by recruiting a few dozen teachers who were willing to chat with us. Our first interviews stayed broad: we asked teachers about their background, environment, practices, values, fears, challenges, dreams. We gleaned all kinds of interesting insights, like:

  • Most AP classes are now open enrollment (i.e. any student can join), meaning teachers are contending with a huge range of student capacities.
  • Even in history, many teachers had over a hundred students. Grading a single essay assignment could take weeks.
  • Teachers were apprehensive about the AP program’s increasing emphasis on disciplinary skills over content knowledge; they lacked strong resources for preparing students to e.g. analyze historical continuities across time periods.
  • They often worked on those skills through peer work in class, but those activities are hard to orchestrate.
  • There’s so much to cover that homework time has to be spent almost entirely on readings.

After synthesizing these themes and identifying our key personas, we wrote six different pitches for a hypothetical product: different angles, different pain points, different timings. Collectively, they spanned the space of opportunity we saw. Once we had our pitches written out, we formatted them like little product “baseball cards” for easy comparison and arranging.

Then, in another round of interviews with different teachers, we abbreviated the broader user research questions to focus on a card sorting activity. It went like this:

  1. We showed teachers all the cards and explained that they represented hypothetical products.
  2. We asked them to read through them one by one, verbally reacting along the way.
  3. We told them they could try a prototype of one of these products in their class next week and asked which one, if any, they’d be most interested in trying.
  4. We asked them which one seemed least useful. Then we had them rank-order the others in the middle.
  5. Finally, we asked them how often they’d be excited to use each of the cards in their classes.

Their rank-ordering was incredibly useful: it helped us eliminate a few possibilities and shed light on a promising subset of the space. But teachers’ verbal reactions as they read the cards were equally helpful. The pitches prompted teachers to connect the ideas to others in their experience: they linked our cards to existing practices, told related stories from their classroom life, and imagined aloud how they might integrate the product. Their words brought the hypotheses alive and helped us share our findings convincingly with stakeholders.

We took all that insight and refined our existing designs to accord with the pitches teachers had been most eager to try. We didn’t just pick the most popular baseball card: the pitches served as a fulcrum for understanding teachers’ mental models and needs. We used what we learned to synthesize a solution that could fulfill several of those cards, as well as others we hadn’t yet written.

For our next phase of interviews, we abbreviated the card sorting activity and focused on introducing teachers to a real prototype. This new set of teachers loved what we’d made (which helped validate our synthesis), and as with the pitches, the prototype prompted the teachers to share more interesting thoughts, feeding our iteration. Meanwhile, we kept our promise to the teachers who’d done the card sorting activity: we set them up with our prototype for a live classroom pilot. More great learning ensued.

It’s always a struggle to build a bridge from behavioral interviews’ broad insights to a viable prototype. By writing these intentionally distinct “baseball cards,” we turned the initial interviews into something concrete — but not so concrete as a prototype, which often over-constrains the conversation. And while it might have seemed like we could have written those pitches without the initial broad interviews, that too would have constrained the conversation too early. At least to me, this gradual narrowing felt just right.
