F: Feedback with relevant metrics

Post no. 4 in a 5-part series on basic conditions for expertise

Do you or the potential expert have feedback loops that accurately show whether your/their expertise is increasing and whether your/their judgements are accurate? (Or have you/they had these in the past?)

In domains where reality does not give good feedback, they need a set of well-honed heuristics or proxy feedback methods to correct their output if the results are going to be reliably good (this goes for, e.g., philosophy, sociology, and long-term prediction). In domains where reality does give good feedback, they don’t necessarily need well-honed heuristics or proxy feedback methods (e.g., massage, auto repair, swordfighting). All else equal, superior feedback loops have the following attributes (idealized versions below):

  • Speed (you learn about discrepancies between current and desired output quickly after taking an action so you can course-correct)
  • Frequency (the feedback loop happens frequently, giving you more samples to calibrate on)
  • Validity (the feedback loop is helping you get closer to the output you actually care about)
  • Reliability (the feedback loop consistently returns similar discrepancies in response to you taking similar actions)
  • Detail (the feedback loop gives you a large amount of information about the difference between current and desired output)
  • Saliency (the feedback loop delivers attentionally or motivationally salient feedback)

Examples

You want to predict technology timelines

Julie and Kate both claim to be experts in technological forecasting. When you ask Julie how she calibrates her predictions, she replies, “Mainly, I just have a sense for these sorts of things. But I also do things like monitor Google Trends, read lots of articles on technology, and ask lots of people what they think will happen. I’ve been doing this for 20 years.” She then points to a number of successful predictions she’s made. When you ask Kate the same question, she replies, “Well, in the short term, it’s been shown that linear models of technological progress are the best, so I tend to use those to calibrate on the timespan of 1–3 years. If I make longer-term predictions, I try to tell as many stories as possible for how those predictions may be false. Then I try to make careful arguments that rule out these stories. Furthermore, I always check whether my predictions diverge substantially from those of other technological forecasters. If they do, I try to figure out why. I’ve also identified a number of technological forecasters who have consistently good track records, and I study their methods, evidence, and predictions carefully. Finally, whenever one of my predictions turns out to be false, I spend about a week figuring out whether there is any general principle to be learned to guard against being wrong in the future.”

Who do you think has the better marker of expertise? Why?

Technological forecasting is a domain in which reality doesn’t provide strong feedback, so you need proxy feedback. Julie does not have good proxy feedback: her sources inform her predictions but do not tell her when she is wrong, and a list of successful predictions says little without the misses alongside it. Kate has relatively decent proxy feedback methods. Barring special information about Julie, Kate’s predictions are likely to be more reliable, all else equal.
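
One way to turn a track record into proxy feedback is to score every resolved prediction, hits and misses alike. The sketch below is only an illustration: it assumes each forecast was recorded as a probability plus an eventual outcome, it uses the Brier score (which neither Julie nor Kate is described as using), and both sample records are made up.

```python
def brier_score(forecasts):
    """Mean squared gap between stated probability and outcome.

    0.0 is a perfect record; lower is better.
    """
    if not forecasts:
        raise ValueError("need at least one resolved forecast")
    total = sum((p - float(happened)) ** 2 for p, happened in forecasts)
    return total / len(forecasts)


# Hypothetical records: (probability assigned, whether the event happened)
confident_record = [(0.9, True), (0.9, False), (0.9, True), (0.9, False)]
hedged_record = [(0.6, True), (0.4, False), (0.6, True), (0.4, False)]

print(round(brier_score(confident_record), 2))  # 0.41
print(round(brier_score(hedged_record), 2))     # 0.16
```

Both hypothetical forecasters leaned the right way on two events out of four, but the overconfident record scores worse. A list of successful predictions alone would hide that difference.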

You want to choose a piano teacher

Both Ned and Megan are piano teachers. Of the two, Ned is a much better pianist, having won many awards and played at Carnegie Hall many times. You ask both Ned and Megan how they can tell whether their teaching is working for a given student. Ned replies that he simply looks at the outcomes: if a student practices under him for several years, they become much better. “Basically, I show them how to play scales and pieces well, and then I check in about once every other week to make sure they are practicing the drills I showed them.” Megan replies with a detailed set of ways she tracks a student’s rate of progress and how she adjusts her teaching accordingly. “For example, I know whether a student has ‘chunked’ a given chord through the following method: I stand behind the piano and quickly turn around a piece of paper with a chord on it and time how many milliseconds it takes for a student to react and play the chord. Also, each week I ask them to honestly report on whether they feel as if the chord is still a series of notes or whether it feels more like ‘one note.’ This indicates that the chord has become a ‘gestalt’ in the student’s mind. Another example: whenever a student makes an error while playing a piece, I mark the corresponding area in the sheet music. Eventually, I can tell what types of errors a student generally makes by analyzing the darkest areas on various pieces, that is, the places with the most pen marks.” Megan goes on to give you similar examples.

Who do you think has the better marker of expertise? Why?

In this case, while Ned may be the better pianist, he may not be the better piano teacher. He seems to lack feedback loops that would tell him whether his teaching is succeeding. While he notes that his students improve over time, he is not entertaining the possibility that they would have improved over those years even without his intervention. Megan, by contrast, describes feedback that is frequent, detailed, and tied to specific skills.
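
Megan’s pen-mark method amounts to a simple tally. A minimal sketch, assuming each slip is logged as a piece, a measure number, and an error type (the log format, piece names, and function names are hypothetical, not something Megan describes):

```python
from collections import Counter

# Hypothetical log: one entry per slip, as (piece, measure number, error type)
error_log = [
    ("Minuet in G", 12, "wrong note"),
    ("Minuet in G", 12, "wrong note"),
    ("Minuet in G", 7, "rhythm"),
    ("Minuet in G", 12, "rhythm"),
    ("Fur Elise", 3, "wrong note"),
]


def darkest_spots(log, top=2):
    """Return the (piece, measure) locations with the most marks."""
    marks = Counter((piece, measure) for piece, measure, _ in log)
    return marks.most_common(top)


def error_types(log):
    """Count how often each type of error occurs across all pieces."""
    return Counter(kind for _, _, kind in log)


print(darkest_spots(error_log, top=1))  # [(('Minuet in G', 12), 3)]
print(error_types(error_log)["wrong note"])  # 3
```

The tally gives detail (where and what kind of error) that a check-in every other week would not.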

You want to hire a manager

Both Todd and Greg have applied for a manager position at your organization. You ask each of them about their process for monitoring the rate at which their teams are making progress on goals. Todd: “I have everyone on a system where I can monitor the number of Pomodoros each person is completing. If certain team members are lagging behind in their number of Pomodoros, I give them a pep talk, after which the number tends to go back up.” Greg: “I have each team member set daily subgoals. Then I look at two things: (a) whether these subgoals tend to align with the broader goals and (b) whether they are achieving the subgoals they set for themselves. If a team member is lagging behind in (a) or (b), I give them a pep talk, after which they tend to perform better.”

Who do you think has the better marker of expertise? Why?

In this case, both Todd and Greg have decent feedback loops. However, Todd’s feedback loop is more likely to fall victim to Goodhart’s law. In other words, though his method might be high in reliability, the measure of Pomodoro-maximization might accidentally become the target, even though the intended target is goal completion. Greg’s feedback loop is higher in validity, in that it measures the target he actually cares about more tightly.
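
Greg’s two checks can be written down as two rates per team member. This is a rough sketch under assumed names (`Subgoal`, `review`, and the sample week are all hypothetical); it only illustrates that the quantities being tracked are about goals rather than time spent.

```python
from dataclasses import dataclass


@dataclass
class Subgoal:
    aligned: bool   # (a) does the subgoal serve the broader goal?
    achieved: bool  # (b) did the team member complete it?


def review(subgoals):
    """Return the share of subgoals that were aligned and the share achieved."""
    if not subgoals:
        raise ValueError("need at least one subgoal")
    n = len(subgoals)
    return {
        "aligned_rate": sum(s.aligned for s in subgoals) / n,
        "achieved_rate": sum(s.achieved for s in subgoals) / n,
    }


# Hypothetical week for one team member
week = [
    Subgoal(aligned=True, achieved=True),
    Subgoal(aligned=True, achieved=False),
    Subgoal(aligned=False, achieved=True),
    Subgoal(aligned=True, achieved=True),
]

print(review(week))  # {'aligned_rate': 0.75, 'achieved_rate': 0.75}
```

A Pomodoro count could be high for this same week whether or not either rate was, which is the gap between Todd’s measure and the target.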

Ways of assessing

I. Ask questions which will reveal the details of their feedback loops (and whether they have them), such as:

a. “Let’s say I’m already a proficient coder, but I want to learn how to code at the level of a master. What sorts of problems might I practice on to move from proficiency to mastery? Are there any textbooks I should read?” (For a software engineer)

b. “In what ways do people typically stumble when they try to improve at data analysis?” (For a data analyst)

c. “How do you tell whether a marketing campaign is working?” (For a professional marketer)

d. “Can you tell me a bit about how you learn?”

II. Find out whether they’ve been part of a job, program, or mentorship that would have given them strong feedback loops.

Caveat: Many jobs, programs, and mentorships don’t cause expertise-gains, so look for jobs, programs, and mentorships with a good track record of producing talented people.

III. Check whether the world itself provides feedback in the domain.

Sometimes, people with tacit expertise will not be able to articulate their feedback loops. Analyze whether reality provides robust feedback in their domain. For example, a bike-rider might not be able to describe the feedback loops through which they learned bike-riding. However, reality automatically provides feedback in the domain by causing novice bike-riders to fall over, until they accumulate enough procedural knowledge to balance on two wheels.

>> Next post: Time (and series wrap-up)

Like what you read? Give Tyler Alterman a round of applause.
