A lot of smart people have recently pointed out how becoming a polymath might be the most future-proof self-improvement strategy on the market. A generalist knows how to learn and relentlessly applies this skill to a broad variety of topics. A valuable asset in our ever-more complex and unpredictable world “calling for range, not specialization.”
At first glance, that argument sounds like a fat plus in the Pro column for ‘read widely’. But take another look: it’s a case formulated entirely in the language of speed and efficiency.
The efficiency argument
You’ve read a lot. You have a grey beard. You get to the gist of things at above-average speed. You have a broad knowledge base, which facilitates swifter learning, which widens your repertoire of concepts, which in turn makes you better equipped to understand new material, and so forth. An upward spiral.
As a polymath, in other words, you have an economy-of-scale advantage that makes your learning curve ever steeper.
“This ironically then allows you to specialize in something else faster if you so choose,” Zat points out (my emphasis). “An incredibly valuable advantage.”
This is probably what lies behind Ryan Holiday’s observation that, as the number of connections you’re aware of increases, the ROI of reading grows exponentially. A generalist might have the ideal skillset for leveraging this mechanism.
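A toy way to make this compounding concrete (my own sketch, not from Holiday): if every new concept you learn can be paired with each concept you already hold, the pool of potential connections grows far faster than the pile of concepts itself.

```python
# Toy illustration (my own assumption): every concept can, in principle,
# connect with every other concept, so the number of possible pairwise
# connections among n concepts is n * (n - 1) / 2.
def pairwise_connections(n_concepts: int) -> int:
    """Number of possible concept pairs among n concepts."""
    return n_concepts * (n_concepts - 1) // 2

for n in (10, 100, 1000):
    print(f"{n:>5} concepts -> {pairwise_connections(n):>7} potential connections")
```

Multiplying your concepts by 100 multiplies your potential connections by roughly 10,000, which is one (simplified) way to see why broad readers report compounding returns.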
Is it lunchtime yet?
Granted. Nonetheless, a feeling in my stomach tells me there’s gotta be more to it. And that’s not because I’m hungry. Because I’m not. I just had breakfast.
The efficiency argument isn’t necessarily wrong, but it’s not the whole story about what makes generalism a smart strategy.
On a deeper level, the more profound advantage of being a polymath is not about having a steeper learning curve, but about being on a different learning curve altogether.
The connections you’re aware of don’t just grow in number — quantitatively. They increase qualitatively.
Hedgehogs and foxes
I admit: “Quality of connections” sounds rather vague. Does ‘higher quality’ mean these connections represent information that’s worth more money, or is more helpful for your life, or has certain aesthetic properties, or …?
I’m going to try to explain what I have in mind by correcting a common misunderstanding about the popular fox-hedgehog distinction.
By the end of the essay, the following should make sense: Being a polymath is not about being a fox.
Theory or data?
Insofar as academic philosophy ever makes it into common parlance and it’s not about DOES THE TREE MAKE A SOUND WHEN NOBODY HEARS IT OMG SO DEEP, Isaiah Berlin’s distinction between “hedgehogs” and “foxes” is pretty fashionable among folks who like to talk about mental models and tend to read articles like this.
In its simplified version, it amounts to this: a fox knows many things, but a hedgehog knows one important thing. Beneath the surface, however, the contrast isn’t so much about what you know (many or few things) as about how you think (inductively or deductively).
In Superforecasting: The Art and Science of Prediction, political scientist Philip Tetlock also tackles the topic:
“…hedgehog forecasters first see things from the tip-of-your-nose perspective. That’s natural enough. But the hedgehog also “knows one big thing,” the Big Idea he uses over and over when trying to figure out what will happen next. Think of that Big Idea like a pair of glasses that the hedgehog never takes off. The hedgehog sees everything through those glasses.”
This takes us away from understanding hedgehogs and foxes as a matter of beliefs about the world (what you know), to understanding the difference as a matter of beliefs about how to reason (how you think).
Watch out for confirmation bias
Many people, like Tetlock, see the fox’s mindset as preferable to the hedgehog’s on the grounds that it’s less prone to fantasy and dogmatism.
Foxes can do many things; hedgehogs can do only one: curl up into a ball. The hedgehog’s pitfall, accordingly, is that he has a hard time seeing outside his own mental model. This makes him vulnerable to confirmation bias: he starts trying to make the world fit his own theory. When reality falsifies his Single Defining Idea, he’d rather fiddle with reality than update his worldview and discard his cherished Doctrine.
When you only have a hammer, the whole world starts to look like a nail.
And the common understanding seems to be that, on the other side, polymaths shine as versatile foxes. They know lots of facts from all kinds of fields. Bit by bit, they build up their ultimate map of reality, adding lots of local observations while resisting the temptation to move to theories and global beliefs.
Who needs theory when you have facts?
I believe many people tie too much of their mental conception of what good reasoning looks like to the stereotype of the humble empiricist fox.
Their mental contrast between empiricism and theoreticism is so strong that they think it’s unsafe to have a theory at all. That having a theory makes you a bad hedgehog with One Big Idea who always finds a way to hold on to it despite contrary observations.
This, however, is an overreaction to the hedgehog’s danger of seeing the same ‘explanation’ everywhere, independent of whether it obtains.
Many assume it’s better to be a pure fox: only collect observations and avoid big theories. It’s not.
Here’s why: the hedgehog knows nothing because he has only one theory and thus sees the same thing everywhere he looks, which is the same thing as being blind. The fox understands nothing because he has no theory, which is hardly an improvement.
Facts are nothing without interpretation.
Which ‘one of those’ is this? Why polymaths aren’t pure foxes
To interpret facts, we rely on mental models.
A mental model is a representation of the surrounding world: an abstract model of a certain region of reality that is supposed to help you understand things. It achieves this by simplifying the territory it represents, not by representing the world perfectly:
A subway map can distort reality to better help you navigate. Likewise, we humans can distort our view of the world to better help ourselves navigate life. — Charles Chu — Mental Models, Dragonfloxes, and How to Think Real Good
The key thought behind mental models is that every problem can be simplified as another ‘one of those’ — another one of a certain type. Mental models allow us to ‘fit’ different possible interpretations onto reality to see if it is ‘one of those’.
For instance, according to the Pareto principle, “for many events, roughly 80% of the effects come from 20% of the causes.” On the other hand, especially in murky cases, it’s intuitive to start from a linear model in which the important contributing factors have more or less equal weight.
When reviewing your client database, should you assume your income is distributed equally over paying customers (and only remove the non-paying ones), or should you assume that 20% of your customers bring in 80% of your cashflow, and also remove the paying customers from the bottom 20%? Which ‘one of those’ is this? A linear distribution or a Pareto distribution?
Thinking in terms of ‘one of those’, simplifying and categorizing situations, though, is theory through and through.
Mental models are not facts
Many situations are going to look alike in many respects, yet there are some properties shared by all the cases to which the Pareto distribution applies, and only by those. But what are these distinctive features? What are the relevant similarities?
What are the very few cues you have to pay attention to infer it’s ‘one of those’? What is the slice of data that enables you to categorize this state of affairs as ‘one of those’ and get a deeper understanding of it?
What is the hidden signal?
“I think this is a non-linear Pareto situation where 20% of customers bring in 80% of the revenue because there’s a lot of cold calling involved in sales relations.”
How do you know that’s an indicator? Why does that matter, and not, for example, the male-female distribution on the marketing team?
Facts are nothing without interpretation, and interpreting the facts at hand requires a theory that explains why these similarities are the relevant ones — why situations that have in common that there’s cold calling involved in sales relations follow the 80/20 rule.
The facts you’ve “collected” enable you to test models of reality and craft unique theories of why surface-level resemblances occur, which characteristics are the significant ones and why a thing is “one of those”. It’s theorizing about the patterns and testing these models that enables you to deduce the significant characteristics of a particular case.
It’s your theory that tells you the nature of the customer contact is the characteristic that guides which ‘one of those’ this might be and is the feature that allows you to make inferences about a situation without having to sift through all the data and calculate whether a Pareto distribution applies here.
The advantage is in having better theories, not in knowing more facts
Theories, not facts, allow you to discover the deep structural similarities between superficially different situations. Facts alone don’t tell you which similarities are relevant. Hence ‘knowing more facts’ is in itself a rather shallow information advantage: it doesn’t help you filter the signal, which is what it’s all about.
An empiricist who is only allowed to rely on facts can only learn surface generalizations about whether this phenomenon superficially “looks like” that phenomenon or not. If you just “stick to the facts,” you can’t move from surface-level resemblances to a more technical understanding of particulars.
By contrast, the strength of theories is that figuring out what type some problem is and categorizing it as ‘one in which Pareto’s rule applies’ (or not) will help you discover deep structural similarities between superficially different phenomena.
The hedgehog’s shortcoming is that he can only see the world in one way. But avoiding that mistake by throwing out models altogether — “just give me the facts” — goes too far in its skepticism of model-building and deductive application.
The fundamental error, then, that causes one to overlook these uses of theory is this: just because I have a theory does not mean I have to be insensitive to the evidence:
During and after the construction of that model you need to look at the data. You still need fox-style attention to detail — and you certainly need empiricism. Developing accurate beliefs requires both observation of the data and the development of models and theories that can be tested by the data. In most cases, [if you want to achieve understanding] there’s no real alternative to [going beyond the facts and] sticking your neck out, even knowing that reality might surprise you and chop off your head. — Eliezer Yudkowsky, Inadequate Equilibria
Now, to close, let’s return to the topic of generalists.
Seeing better, not knowing more
Polymaths might be broad and effective learners, but they are, above all, magnificent theorizers. Their ultimate advantage is not in the breadth of their knowledge but in their ability to understand things on a deeper level.
This ability, in turn, is not dark magic but bottoms out in (i) the models, not the facts, in your head and (ii) your ability to identify collections of facts as the correct ‘one of those’.
Only when you have models can you use analogous structures to make deductive inferences about a given phenomenon or system without having been exposed to it and having played with it beforehand.
The point of this, in turn, is not to have learned more, or to have the means to learn (even) faster. This high-quality thinking means you’ll discern signal where others just detect noise. The ultimate advantage is that you’ll interpret better, assess more accurately, and understand more deeply what works — why, how, and when.
A wide array of tested theories about which very few properties of a situation determine which ‘one of those’ it is (signal), and which are irrelevant to its categorization (noise), helps you ‘see’ what’s going on and make connections across traditional boundaries. Connections others don’t see.
Being a generalist can be powerful. Steve Jobs was not a specialist or a technician; he was a conductor who could connect dots others couldn’t.
The hidden sound of things reaches polymaths, and they listen reverently, while in the street outside people hear nothing at all.
There’s more to that
If you’re looking for more fictional people with gray beards waving a stick, please subscribe to my personal blog. You’ll get a weekly dose of similarly mind-expanding ideas.