# What are mental models?

Mental models are frameworks that we can use to make sense of the world around us. In a reality as complex as ours, we need mental models to simplify concepts, make connections between ideas and identify important information from the surrounding sea of noise.

Charlie Munger, Warren Buffett’s less public (but no less successful) partner at Berkshire Hathaway, attributes much of his success to accumulating and applying hundreds of mental models.

For most of his life, Munger has been thinking about ways to think better. Rather than memorising isolated facts, he argues that you need to string facts together in a model or theory to organize them into a usable form. Importantly, your mental models need to come from a variety of disciplines and schools of thought so you can tackle problems in a three-dimensional manner and avoid missing crucial information. No one model is perfect, but combining them gives you a powerful toolkit to analyse information and make sound decisions.

“Developing the habit of mastering the multiple models which underlie reality is the best thing you can do.”

— Charlie Munger

Below are summaries of 8 mental models that I have found deeply interesting and useful. This is a very small cross-section of the thousands of models that are out there. Building this ‘latticework’ of mental models is a lifelong journey, and the quality of your thinking will be directly proportional to the strength of these models.

# 1. Multiplying by zero

One of the first things you will have learned in maths is that anything multiplied by zero is zero. This is as true in life as it is in mathematics. Underlying this model is the concept of multiplicative systems.

## A chain is as strong as its weakest link

At Google X, the ‘moonshot factory’ of the tech giant, the mantra is #MonkeyFirst. The idea is that if you want to train a monkey to stand on top of a pedestal and recite Shakespeare, you start by training the monkey, not by building the pedestal. Anyone can build a pedestal, but very few can figure out how to teach a monkey the nuances of English literature. Even if you build the most beautifully designed pedestal there ever has been, the ‘zero’ factor of the monkey will mean that you have made no progress towards the actual goal.

In contrast, in an additive system, each component adds together to create the final outcome. If you add zero to a string of huge numbers, the result will still be huge. Take, for example, a Christmas lunch where different family members each bring a dish. If Aunt Rachel accidentally puts salt instead of sugar into her famous pudding, there will be a problem. But the lunch still goes ahead. The stuffed turkey, gravy, cranberry sauce and vegetables are all still good to go.

The addition or subtraction of different components creates a different outcome but not a binary win/lose outcome like a multiplicative system.
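The contrast between the two kinds of system can be sketched in a few lines of Python. This is only an illustration: the component scores and the function names `additive_outcome` and `multiplicative_outcome` are made up for the example and do not come from any real measurement.

```python
import math

def additive_outcome(components):
    """Additive system: each component contributes independently to the total."""
    return sum(components)

def multiplicative_outcome(components):
    """Multiplicative system: a single zero wipes out the whole result."""
    return math.prod(components)

# Hypothetical scores between 0 and 1 for each component; the last one has failed.
scores = [0.9, 0.8, 0.0]

additive = additive_outcome(scores)              # still a respectable total
multiplicative = multiplicative_outcome(scores)  # collapses to zero
print(additive, multiplicative)
```

In the additive case the failed component only lowers the total, while in the multiplicative case it erases the contribution of everything else.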

Most businesses operate in multiplicative systems but too often think that they are operating in an additive one. Many companies continually add features to their product while neglecting to focus on key issues such as listening to customer feedback or gradually moving towards profitability. The infamous Pets.com spent millions on marketing and TV advertising without actually testing whether pet owners wanted to buy pet supplies online, resulting in one of the more public downfalls of the dot-com crash.

For this reason, identifying whether you are in a multiplicative or additive system is hugely important. If the whole system depends on one or two critical components, it is essential that you address these components first.

# 2. The Pareto Principle

The Pareto Principle is the idea that a very small number of key variables have a greater impact on the final result than all other variables combined. It is named after the economist Vilfredo Pareto who, in 1906, noticed that 20% of the people owned 80% of the land in Italy, much like how 20% of the pods in his garden produced 80% of his peas.

This pattern repeats itself over and over again in both the natural and social worlds. More than 35% of the world’s population is concentrated in two countries. The most destructive earthquakes are much more powerful than all smaller earthquakes combined. The largest companies in the world dwarf the value of every smaller business put together.

This 80/20 rule is used by venture capitalists when investing in startups. Peter Thiel, co-founder of PayPal and one of the most respected investors in Silicon Valley, calls this the power law. Venture returns do not follow a normal distribution, in which bad companies fail, mediocre ones stay flat and good ones return 2–5x. Rather, they follow a power law: a small handful of companies drastically outperform all others. For example, in Thiel’s 2005 fund, the Facebook investment returned more than all other investments combined.

This leads to a strange rule for venture capitalists: only invest in companies that have the potential to return the value of the entire fund. But because you can’t know with certainty which companies will succeed, even the best venture capital firms have a ‘portfolio’ of 5–7 companies.

This type of thinking is also reflected in the value investment strategies of Buffett and Munger at Berkshire Hathaway. They make large bets on businesses they understand and predict will provide long-term returns. Extreme diversification only gives average returns.

“If you can identify six wonderful businesses, that is all the diversification you need. And you will make a lot of money. And I can guarantee that going into the seventh one instead of putting more money into your first one is going to be a terrible mistake. Very few people have gotten rich on their seventh best idea.”
— Warren Buffett

But the power law is not just important to investors. It is important to everybody because everybody is an investor. An entrepreneur must decide where to invest her time, because she cannot diversify herself by running ten companies and hoping one of them succeeds. An individual can’t diversify his career by keeping dozens of equally possible careers to fall back on. You should focus relentlessly on something you are good at and that will have value in the future.

# 3. Compounding

Compound interest is the eighth wonder of the world
— Albert Einstein

A related model to the Pareto principle/power law concept is compounding. Compounding is the process by which interest is added to a fixed sum, which earns interest on the fixed sum plus the added interest and so on.

In A Random Walk Down Wall Street, Burton Malkiel gives an example of two brothers, William and James. William invests \$4,000 every year between the ages of 20 and 40 for a total of \$80,000. James also invests \$4,000 per year, but between the ages of 40 and 65, for a total of \$100,000. Despite investing \$20,000 less, at an interest rate of 6%, William ends up with \$850,136 in his account, while James will have only \$219,242. This is because William’s early contributions kept compounding for a further 25 years after he stopped investing.
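The mechanics of the two brothers’ accounts can be sketched as a simple year-by-year loop. The function `final_balance` is a hypothetical helper, and it assumes that each contribution is deposited at the start of the year and that interest is credited once a year. The exact totals depend on those timing assumptions, so the output will not necessarily match the figures quoted above, but the gap between the brothers shows up either way.

```python
def final_balance(annual_contribution, rate, start_age, stop_age, end_age):
    """Balance at end_age when contributing yearly from start_age up to (not including) stop_age."""
    balance = 0.0
    for age in range(start_age, end_age):
        if age < stop_age:
            balance += annual_contribution  # deposit at the start of the year (assumption)
        balance *= 1 + rate                 # interest credited once a year (assumption)
    return balance

# William contributes from 20 to 40, then leaves the money untouched until 65.
william = final_balance(4000, 0.06, start_age=20, stop_age=40, end_age=65)
# James contributes from 40 all the way to 65.
james = final_balance(4000, 0.06, start_age=40, stop_age=65, end_age=65)

print(f"William: {william:,.0f}  James: {james:,.0f}")
```

Changing `rate` or the ages is an easy way to see how sensitive the final balance is to starting early.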

Interest eats night and day, and the more it eats the hungrier it grows. The farmer in debt, lying awake at night, can, if he listens, hear it gnaw. If he owes nothing, he can hear his corn grow.
— Robert G. Ingersoll

The power of compounding applies far beyond the realms of finance. Knowledge compounds rapidly, since the more you read and digest, the more connections you can make across different ideas and fields, leading to a prosperous snowball effect.

Part of the beauty of compounding is that it is neutral; it simply reinforces what exists. If you focus on improving just 1% a day, after a year you’ll be about 37 times better than you were at the start. But if you get 1% worse every day, you decline to nearly zero after a year. What starts off as a small win or loss compounds into something much more significant when you factor in time. It’s only when you look back across the years that you see the true impact of your good and bad habits.
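The arithmetic behind the 1% figures is easy to check. The short sketch below assumes that the change compounds once per day for 365 days, starting from a baseline of 1.

```python
days = 365

better = 1.01 ** days  # improve by 1% every day
worse = 0.99 ** days   # decline by 1% every day

# Roughly 37.8 times the starting level versus under 3% of it.
print(f"{better:.2f}x, {worse:.4f}x")
```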

# 4. Activation Energy

Activation energy is a concept originating from chemistry. The idea is that reactions need a certain amount of energy to get going. Just having two combustible elements is not enough.

We all have an intuitive understanding of how activation energy works. For example, we know that putting one lit match to a log won’t start a fire, while using a flamethrower would be excessive. But there is a sweet spot in the middle which is just the right amount of heat required to get the fire going. This initial heat is the activation energy required for the reaction. As can be seen in the diagram below, reactions will only progress if the products are more stable than the reactants. In a fire, carbon in the form of wood is converted into a more stable form of carbon: carbon dioxide.

Despite being rooted in chemistry, the concept of activation energy applies beautifully as a practical mental model. It is often said that getting started is the hardest part of doing something. We all need that little push or spark of motivation to sit down and start working, studying or writing that elusive first novel.

Different people also have different activation energies. Take, for example, university students who have been given an assignment. For some, simply being given the assignment information is enough impetus to start working on it. For others, it’s reading an interesting article related to the assignment or seeing that their friend started a few days ago. And for the vast majority, it’s that impending sense of doom as the deadline approaches (as well as a few cans of Red Bull).

I came across an interesting example of activation energy when reading Atul Gawande’s book The Checklist Manifesto, in which he champions the use of simple, precise checklists in the medical profession. He argues that you can’t just throw together a bunch of nurses and doctors and expect them to perform well as a team. But simply including a checklist, which directs each individual to introduce themselves and designates time for the group to pause and communicate, provides the activation energy to form the bonds of teamwork and drastically reduces error rates and complications during operations.

The most important takeaway from this model is understanding that we often need a little push to get things going. Understanding how to give yourself this added motivation can be an extremely powerful tool.

# 5. Leverage

Leverage is the ability to apply force in a way which maximizes output per unit of input. The formal discovery of leverage is credited to Archimedes who famously stated:

Give me a lever long enough and I shall move the world

Many engineering feats have been accomplished through applied leverage. The Ancient Egyptians used levers to lift stones that weighed up to 10 tons to build their pyramids and obelisks. We use levers every day, in scissors, wheelbarrows, crowbars, pliers and even our jawbones.

Again, the concept of leverage applies practically, because it tells us that small, focused changes can make huge impacts on our lives. In this way, leverage is at the core of the popular piece of advice: work smarter not harder.

We can understand this better by referring again to the power law, which suggests that 20% of the work produces 80% of the results. But the secret is in the vertical part of the exponential curve below. In fact, roughly 1% of the work produces 50% of the results.
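One way to arrive at a figure of this kind is to assume that the 80/20 split repeats inside the top 20%, and again inside the top 20% of that. This is an assumption made for illustration rather than a law, and `nested_pareto` is a hypothetical helper name.

```python
def nested_pareto(levels, effort_share=0.2, result_share=0.8):
    """Apply the 80/20 split to itself `levels` times.

    Returns (fraction of work, fraction of results) for the innermost slice.
    """
    return effort_share ** levels, result_share ** levels

for levels in (1, 2, 3):
    effort, results = nested_pareto(levels)
    print(f"{effort:.1%} of the work -> {results:.1%} of the results")
```

Under this assumption, three levels of nesting give 0.8% of the work producing 51.2% of the results, which is where a "1% produces 50%" rule of thumb can come from.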

This suggests that focusing our time on the things that really matter is how we will get the best results, rather than taking a shotgun approach. That is, before you build a smart robot that can pick the highest apples in your orchard, check for low-hanging fruit.

Leverage is also an extremely important concept in the business world. Anyone who has haggled in a street market intuitively understands this. You have to act like buying the vendor’s product is a favour you’re doing for them, rather than something you really want. Then you give them a lowball offer and wait for them to raise the price to something that’s still a bargain before making the purchase.

Don’t ask the barber whether you need a haircut
— Warren Buffett

Having monopoly power in a market provides huge leverage. That’s why food is so ridiculously overpriced at stadiums and music festivals. You’re already there and don’t really have any other choice. Same goes for airline food and the rates charged by specialist surgeons and doctors. In Peter Thiel’s book, Zero to One, he argues that achieving monopoly status is the only real way to build rapidly scaling, wildly successful businesses. Google is effectively a monopoly in the online search market. Amazon is effectively a monopoly in online retail. Having this power gives them the leverage to set prices and capture a large proportion of the value they create.

# 6. Regression to the mean

Regression to the mean was initially discovered in 1886 by Sir Francis Galton, a renowned polymath and half-cousin of Charles Darwin. The rule is that in a normally distributed system, large deviations from the average will tend to return to that average with an increasing number of observations. This can often lead us to falsely attribute cause-effect relationships, such as when a patient gets better after taking herbal remedies or a tennis player’s performance improves after being berated by her coach.

Daniel Kahneman explores this concept (as well as a host of other logical errors and psychological biases) in his excellent book Thinking, Fast and Slow.

Kahneman had a ‘eureka’ moment when he was teaching flight instructors in the Israeli Air Force about how rewards for improved performance work better than punishment of mistakes. A senior instructor disagreed, saying that whenever he praised cadets on their flight skills they would perform worse the next time, and when he criticized cadets for bad execution, their performance would immediately improve. Consequently, he rarely praised any students and was always ready to dish out criticism.

Kahneman instantly recognized that this was a perfect example of regression to the mean. Naturally, the instructor only praised cadets that performed far better than average. But the cadet was probably just lucky on that particular attempt and his performance would have deteriorated regardless of whether he was praised or not. Similarly, cadets were only criticized when their performance was far below average and thus likely to improve regardless of the instructor’s actions.

To illustrate this, Kahneman drew a target on the floor and asked the instructors to turn their backs and throw two coins at it in immediate succession. In most cases, those who had done best on the first throw deteriorated on their second throw and vice versa. This is exactly what was happening with the flight cadets.
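The coin-throwing demonstration can be imitated with a small simulation. The sketch below is a simplified model, not a reconstruction of Kahneman’s actual experiment: it assumes every throw is pure luck, models the miss distance as the absolute value of a standard normal draw, and uses the hypothetical helper names `two_throws` and `group_means`.

```python
import random
from statistics import mean

def two_throws(n_throwers=10_000, seed=0):
    """Return miss distances for two throws per person, with no skill involved."""
    rng = random.Random(seed)
    first = [abs(rng.gauss(0, 1)) for _ in range(n_throwers)]
    second = [abs(rng.gauss(0, 1)) for _ in range(n_throwers)]
    return first, second

def group_means(first, second, fraction=0.1):
    """Average miss distance on both throws for the best and worst first-throw groups."""
    order = sorted(range(len(first)), key=lambda i: first[i])  # smallest miss first
    k = int(len(first) * fraction)
    best, worst = order[:k], order[-k:]
    return {
        "best_first": mean(first[i] for i in best),
        "best_second": mean(second[i] for i in best),
        "worst_first": mean(first[i] for i in worst),
        "worst_second": mean(second[i] for i in worst),
    }

first, second = two_throws()
summary = group_means(first, second)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The group that threw best the first time misses by more on the second throw, and the group that threw worst improves, even though nobody was praised or criticized in between.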

## What to do about it

An easy way to correct for regression to the mean (particularly when conducting experiments) is to use control groups. For example, when testing the positive effect of energy drinks on depressed children, the control group would be depressed children that didn’t receive an energy drink or even better, received a placebo (i.e. told to drink an energy drink that wasn’t actually an energy drink). The control group improves by regression to the mean alone, which allows the experiment to determine whether the energy drink benefited children more than regression can explain.

Our minds are strongly biased towards manufacturing causal connections (see narrative fallacy). Our associative memory looks for an easier explanation than simple regression to the mean. Thus, when interpreting data and predicting performance, we need to keep this mental model in mind to make sure that we do not make logical errors. Fortunately, awareness of regression to the mean is a great first step towards understanding the role of luck in success.

This is why we should always look out for track records rather than one-off successes or failures. An entrepreneur who has just sold off a multi-million-dollar company will likely not do so well in her second venture. But an entrepreneur who consistently produces innovative and world-changing ideas over long periods of time likely does have that extra element of skill.

# 7. Availability Heuristic

The availability heuristic, like other heuristics of judgement, substitutes one question for another: instead of estimating the frequency of an event, you report an impression of the ease with which individual instances come to mind.

For example, if you ask the friend of a recent divorcee about the rate of divorces for those over 50, they will report a much higher figure than someone whose friends are all in stable long-term marriages. In this way, the availability heuristic is not a model you want to implement, but one you want to recognize and avoid.

A famous study of availability suggests that awareness of your own biases can lead to happier relationships. In the study, spouses were asked, “How large was your personal contribution to keeping the house tidy, in percentages?” As expected, the combined percentages added up to more than 100%. Both spouses remembered their own chores and efforts much more saliently, leading to an increase in judged frequency. The same pattern applies for any group work, as most members feel that they have done more than their fairly allotted share.

This is one of the few systematic errors where there is high potential for successful ‘debiasing’. Being aware of the dangers of anecdotal evidence is a good first step. Relying on statistics and numbers also helps. For example, where several people feel that their efforts are not being recognized, simply observing that there seems to be more than 100% of the credit to go around may be enough to defuse the situation.

In the 1990s, the German psychologist Norbert Schwarz conducted a study to find out what was more important in determining estimates of frequency: the number of instances retrieved or the ease with which they come to mind. Participants were asked to list instances of when they had been assertive and then rate their overall assertiveness. Some were asked to list six instances while others were asked to list twelve.

Interestingly, he found that those asked to list twelve instances rated themselves lower in assertiveness. This is because recalling the first few instances was easy, but the drop of fluency between six and twelve suggested that the participant was not assertive overall. In this way, the availability heuristic that the subjects applied was more of an ‘unexplained unavailability’ heuristic. Even more interestingly, when some irrelevant explanation was given as to why fluency dropped, such as music playing in the background, participants rated themselves equally assertive when they retrieved twelve instances as when they retrieved six.

The media has a huge role to play in what is top of mind for the population. In a world so saturated with information, news outlets tend to report on long-tail, highly improbable events simply to grab our attention. No one watches a reporter talking about drab day-to-day stuff for half an hour, but they tune in to highly sensationalized stories about plane crashes, mass shootings and disease that make it seem like the world is falling apart. This can lead to an ‘availability cascade’, where a media story causes a powerful reaction, which then becomes a story in itself, prompting further discussion. Terrorists are the best practitioners of availability cascades. Deaths in plane hijackings are dwarfed by traffic deaths, but the gruesome images in the media are recalled far more quickly, making it difficult to apply reason. That’s why I thought it was extremely commendable that New Zealand Prime Minister Jacinda Ardern refused to mention the name of the gunman in the recent shootings in Christchurch.

The media’s sensationalist tendencies also mean that the most extreme and radical views get the most airtime. During the 2016 election, Trump presented us with a vivid narrative of Mexican immigrants bringing drugs, crime and rape. This led to political analysts spending more time discussing Trump’s views, which further cemented them in the minds of the American population. Come election day, voters were more easily able to recall what Trump stood for and make judgements on whether or not to vote for him.

# 8. Margin of Safety

The margin of safety model has its roots in the disciplines of engineering and quality control.

Consider a car part that is crucial to the functioning of the engine. Let’s assume that the part is replaceable and lasts on average for 100 000 kilometres of driving. So at what point do we replace the part? The most cost efficient answer would be at 99 999 kilometres. But we don’t know anything about the 100 000 kilometres the part has gone through. Were they 100 000 kilometres of rough terrain that wore out the part much quicker than expected? Or was the car just cruising along highways the whole time? We also don’t know anything about the manufacturing process and any flaws in the 100 000 kilometre estimation.

That’s why we start thinking of replacement at 70 000 kilometres and put a hard stop at 75 000 kilometres. That spare 25 000 kilometres is the margin of safety. Sticking with the car theme, that’s also why you can keep going for a fair distance on an ‘empty’ fuel tank (check out Hamish and Andy’s experiment).

I’ve found a useful application of the margin of safety model when scheduling time to accomplish a particular goal or task. If an assignment, work task or personal goal needs to be accomplished by a certain date, I try and get it done a couple of days in advance. This provides enough leeway to get the work done should something unexpected throw a spanner in the works. Planning for the unpredictable helps you deal with it when it arises.

Clearly, the margin of safety model is very useful both theoretically and practically. But it does come with the tradeoff of time and money. If we replaced the aforementioned car part (and all similar parts) at 10 000 kilometres, all cars would basically be fail-proof. But everyone would have to pay \$70 000 for a Camry. We’ve made the wise decision to accept a small chance of error so that we can all afford to drive around in the cars we want. It is easy to see how pushing the margin of safety model beyond its limits completely nullifies its usefulness.

# All models are wrong

There could be just one tiny hiccup with the process of building your arsenal of mental models.

All mental models are wrong.

By definition, a model is a representation and simplification of reality rather than reality itself. When you try and boil down the messy complexities of the world into a principled framework, there are always going to be anomalies and exceptions to the mental systems you have built.

“All models are wrong but some are useful”
George Box

But the real question here is how wrong do models have to be to not be useful? Even though they are simplifications of reality, we need models to anchor our understanding of the world around us. When we better understand the world around us, we can make better decisions using versatile principles.

Models are helpful because they apply in a wide range of situations, not all situations. Weather forecasts often get it wrong, but that doesn’t mean we shouldn’t take an umbrella if it’s predicted to be raining. City maps are often riddled with errors and missing streets, but roaming a foreign city without one would be foolish. We should prize usefulness and functionality over laser precision. In a world where only the perfect was tolerable, we wouldn’t have any technology, art or science because everything is inherently uncertain. Humanity only progresses through relentless experimentation and research that irons out the flaws in previous iterations.

“Scientists generally agree that no theory is 100 percent correct. Thus, the real test of knowledge is not truth, but utility. Science gives us power. The more useful that power, the better the science.”
— Yuval Noah Harari

There will always be situations to which a particular mental model doesn’t apply. That’s why you should strive to keep adding to the list of mental models at your disposal, to ensure that you have a deep understanding of models that cross multiple disciplines. This way, you will be able to pick and choose the models you need to survive and thrive.

“It is important to view knowledge as sort of a semantic tree — make sure you understand the fundamental principles, i.e. the trunk and big branches, before you get into the leaves/details or there is nothing for them to hang onto.”
— Elon Musk
