Background Ops #11, Finale: Explore/Exploit Tradeoffs

Sebastian Marshall
The Strategic Review
20 min read · Jan 18, 2018

WHAT DOES THE $500,000 PERSON DO THAT THE $50,000 PERSON DOESN’T DO?

“What does the $500,000 per year person do that the $50,000 per year person doesn’t do?

You look from outside and study those two individuals, everything seems to be the same. They both are the same sex, same age, have the same training, the same positions, the same contract, the same fringe benefits… both are successful, they work hard, they’re good family people, make tough commitments…

But what’s the difference? What does the $500,000 per year person do that the $50,000 per year person doesn’t do?

He pays the price a little bit more. He works harder a little bit more. He makes money a little bit more. He saves money a little bit more…”

— Art Williams, “Do It” speech, 1987

***

TSR’S SERIES ON BACKGROUND OPS, FINALE: EXPLORE/EXPLOIT TRADEOFFS

This is our final issue on Background Ops. Over the last 10 weeks, we’ve explored a simple idea with profound implications —

“It is a profoundly erroneous truism, repeated by all copy-books and by eminent people when they are making speeches, that we should cultivate the habit of thinking of what we are doing. The precise opposite is the case. Civilization advances by extending the number of important operations which we can perform without thinking about them. Operations of thought are like cavalry charges in a battle — they are strictly limited in number, they require fresh horses, and must only be made at decisive moments.”

The mathematician Alfred North Whitehead wrote that in 1911. As we discussed in Issue #1: Strict Limit, time is the last and fundamental limit of our lives: we each get 24 hours every single day, 168 hours every single week, and everything we want to do on this planet has to fit into those blocks of time.

As such, we can’t just “think more” — just the opposite. If we want to accomplish progressively more, we need to build background operations that get the work done for us without conscious thinking or action, in order to free up our time to do more and greater things.

This seems all rather inarguable — and it’s obvious as to how it applies to people who already have great things happening in their lives. If you’re already running a business or another organization, if you’re happy in your career, if things are already going well for you — in that case, it’s very clear how these principles apply.

But what if you’re still not sure exactly what you should be doing?

This is a harder question — a question that, in my opinion, is typically answered poorly.

When we don’t know exactly what we want, should we just behave haphazardly?

Unsurprisingly, I think — not.

So let’s explore, in this final issue of Background Ops, how to be methodical in getting what you want if you already know what that is — or if you have no clue what you ought to be doing just yet.

***

EXPLORE/EXPLOIT TRADEOFFS

So, what should you be doing with your life? How to design the operations and systems in your life to ensure that you get what you want?

To get at this very important question, we need two data points.

The first is explore/exploit tradeoffs. This explanation from Conceptually.org will suffice —

“The exploration/exploitation trade-off is a dilemma we frequently face in choosing between options. Should you choose what you know and get something close to what you expect (‘exploit’) or choose something you aren’t sure about and possibly learn more (‘explore’)? This happens all the time in everyday life — favourite restaurant, or the new one?; current job, or hunt around?; normal route home, or try another?; and many more. You sacrifice one to have the other — it’s a trade-off. Which of these you should choose depends on how costly the information about the consequences is to gain, how long you’ll be able to take advantage of it, and how large the benefit to you is.

As well as happening in everyday life, this situation arises often in computer science, where the term originated.”

The decision between exploration and exploitation is one that all of us face constantly — though we often don’t realize it.

Let’s say you just graduated college or moved to a new city. Right away, you get a job offer from a company that would be good-but-not-great to work at.

That job offer comes in on a Friday and they ask you to start on Monday. Meanwhile, you’ve got some jobs you’re interested in, but it would take at least a few weeks for them to review your resume and interview you.

Should you accept the good-but-not-great job?

Of course, the answer is — it depends. On a lot of things.

But more important than if you take the job or not is recognizing the universal class of problem this is.

To turn that job down and stay on the job market will cost you at least a few weeks of wages — and with no guarantee there’s a better job out there. If times are particularly tough economically, you might not even get a job offer as good at all in a few weeks.

So, should you explore the job market? It’ll get you more information and you might wind up in a better situation. But there’s a guaranteed cost to exploring more — a few weeks of lost wages — and no guarantee of a higher payoff.

Or should you exploit the current best offer, saying yes, and accepting wages at the good-but-not-great job?

This is incredibly common. It applies in situations as trivial as choosing whether to order a dish you already know you like at a restaurant or to try a new dish you’re unsure you’ll like.

It applies in “medium-sized” decisions — if you’re doing academic research, should you continue on a line of research that’s doing decent academic work and getting papers published and the occasional citation, but which isn’t unique or especially valuable? (That would be exploiting the current line of research further.) Or should you strike off into a new direction that might not yield any gains at all? If so, it’ll be more expensive getting into the new area of research, with uncertain gains. You’ll take the exploration cost of seeing if there’s anything in that domain, and the uncertain payoff of potentially larger gains — or potentially no gains at all.

And finally, this applies at the largest-scale life decisions — what work to do, what career to pursue, even the largest-scale philosophical questions of what really matters in life.

At any given time, you can be doing some mix of exploring for new and better things in life, and exploiting the best opportunities you have available to get the known gains from doing so.
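Since the term comes from computer science, a small sketch may make the tradeoff concrete. The Python below is a minimal epsilon-greedy strategy, one common textbook approach to the “multi-armed bandit” version of this problem, and not something taken from the Conceptually.org explanation. The payoff numbers, the noise level, and the names (epsilon_greedy, true_payoffs, epsilon) are all assumptions chosen for illustration.

```python
import random


def epsilon_greedy(true_payoffs, epsilon=0.1, rounds=1000, seed=0):
    """Play a simple bandit game with an epsilon-greedy rule.

    true_payoffs: hypothetical average payoff of each option (hidden from the player).
    epsilon: probability of exploring a random option on any given round.
    Returns (estimated payoff per option, number of times each option was tried).
    """
    rng = random.Random(seed)
    n = len(true_payoffs)
    estimates = [0.0] * n  # running average payoff observed per option
    counts = [0] * n       # how often each option has been tried

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: pay the cost of trying any option at random.
            choice = rng.randrange(n)
        else:
            # Exploit: take the option that has looked best so far.
            choice = max(range(n), key=lambda i: estimates[i])

        # Noisy reward around the option's true average (assumed noise level of 1.0).
        reward = rng.gauss(true_payoffs[choice], 1.0)
        counts[choice] += 1
        # Incremental update of the running average for the chosen option.
        estimates[choice] += (reward - estimates[choice]) / counts[choice]

    return estimates, counts


# Three hypothetical options: a familiar restaurant and two untried ones.
estimates, counts = epsilon_greedy([1.0, 1.5, 0.5], epsilon=0.1, rounds=2000)
print(counts)
```

With epsilon at 0.1, roughly one choice in ten goes to exploring. Pushing epsilon toward 0 means nearly pure exploitation, and pushing it toward 1 means nearly pure exploration.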

***

FATIGUE AND BOREDOM AS AN EVOLUTIONARY EXPLORE/EXPLOIT PROMPT

For our second data point, I’d like to call your attention to two research papers that were shared with me by Kaj Sotala. I’m incredibly grateful to Kaj, because these two papers produced something of a landmark shift in how I understood my emotions around negative affect. Here’s one paragraph from each.

From “An opportunity cost model of subjective effort and task performance”:

“Finally, we note that a framework explaining changes in mental effort and task performance as the result of dynamic learning processes can easily be expanded to incorporate trade-offs between exploration and exploitation. Even when the perceived utilities of the two best tasks are stable, it could be adaptive for there to be a small bias away from continuing to allocate processing capacity to the same task over time, which would also contribute to decrements in performance over time. As discussed extensively in the literature on reward learning (Cohen et al. 2007), such an exploration bonus would trade off exploitation of knowledge about the current task for gaining new and potentially valuable knowledge about different tasks.”

From “Why self-control seems (but may not be) limited”:

“Frequently studied but often misunderstood, fatigue plays a central role in the regulation of behavior. Moreover, similar to self-control, influential treatments of the topic have advanced a serious misrepresentation — that fatigue is caused by a loss of energy following excessive work. According to this account, mental energy is the basis of motivation and action, such that when resources are lacking people are less able to pursue their current goals or initiate new activities. As it turns out, however, this viewpoint is fundamentally flawed. After an extensive review of the psychology of fatigue, Robert Hockey proposed a motivational control theory of fatigue, whereby fatigue does not affect task performance through a failure of energy, but by changing the selection and control of goals. Fatigue, according to this view, is an emotion that interrupts current behavior such that alternative options can be entertained.”

Are you still with me? Did you fall asleep trying to read those paragraphs?

Do please go back and re-read them if you’re skimming; the takeaway there is enormous if it’s even partially true.

I’d encourage you to read both papers in full since the implications are really very significant, but I confess they’re somewhat tedious and technical reading.

The main idea is that the negative emotions of boredom and fatigue might be evolutionarily-adaptive prompts to shift from known-and-works behavior into greater exploration of new possibilities.

Now, slow down for a second — that’s enormous if even partially true.

***

BOREDOM AND FATIGUE PROMPTS

Let’s start with a really basic and slightly strange example — imagine for a minute that you’re a grazing animal. Like, a cow. You’re not domesticated; you’re a wild cow just ranging around eating grass and doing whatever cows do when they’re not eating.

If you come to a patch of okay-but-not-great grazing grass to eat, how long should you stay there and eat?

If there’s not much grass and it’s a lot of work to eat it — maybe it’s very patchy with bad soil and a lot of rocks — then should you stay and eat this mediocre-quality grass, or should you go search for better grazing pastures?

A cow that always stayed and ate bad-quality grazing would experience some risks. A cow that always kept searching if they didn’t find the best-possible grazing would experience some risks.

Evolution working how it works, cows that stayed put and ate their fill when very hungry would tend to do better… but if you only stayed put, you’d miss better resources whenever a short walk over the next hill might produce some.

Hence — boredom. I don’t know if cows “get bored” as humans understand the concept, but something like a prompt to tire of an activity producing consistent and steady rewards in favor of searching out better rewards makes a more robust and more successful cow.

When you’re very hungry, you should stop and eat whatever is there. After you’ve eaten to the point you’re satiated enough, you should search somewhat to see if there’s better food available for the next time you eat.

As with cows, so too with humans. After intense bursts of effort on one particular task, we experience mental fatigue and boredom. Those two linked papers suggest that the emotions of boredom and fatigue evolved partially to prompt humans to go explore more instead of just taking advantage of whatever stable set of gains are already on hand.
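The cow’s rule of thumb can be written down as a tiny decision function. This is only a sketch of the reasoning above, not a model from either paper: the 0-to-1 scales, the two thresholds, and the name forage_step are assumptions made up for illustration.

```python
def forage_step(hunger, patch_quality, hunger_threshold=0.7, quality_threshold=0.5):
    """Decide whether to 'eat' here (exploit) or 'search' elsewhere (explore).

    hunger and patch_quality are hypothetical scores between 0.0 and 1.0.
    The thresholds are illustrative assumptions, not measured values.
    """
    if hunger >= hunger_threshold:
        # Very hungry: eat whatever is here, even if the patch is mediocre.
        return "eat"
    if patch_quality >= quality_threshold:
        # Not very hungry, but the patch is good enough to stay on.
        return "eat"
    # Satiated enough and the patch is mediocre: go look for better grazing.
    return "search"


print(forage_step(hunger=0.9, patch_quality=0.2))  # very hungry, poor patch
print(forage_step(hunger=0.3, patch_quality=0.2))  # satiated, poor patch
```

In this framing, boredom plays the role of the last branch: a nudge to go searching once the immediate need is covered and the current payoff is merely okay.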

***

HANDLING BOREDOM THE WRONG WAY

The magic of boredom and fatigue is that they’re unpleasant.

Similar to how signals of physical pain teach a young child very quickly not to touch a burning stove, boredom and fatigue highly motivate the person experiencing them to do something else in order to escape from the negative emotions around them.

But there’s a gigantic problem here — as we discussed in Background Ops #2, we live in an era of unparalleled new distractions and addictive behaviors.

It’s never been more possible to rapidly eliminate boredom… without fundamentally improving one’s life.

If you read the research papers above and agree even partially that boredom and fatigue are evolved prompts to explore more, this then suggests that there could be better and worse ways of exploring to find more fulfilling and thriving patterns of life.

***

OPERATIONALIZING PATTERNS OF EXPLORATION

Let’s start with a simple prediction — if you were significantly bored last week, you’ll probably be significantly bored sometime this week too.

If you experienced mental fatigue yesterday, you’ll probably experience mental fatigue today too.

Recognizing and acknowledging boredom and fatigue is critical to having a thriving life.

A lot of people — as crazy as this might sound — seem to not recognize that they are regularly experiencing boredom and fatigue, and instead deal with each instance of it on a case-by-case basis.

Usually, sadly, via junk food, junk leisure, internet surfing, addictive games, mindless shopping, or similar behaviors.

This brings us to the first piece of guidance —

If you’re regularly experiencing significant boredom or fatigue, you should design and operationalize what you’re going to do the next time it hits.

Almost by definition, your judgment is at least slightly compromised when you’re in a significant negative state. If you’re feeling mentally foggy or “bored out of your mind,” this is likely not when you’re at your smartest and most effective.

Thus, while you’re in a maximally healthy state, you should periodically curate and set up exploration-type activities to do later for leisure, recharging, and learning.

This can be as simple or as complicated as you want it to be. Roughly every three months or so, I’ll spend a few days researching new activities and technology, create myself a reading list, download a number of audiobooks onto my iPhone, download a number of podcasts, and possibly line up some documentaries to watch or online courses to take on a site like Skillshare.

There’s additional challenges to actually doing the healthy behavior when you’re in a low mood, which we’ll address, but just giving some thinking to what behaviors would best serve you goes a long, long ways.

***

ON LARGER SCALES FOR CAREER AND LIFE GROWTH

The bad news is that there’s more addictive, louder, more engaging distractions than ever before.

The good news is that there’s more healthy, effective, incredibly powerful activities available at our fingertips than ever before.

But — the latter category requires slightly more design and thinking things through in advance.

My friend Carlos Miceli has had a terrific career so far, and it keeps getting better for him every year.

He’s Argentinian, and if you’ve followed what’s happened in Argentina over the last three decades, you know he’s had somewhat of a rocky ride.

Starting out in Buenos Aires with no connections or network to speak of and minimal marketable skills, Carlos recognized two things about his life:

1. He was going to need to know more interesting and exceptional people in order to build the life he wanted.

2. He was going to need to master the English language to unlock more opportunities for himself.

He thus designed and operationalized a fascinating program of exploration and self-improvement — he set out to have a single Skype call every single day with an interesting person somewhere in the world, primarily in English.

Back in 2011, Carlos wrote about this:

“The problem with most people’s productivity is their lack of systems in their daily lives. They trust their memory. They trust their willpower. They trust their energy. All recipes for disaster. We forget things, our willpower is limited, and our energy gets depleted. This is normal human behavior. The solution is not to power through our human flaws, but to structure our lives in a way that we avoid them completely.

When you apply the right systems in your daily life, things get done with less effort, faster, and better. You accomplish more. And when you accomplish things, you have more time to accomplish even more. This is the number one “secret” behind winners: systems.”

I don’t know how long it took Carlos to figure out the mechanics of reaching out to people, finding a window for a call, scheduling it onto the calendar, and adding the new contact on Skype.

If I were to hazard a guess, I’d say it took him 15 hours to work out all the details. But maybe it was slightly less than that, or maybe it was even as much as 50–100 hours.

Regardless, that initial investment into thinking through how to get in touch with people and have an interesting first call led him to have some fantastic business and social opportunities in Mexico, the United States, and Europe.

Most of us get a little more tired in the afternoon and evenings, and do less good work than in the morning. Carlos spent the time to analyze what would benefit him the most, and was able to schedule literally hundreds of calls with interesting people into his afternoons and evenings. Instead of goofing off on Farmville or surfing the web, he built hundreds of connections, leveled up his skills, and got his English to a native speaker’s level very quickly.

The same process can be applied to any sort of exploring-type behavior. If you’ve ever followed along with a product designer, illustrator, game designer, or coder who makes a new prototype, illustration, game, or application either once per day or once per week, you’ve seen someone who was able to level up their skills incredibly quickly.

People have used similar lessons to learn foreign languages, develop writing skills, or… really, anything they want.

You want to proactively design and operationalize your “exploration time,” so that the ways you fight off boredom and fatigue are life-affirming, joyful, and lead to a more thriving life.

***

GUIDELINES FOR DESIGN OF EXPLORATION

Three guidelines —

1. Be aware of the maximum sustainable pace and stick to no more than that.

Back in the TSR Celerity series, we discussed the tradeoff between maximum sustainable pace and maximum possible pace. (That piece isn’t online, but if you want the long version, email me and I’ll send it to you.)

The concept is easy enough to understand, though — at any given time, we can “red-line” and do a lot more activities than normal without fully healing or recovering from them.

It’s debatable whether you should ever red-line or not — but by definition, you can’t perform at more than the max sustainable pace for infinitely long periods. As you install new and explicit systems for exploring, you should ensure that you’re recovering adequately mentally and physically.

As a general rule of thumb, a well-organized person can probably safely schedule around 60% of the work they could do in a maximal week. If your healthier behaviors for learning and exploring aren’t as fully rejuvenating as pure leisure, bear that in mind and make sure you don’t take yourself over the maximum sustainable pace for an extended period of time.

2. Keep psychological reactance in mind.

Read up on psychological reactance a little bit. From Wikipedia:

“Reactance is a motivational reaction to offers, persons, rules, or regulations that threaten or eliminate specific behavioral freedoms. Reactance occurs when a person feels that someone or something is taking away their choices or limiting the range of alternatives.”

In short, anything that you perceive as reducing your freedom has the possibility to generate reactance, or a strong desire to “fight back” against whatever is constraining your behavior — and failing the ability to re-gain freedom in that domain, to “act out” a little bit to assert your freedom in other areas.

Reactance is normal and something that must be navigated. You should expect that new behaviors that are slightly uncomfortable at first will prompt reactance, and that you’ll have to navigate it safely and sanely. Give yourself extra time and under-schedule yourself if you’re going to be installing a particularly difficult-at-first new behavior. Be ready for reactance to hit. It still hits me whenever I launch a new improvement program, and I often have a bad week or two immediately before “going on a run” of peak performance and improved behavior. Watch out for and be ready for reactance — it’s not a huge problem, just something to anticipate and prepare for when looking to adopt new behaviors.

3. Liberally use “hard rules” to make behavior change easier.

We covered Hard Rules two weeks ago in Background Ops #9, so we need not rehash it here.

Just note it’s paradoxically much easier to “have a Skype call every day with someone interesting” than it is to “try to get to know more people.”

“Write one blog post every day” (or week) is paradoxically easier than “write more often.”

Hard rules are your friend, especially at times of greater fatigue or boredom or other negative affect.

***

AND SWITCHING TO EXPLOIT MODE

Every now and then, you find the perfect circumstances for you — a new emerging technology that particularly speaks to you, a skill that you’d benefit from cultivating to a high level of mastery, a business idea that gets really great resonance and immediate traction from the market.

Every so often, we have opportunities that are really rare and precious. These types of opportunities sometimes only present themselves a couple times per decade, but if maximally seized, can really change one’s life and lead to incredibly fast growth in life.

My friend Taylor Pearson — most definitely not a communist — is fond of quoting Vladimir Lenin on this topic:

“There are decades where nothing happens; and there are weeks where decades happen.”

There’s an art to spotting when these types of opportunities are unfolding, and rapidly focusing in and cutting everything else to maximally exploit those opportunities.

Most of Background Ops is geared towards when you have an opportunity you want to maximally exploit — there’s always some value in things like codifying universal principles, identifying value-producing work and eliminating everything else, and building operational consistency… but those activities truly shine when you’re on an opportunity that you can maximally take advantage of.

In times of very high opportunity, you’ll likely want to reasonably minimize your leisure, and fill the remaining leisure time with maximal recharging activities and maximally motivating activities.

During these time periods, I think it makes a lot of sense to stack your social life with conversations and meetings with people who are really exceptionally thriving — for my part, one of the smartest things I did at the end of 2017 was setting up an every-two-weeks business and marketing call with Miguel Hernandez, the brilliant head of Grumo Media and founder of the new startup Mench.

Every time I talk to Miguel, I get 2–3 days of serious motivation. Miguel and his cofounder Shervin Enayati are always achieving a tremendous amount, and whenever I talk to Miguel, I get a kick in the metaphorical pants to do more myself.

(Heck, now that I’m writing this, I might look to schedule another 1–2 regular calls along the same lines. They’re really that powerful.)

Likewise, there’s a variety of books and speeches that are motivating — whereas I prefer to read technical books during “explore mode” periods, I think the time for constant motivation is when everything is going right and the gains are all there for the taking.

To that end, I can’t recommend Art Williams’s “Do It” speech enough. The short version is five minutes, and I really do highly recommend it.

The first time you listen to that talk, it’s simply motivating. But if you listen more carefully, you actually pick up many of the principles of focus — “…pay the price a little bit more, work harder a little bit more, believe it a little bit more…”

(The long version is great, too. There’s some jabs at liberalism in there — Williams is definitely a conservative — but it’s still very much worth it.)

***

GUIDANCE

Dense stuff, eh?

To recap —

1. The world is constantly full of “explore vs exploit” tradeoffs: whether you should take known gains, or take the exploration cost to see if better options are available.

2. Oftentimes, people don’t realize that they’re making explore/exploit tradeoffs. Once you know this mental model, you start noticing these tradeoffs everywhere, and can make the choices more explicitly.

3. Boredom and mental fatigue are very possibly evolved prompts to switch from exploit mode to explore mode.

4. In an era of addictiveness and more options than ever before, we’ll benefit very heavily from explicitly designing and operationalizing how we spend our exploration time, instead of just spending it on Farmville or internet surfing.

5. Seriously think through what would maximally benefit you from your exploration time — keep in mind Carlos Miceli who, from Argentina with no connections to start, was able to build hundreds of connections and master fluency in the English language with a simple daily Skype call with someone around the world he found interesting.

6. Keep in mind maximum sustainable pace, reactance, and the usefulness of hard rules when designing your exploration time.

7. When large opportunities hit, switch to exploit mode maximally — go hard and take advantage of the gains. Minimize leisure within reason during those periods, and fill up the remaining leisure with maximal recharging and motivation.

Finally, I can’t recommend highly enough getting on a Keystone of some sort. If you’re aiming to have a Skype call daily, you want to track that somewhere. If you’re looking to study new topics, practice a language, or create something new every day or every week, you need to track that somewhere. I use a Lights Spreadsheet, but any keystone could do. You’ll want some keystone, though. I think it’s even more important to be on a keystone during exploration time. Read Background Ops #2: Keystone if you haven’t yet for full details.

***

SPECIAL THANKS

Whew.

This has been our first series on Medium.com, and I’d like to stop for a moment and say a big thanks to all those people who helped make it a big success.

I super appreciate the people who commented on multiple issues — Donny Kimball, Matt Cartagena, Greg Nance, KimSia Sim, Donnie Lee, and Chiara Cokieng.

James Mawson and Andrew Kendall both shared more resources and research with me — James on research from the Soviet archives, and Andrew about industrial process models.

A big thanks as well to The Browser, a very cool publication I’ll be checking out more going forward, which introduced thousands of new people to The Strategic Review. Thanks also to Austin Brawner from Ecommerce Influence, who very graciously had me on his show along with Andrew Foxwell and said some very nice things about TSR. I was super happy to hear it. (Check out that episode, if you haven’t yet.)

A big thanks to a number of people who have pushed me to level up my writing and get over to Medium over the years — especially Zat Rana, Taylor Pearson, and James Clear. Thanks for all the encouragement and kind words from Phil Hodgen and Zach Obront, too.

Last but not least, of course, a big thanks to my brilliant business partner Kai Zau. While I’m the primary author of TSR, he’s almost as much a coauthor as me — we constantly thresh out the nature of reality together and put it into practice. The excellent design on the purple images and all the cool tech we launch is all him, but more than that, I see every piece as tracing the lines of thought we constantly explore together. He’s absolutely a prince of a guy, and I feel uniquely fortunate I get to work with such an exceptional person every single day.

***

AND THANKS TO YOU

Background Ops has been a unique pleasure to write — I hope you’ve gotten at least half as much enjoyment and lessons out of reading it as I have out of writing it, because it’s been immense.

If you want to look back on any particular issue, here’s the list —

1. Strict Limit
2. Keystone
3. Entrainment
4. Value-Producing Work
5. The Nature of Operations
6. Strength and Weakness
7. Universal Principles
8. Operational Consistency
9. Hard Rules
10. Creative Processes
11. Explore/Exploit Tradeoffs

Do stop and think through whether you’re primarily in explore mode or exploit mode — and design and build the corresponding systems into your life to get the relevant gains.

For the last time, Alfred North Whitehead’s observation:

“It is a profoundly erroneous truism, repeated by all copy-books and by eminent people when they are making speeches, that we should cultivate the habit of thinking of what we are doing. The precise opposite is the case. Civilization advances by extending the number of important operations which we can perform without thinking about them. Operations of thought are like cavalry charges in a battle — they are strictly limited in number, they require fresh horses, and must only be made at decisive moments.”

As with civilization, so too with individual lives — get everything running elegantly in the background, and you’ll find yourself going up rapidly in life.

This has been quite the journey. For joining me on Background Ops — thank you.

Sebastian Marshall
Editor, TheStrategicReview.net

###

This is the end of Background Operations. It’s been a unique pleasure. If you haven’t gotten on a keystone yet, read Background Ops #2 — it might be life-changing for you.

Want to try out a Lights Spreadsheet? We’ve got templates and a best practices guide for you —

https://www.ultraworking.com/lights

Subscribed to TSR yet? This is the last issue of Background Ops, but the journey continues next week with Issue #1 of our next series, Unity.

You can subscribe for free and get an actionable, long-form essay every Thursday:

http://www.thestrategicreview.net

Is Unity going to be all boring and kumbaya-singing? Well, let’s see who is making an appearance during Unity:

“I am the punishment of God… If you had not committed great sins, God would not have sent a punishment like me upon you.” — Genghis Khan

Perhaps not! It should be interesting, eh? Thanks for all your readership and the great dialog over this series — and stay tuned to The Strategic Review for Unity starting next week.
