What the Math says about Life, Effort, and Failure in 2019…

Aditya Tyagi
Published in Analytics Vidhya
4 min read · Jan 1, 2020

As the year (or, more dramatically, the ‘decade’) comes to an end, you might find yourself asking whether the past year/decade was overall positive or negative for you. A bunch of low points may come to mind, leaving you gloomy: you may have gotten terrible grades, been rejected from a dream school/internship/job, or been passed over for a promotion. You might even have been passed over by a romantic interest! You may then be consoled by (hopefully more than a few) moments where you felt elated: a new passion may have found you (dance, yoga/fitness for me), you may have reached a milestone (graduation), or you may even have embarked on new adventures in uncharted territories (grad school/Chicago for me). Despite the positives, you might find yourself dwelling on all your ‘failures’.

However, I’d like to share with you a more analytically inspired way of viewing success — using the statistical technique of regression.

‘Modern’ regression in a picture c. 2019

Most of us are familiar with regression models — fitting lines through points in the hopes of predicting the future — an activity our caveman ancestors likely attempted in their never-ending quest to find the next bison herd.

Regression modelling c. 10,000BC

When assessing how ‘successful’ our regression model is, we often use a metric called R² (pictured below), and prefer models with a higher R² to those with a lower one.

How does R² apply to life?

And here is the punchline for the philosophically inclined caveman:

In life (as in regression), rather than measure how often we succeed/fail (‘y’ dependent variable), we should try to maximize our R² value: “how much of the dependent variable were we able to influence using factors under our control?”

How does this apply to something we’ve all done at some point in our lives (and dreaded) — preparing for an exam?

Rather than decide whether we succeeded or failed based on how high we scored on an exam, we should decide based on how many of the variables within our control we actually controlled:

  • Did we spend adequate time reading the textbook? (can control)
  • Did we at least try to solve some of the homework problems? (can control)
  • Did we not fall asleep in the front row of the lecture hall (like I frequently do)? (can control — this is debatable) ;)
  • Did we take a safe route to campus rather than the dangerous one? (can control)
  • How hard were the questions the professor decided to ask on the test? (cannot control)
  • How did everyone else perform? (cannot control)
  • Did it rain on the way to class, or did our calculator malfunction? (cannot control)

And this is the message I’d like to leave you with as the decade draws to a close: a person succeeds/fails in proportion to the degree to which he is able to influence factors that were under his control, and not what his final outcome was. So, always increase your R² in life. :)

— — — — — — — — — — — — — — — — — — — — — — — — — —

An (optional) statistical footnote for the mathematically sophisticated caveman. The regression story in four bullet points:

  • We try to predict a dependent variable (the number of bison in a herd), using independent variables (today’s temperature, the number of clouds in the sky, the amount of grass nearby, etc.)
  • We assume the world works in the following way:
$$ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon $$
y is the dependent variable (# of bison in herd); the betas are fixed, and the x’s are independent variables that can change (sunlight, temperature, etc.); the squiggly ‘e’ (ε) is some random quantity
  • Only some “higher beings” (divine entity, aliens, etc) know what the true beta values are, what value the random ‘e’ term will take, and use this to decide how many bison will be in a herd. Unfortunately, those higher beings are not particularly agreeable to sharing those values with us. Luckily, our tribe’s priest/shaman/druid is able to do some magic and estimate the betas.
  • Since the dependent variable can vary due to only two reasons: (1) changes in the ‘x’ values; (2) changes in the ‘e’ term, a strong model should be able to attribute “as much change as possible in the y variable to changes in the x variable”. Mathematically, the variation explained by the model is
$$ \sum_i (\hat{y}_i - \bar{y})^2 $$
where y with a line on top ($\bar{y}$) is the average of y, and y with a ‘hat’ on top ($\hat{y}_i$) is the value we predicted using the x variables. The total variation is
$$ \sum_i (y_i - \bar{y})^2 $$
where y without a hat on top ($y_i$) is the actual value of the dependent variable, and y with a line on top is the average of y (as before). Their ratio is
$$ R^2 = \frac{\sum_i (\hat{y}_i - \bar{y})^2}{\sum_i (y_i - \bar{y})^2} $$
The higher this number, the more useful our bison predicting model is (called the R² or coefficient of determination)
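
For the caveman who would rather compute than chant, here is a minimal Python sketch of the footnote: fit a line with one ‘x’ variable, then compute R² as explained variation divided by total variation. The data (grass versus bison) is made up purely for illustration, and the helper names `fit_line` and `r_squared` are my own, not from any library. It assumes the simplest case of a single predictor fitted by ordinary least squares.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (beta0, beta1)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta1 = sxy / sxx              # estimated slope
    beta0 = y_bar - beta1 * x_bar  # estimated intercept
    return beta0, beta1


def r_squared(y, y_hat):
    """Explained variation divided by total variation."""
    y_bar = mean(y)
    explained = sum((yh - y_bar) ** 2 for yh in y_hat)
    total = sum((yi - y_bar) ** 2 for yi in y)
    return explained / total


# Hypothetical observations, invented for this example only.
grass = [1.0, 2.0, 3.0, 4.0, 5.0]       # amount of grass nearby (x)
bison = [12.0, 19.0, 33.0, 38.0, 52.0]  # number of bison in the herd (y)

beta0, beta1 = fit_line(grass, bison)
predicted = [beta0 + beta1 * g for g in grass]
r2 = r_squared(bison, predicted)

print(f"estimated betas: {beta0:.2f}, {beta1:.2f}")
print(f"R^2: {r2:.3f}")
```

With a least-squares fit that includes an intercept, this ratio always lands between 0 and 1, which is why it reads naturally as “the share of the change in y that the x’s account for”.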


I like dance, data, reading, and telling great stories. I make memorable observations about life & everyday experiences. I’d like to share them with you.