Jedi Lessons in Analytics — The Force

Episode 1 — Perspectives in Data Science

6 min read · Jun 5, 2017

Curious, you are.

Before we begin, let’s get this out of the way:

  • Han shot first. Actually, Greedo had no chance to shoot at all.

This is not about one movie or one series. I’m going to teach you to look at that long ago, far away galaxy in a new light. Or darkness. Whatever feels right.

Now…

The Force Defined

You have much to learn. Let’s start with this statement, repeated in various forms throughout the Star Wars films…

It [the Force] is an energy field created by all living things. It surrounds us and penetrates us; it binds the galaxy together.

Sound like anything else you know? How about… data?

Maz gets it, and she’s not even a Jedi.

Data surrounds us. At the same time, we are constantly creating it. Every action, every decision, every thought creates more data that exists forever.

Every move you make, every step you take, every claim you stake…

Sorry, I got a little distracted there. But yeah, somebody is ‘watching’ us in terms of tracking each piece of data we generate.

Pictured: totally not a Jedi. Or even a Sith.

With one minor alteration, even Agent Smith seems to get it.

It is data that connects us, data that pulls us, that guides us, that drives us. It is data that defines us, data that binds us…

And that dude was literally made of data. I crossed stories, but try to deny the sentiment.

Forgive the movie-hopping, but that dialogue fits my point too well to ignore.

Point being, data surrounds and empowers us no less than the Force. Its use can be just as rewarding or just as dangerous. Or both.

Light and Dark

Data doesn’t really have a ‘light side’ and a ‘dark side’. Ever since I first saw Star Wars as a child, though, I never really believed the Force was divided so cleanly, either. I could accept the idea of an all-pervasive force existing throughout the universe, but I didn’t understand why it would have borders.

I was an inquisitive (read: annoying) child on occasion (read: all the time).

Whatever else it is, in the hands of an appropriately sensitive being, the Force provides power. So does data. And the power of the Force can be used for Good or Evil (or areas in between), depending on the will of the user.

So can data.

It would be easy to discuss reporting bias and analytic assumptions at this point, trying to prove my case, but why don’t we just ask Obi-Wan Kenobi about what happened to Anakin and how he justified lying to Luke?

“So what I told you was true, from a certain point of view”

Or maybe the script for The Empire Strikes Back was rewritten, and Anakin became Vader at the same time (in our world) that Luke and Leia became brother and sister. Frankly, Anakin’s transformation was the less creepy of the two rumored script changes, based on prior events.

This scene isn’t even about a misuse of the Force. This was a misuse of data.

Does that make it any less powerful? Do the ends justify the means?

How many times is that question raised in this series, in terms of how the Jedi and the Sith attempt to vindicate their use of the Force to achieve their goals?

What was that about the Jedi compared to the Sith?

I could write a treatise on how the similarities outweigh the differences, and how several examples of Jedi power in the films (say, mind control?) aren’t usually considered ‘good’ or ‘noble’ in most stories.

But that would be long and off-topic, and I suspect it’s going to be a significant theme in the remaining two films of the third trilogy, so let’s move along…

Data Surrounding Us

I am not trying to say that data is a mystical force, although some might.

He finds your lack of faith disturbing.

I am saying that it exists all around us: in our computers, in our phones (even the old ones), in wrist watches and televisions and cameras.

It’s also surrounding us in the air itself, when you consider that we’ve been using wireless communication since Marconi first set out to popularize the radio. Add in television, communications, wireless devices, and satellites, and it becomes difficult to deny that we are truly and completely surrounded.

Avoiding the Dark Side

I know, I said I didn’t buy into the idea that the Force had a Dark Side. That does not mean I don’t believe that people do.

The Star Wars films make it pretty easy to tell the heroes from the villains, most of the time, especially where the Force is involved. Even Yoda claimed that the Dark Side was almost impossible to escape.

Once you start down the dark path, forever will it dominate your destiny.

Tend towards hyperbole, he did.

Analytics and data science do not typically hold that type of power over the analyst, but the principle still applies.

  • It is, unfortunately, very easy to modify an analysis by excluding certain data, especially if you find a process that tells you to exclude outliers.
  • It’s just as easy to tailor the results by applying assumptions that seem totally reasonable to anyone familiar with the project domain.
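To see how small the first step can be, here is a minimal sketch in Python. The numbers are invented purely for illustration, and the names (`weekly_sales`, `drop_outliers`, `report`) are hypothetical, not part of any real project or library. It assumes a simple "distance from the median" rule for spotting outliers, which is only one of many possible rules.

```python
from statistics import mean, median

# Hypothetical weekly figures, made up for this example only.
weekly_sales = [102, 98, 105, 101, 97, 40, 103, 99]


def drop_outliers(values, max_distance):
    """Keep only the values within max_distance of the median."""
    center = median(values)
    return [v for v in values if abs(v - center) <= max_distance]


def report(values, max_distance=None):
    """Summarize the data and state how many points were excluded."""
    kept = values if max_distance is None else drop_outliers(values, max_distance)
    return {
        "mean": mean(kept),
        "kept": len(kept),
        "excluded": len(values) - len(kept),
    }


everything = report(weekly_sales)                # nothing excluded
trimmed = report(weekly_sales, max_distance=10)  # one bad week quietly dropped

print(everything)
print(trimmed)
```

Dropping a single point moves the average from roughly 93 to roughly 101. Neither number is wrong on its own. The trouble starts when only the second one reaches the stakeholder and the `excluded` count never does.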

This is where we find a classic example of the slippery slope theory.

Small acts of selfishness or convenience can lead a potential Jedi to greater acts of evil, until he is well along the path to the Dark Side and unable to turn back. This is one path to becoming a Sith Lord.

Similarly, making small adjustments to your analytic results can seem perfectly reasonable when you’re pressed for time, or your results align with the stakeholder’s desired scenario with the exception of just a few statistically insignificant data points.

As time passes, it becomes easier to make such adjustments more often and with less reason, because you’ve become known as an analyst who gets the job done quickly and produces results that the stakeholders like.

Wielding The Force

As an analyst (or data scientist, or padawan), you have a lot of power at your disposal. By managing the presentation of the data, you are effectively controlling access to the data. Do it honestly, and do it right.

Thirty years of data collection led to this moment, once R2-D2 finished its nap.

And if it helps to think of yourself as a Jedi (or a Sith), wielding a power that most people couldn’t control and don’t even understand, go for it.

Just maybe don’t go around trying to act the part. Keep that to yourself.

Force-choking is right out, no matter how annoying some people can be. Really, just don’t.


Greg Anderson
Creative Analytics

Founder of Alias Analytics. New perspectives on Analytics and Business Intelligence.