Data Science & Life— A Strange Analogy.
In a machine-learning interview the other day, I was describing a well-known algorithm to cluster a group of points into k clusters (k-centers) in Euclidian space: Lloyd’s algorithm. It’s quite simple, really, and so I’ll describe it in general terms here:
Given a distribution of points:
- Randomly pick k centers.
- For all n points, assign it to its closest center (squared distance).
- Simply pick new centers based on the clusters of points we have now.
- Repeat 2–3. Stop when our k centers do not change more than a small threshold ε.
Here’s a neat visualization that employs Lloyd’s algorithm to cluster points in 2D space:
Now, here’s where things get interesting.
Turns out, Lloyd’s algorithm will usually converge— but most of the time, you won’t get the absolute best or most optimum clustering. Even for two clusters, finding the placement of centers so that we have the least-distance clustering is NP-hard (meaning it can’t be done in polynomial time). For most scenarios, NP-hardness is bad. Very bad.
So what does that mean for data scientists— machine learning specialists?
There’s a couple strategic guesses that can help (based on point density, etc) but more often than not— you settle! Lloyd’s algorithm finds you a local minima every time it runs, for every initial placement of centers. When the centers stop moving, moving them one way or the other is only going to make the clusters worse— in a local sense.
I say you can draw parallels between Lloyd’s algorithm and how life operates:
- It all starts with randomness. Where you start is not a matter of choice. People are born into their environments— into their situation. Everyone wants to go places— everyone wants to be happy, to be the best person they can possibly be. But the possibilities— the opportunities— a large portion of it is where you start. Just like in our clustering algorithm, you can’t end up at the best possible clustering if your first pick at centers are all in the same corner of your plane.
- You optimize with what you know. Look, I’d love to be able to make decisions on a daily basis with a complete map of how the rest of my life is going to play out. I don’t— I can’t— do that. I act greedily, meaning on the basis of what I can make better right now. What can I do immediately based on what I know now? You see— that’s how Lloyd’s algorithm works— iteratively. It’s not a one-step learning process.
Going back to the fact that you can’t guarantee (or even expect) to arrive at the globally best clustering, this takes us to a (possibly) depressing fact of life, and the lesson that Lloyd’s teaches best.
You’re naturally aiming for a local maximum, with no global maximum in sight.
Here’s what I mean:
- When you marry someone, you’re picking the best option you have at that time. You have to. You honestly don’t know whether this is your soulmate— the best person out there for you (“global maximum”).
- When you pick a job, it’s about the best offer you think you can get. You don’t know what the best thing is going to be for your career— the best fit job (“global maximum”).
The list goes on— and this methodology is in every decision we make. At every turn, you move forward with incomplete information and with the conscious sense that you’re going towards the local peak of some abstract graph.
I’d like to end by noting one difference:
With clustering, time doesn’t help. It doesn’t matter how many iterations you make of Lloyd’s algorithm. Once you’ve started, you can deterministically say what the local maximum (or minimum, depending on what exactly you’re optimizing) is going to be. The only randomness is at the beginning.
With life, I’d like to believe you have a choice. It’s up to you to introduce more randomness— to shake things up a little bit.
Don’t let your life be deterministic.
If you found this blog post at all relevant to you or thought it was cool pressing the “Recommend” button below would mean a lot to me! ☺