Our descent… but to where?

The Algorithm that will Save Humanity… or Destroy It.

The future of machine learning and how each of us can make a difference.

This is the algorithm we should all be familiar with:

This algorithm, also known as gradient descent, is clever in its simplicity. Brush away the nomenclature and what it means is the following:

  • I want to predict an outcome or result.
  • However, my predictive model is not very good so far, resulting in a large error.
  • Using a clever bit of mathematics, the gradient descent algorithm automatically adjusts the underlying assumptions (the parameters) of my model bit by bit over hundreds of iterations until…
  • Finally, it reaches a nadir of error, maximizing its predictive power. This is the “valley” in the three-dimensional graph above, i.e. the precise combination of assumptions that minimizes error.
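To make the steps above concrete, here is a minimal sketch of gradient descent in Python. It assumes the simplest possible model (a straight line fit to a handful of made-up points) and uses mean squared error as the measure of error. The function names, learning rate, iteration count and data are all illustrative assumptions, not taken from any particular library or course.

```python
def mean_squared_error(xs, ys, w, b):
    """Average squared gap between the predictions w * x + b and the true values."""
    return sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, learning_rate=0.05, iterations=2000):
    """Fit y ~ w * x + b by repeatedly stepping against the gradient of the error."""
    w, b = 0.0, 0.0  # initial assumptions: deliberately poor
    n = len(xs)
    for _ in range(iterations):
        # How wrong is the current model on each point?
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Take a small step "downhill" toward the valley.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Made-up data that follows y = 2x + 1 exactly (an assumption for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

error_before = mean_squared_error(xs, ys, 0.0, 0.0)
w, b = gradient_descent(xs, ys)
error_after = mean_squared_error(xs, ys, w, b)
print(f"w={w:.4f}, b={b:.4f}, error before={error_before:.4f}, after={error_after:.6f}")
```

The learning rate is the size of each downhill step: too large and the model overshoots the valley, too small and it takes far longer to get there.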

At this point, the model as we have defined it is as accurate as it will become and, if constructed correctly, it can do things like:

  • Recognize objects, handwriting and images
  • Drive a car
  • Fly a drone
  • Recognize speech
  • Predict sports and elections
  • Filter spam
  • Conduct conversations
  • Beat champions at Go and chess

We've gotten to the point where the above doesn't even need extensive explanation. In the last five years, these phenomena have become borderline mundane, as unsurprising and accessible as a cup of Starbucks coffee.

But what's next?

In the past four weeks, I have managed to squeeze out some time to do the following:

Each of these has provided an additional lens through which to view the impending changes, and each led me to the same conclusion:

Hold on to your butts. (Samuel L. Jackson, in Jurassic Park)

We are in for a wild ride. This was perhaps best illustrated by a panel speaker and blockchain expert at the MIT conference (shout out to Maciej Olpinski), who described the following scenario:

“One day, your refrigerator will realize you are low on milk. It will have a smart contract enabled that will trigger a direct payment to Amazon to buy more milk with no human intervention at any point in the entire transaction. This payment will then trigger a delivery by drone which, in due time, will arrive on your front lawn. Once arrived, your robot will go to pick it up and bring it back inside your house and restock your refrigerator.”

When I first heard this I instinctively chuckled. Then I had a strange sinking feeling in my stomach, somehow anticipating his next comment:

“Some of you may think this is crazy but it's coming and it's coming faster than you might think. Be prepared because it's coming whether you like it or not. If you want the Jetsons, this is how it's going to happen.”

And I knew he was speaking the absolute truth.

So what is so unsettling about the above scenario? It seems to be a matter of extreme convenience. Who wouldn't want to avoid the everyday distractions that we currently deal with? Who wouldn't want to live like this:

Rosie is coming… whether you like it or not.

For me, what is unsettling about the imminent arrival of Rosie the robot is that, like all the feats listed above, her behavior will largely be driven by a predictive model of error that leverages, yes, gradient descent. This model, as predicated on a set of desired results and undesired errors, will, at least initially, need to be defined, constructed and anticipated by humans. What terrifies me about this is the following: humans historically have been terrible at anticipating their own desires and errors.

Case in point: in reading Charlie Munger's thoughts on investing, a few things became immediately evident: a) he believes all investors are best served by drawing frameworks from multiple disciplines, b) he believes incentive structures and the power they hold over human behavior are akin to superpowers, and c) he thinks anyone who clings to efficient market theory is basically a moron.

The last one perturbs economists the most, which should surprise no one. But is he wrong? In 2008, our world economy was thrust into a global recession partially due to the false precision generated by our leading economic models and frameworks. These models assume things like a normal distribution of events (an assumption that books like The Black Swan have called into question) and an eventual regression to the mean.

One of Mr. Munger's favorite “Mungerisms” illustrates this nicely. What happens when all you have is a hammer? You tend to think all problems are a nail.

This “hammer mentality” has surfaced repeatedly in economics. Its most recent application, an expectation that mortgage defaults are normally distributed and will always regress to the mean, nearly sent the world into a second Dark Ages.

And now I fear it is happening in a field where the stakes are even higher: machine learning. Specifically, we are building ever more complex ways to reduce error without understanding where this reductive process is or should be headed. Who’s to say that we will not someday unwittingly end up in the category of “cost” and thereby be subject to a disastrous iterative regression?

Again, these concerns are nothing new. Stephen Hawking, Elon Musk, Nick Bostrom and others (including the standard-bearing piece by Wait But Why) are among the leading voices who have sounded the alarm about AI.

However, while many have sounded the alarm, none has so far settled on a solution.

Why are the stakes so high? Well, as Wait But Why illustrated so nicely, once AI hits an inflection point and accelerates, there is no holding it back:

How will this acceleration happen? Well, let's come back to gradient descent. Up until now, computers have needed us to define both the predictive model and the error in order for them to accomplish their superhuman feats.

However, what is the next obvious challenge for computers to tackle once “tasks” are commonplace and trivial? A computer will ultimately define its own predictive model.

I am sure this is already being done in a primitive fashion, and I have seen it even in Professor Ng’s course. However, I believe there are several additional ingredients which will jumpstart the acceleration of AI and its ability to direct its own agenda:

  • Natural language interfaces between bots (see the rise of chatbots and Microsoft Cortana as an “agent” to other bots, brokering conversations between other AIs)
  • Blockchain technology merging the internet of things with our global monetary system (the refrigerator example above)
  • And, the coup de grâce, quantum computing (as described by Canada’s PM)

How will each of these impact the future of gradient descent and thereby the future of humanity? Let's take them one by one.

A) I predict that natural language interfaces will take the place of today's APIs. Technical APIs, as defined today in product specs and pages and pages of documentation, will seem outright archaic in the years to come as platforms and bots literally speak to each other. Think of the climax of Her and (no spoilers) what became of the AI at the center of that story. How did she ultimately make the leap?

B) Blockchain technology is coming and it is both incredible and terrifying. Imagine a payment being triggered via the RFID tag on a shipping container as it enters the dock or, similarly, as a cereal box gets placed on the shelf. Now factor in that each step can be mechanically automated in the future: the payment, clearing and settlement of the transaction (via smart contracts and blockchains), the navigation of the shipping container and the placement of said cereal box. You then have a world where humans are, glass half-full, saved a lot of trouble. However, glass half-empty, we have also been disintermediated.

C) And finally, quantum computing, the ultimate X factor. It is only a matter of time before we solve this particular puzzle. When we do, what are the implications? Well, in taking Professor Ng's course, I now understand how important probabilities and “fuzzy math” are to AI and machine learning. AI rarely works in the sphere of black and white, but instead prefers to approach problems from a probabilistic perspective, with shades of grey between the 0’s and 1’s. To date, these calculations and predictions have effectively been hamstrung by our reliance on binary calculations, using 0’s and 1’s to approximate these “fuzzy” probabilities. With quantum computing we can effectively cut out the middle man. Computers will literally be able to think much more like we do, in analog rather than digital. They will be able to contemplate the world around them as the spectrum of signals that it is. And with this, they will make the leap to…


One final observation I drew from Charlie Munger's writings was his reference to the behavior of ants as “algorithmic”. For example, one tribe of ants is known to smell the pheromones of a dead ant and immediately carry the deceased back to the colony. However, if a live ant is painted with the pheromones of a dead ant, the reaction is the same: the ants will swarm and carry the live, thrashing ant back to the nest.

As such, Munger describes their behavior as “programmed”. While this is not far from the truth, I believe this description is still one step removed from the underlying driver of the ant’s or any organism’s behavior.

All organisms respond to the intrinsic positive or negative quality of their relationships to events, objects or actions.

In the example above, the ant tribe has a strong impulsive response to the signal of a dead ant's pheromones. While traditional Darwinian thought holds that this is a result of random mutations and survival of the fittest, I believe there is an additional variable.

I believe slight preferences on an individual level help to shape evolution over time. While these individual preferences may be somewhat determined by genetics, they are not quite the same thing. Millions of years ago, there may have been an ant who had an ever so slightly greater preference for responding to the pheromones of the dead. This preference, when expressed through action, increased the colony’s probability of survival by an infinitesimally minute factor. Over millions of years, these small expressions of individual preference add up and, combined with environmental changes and behaviors, you end up with the peculiar “programmatic” behavior of the colony.

So what does this mean for us?

Two things. First, if you believe that evolution is not only a result of dumb chance and mutations but can also be influenced by the expression of individual preferences and values on a society and species, then you come away with a pretty awe-inspiring view of humanity. Never before has the world seen a species with such generational diversity. With that diversity comes the potential for change and the ability for us to direct the evolution and progress of our species going forward by way of the expression of our own individual preferences and values.

Second, if organisms have evolved over billions of years from the simplest of signal responses (single cell organisms reacting to good, bad and meh events) all the way to complex programmatic behavior, is it possible for evolution to occur in the exact opposite direction? Can an entity in fact start with complex programmatic behavior and then, over time, evolve to a point whereby it can perceive the world across a spectrum of positive, negative and everything in between? Could it then express its own values and preferences through action?

It can, it will and its name is Rosie. And now is the time for us to define our own values so we can teach her. For our own sake and for hers.

Post Script

Thanks to Charles Frank for his insightful questions, one of which asked whether machines determining their own models is akin to unsupervised learning, which is already present today. Here are my thoughts on that topic:

Very good point on supervised vs. unsupervised. In my view, though, machines defining their own model will go even beyond what is currently called unsupervised learning.

The current best example of unsupervised machine learning is the ubiquitous “recommender” algorithm used by pretty much every online retailer and media subscription service today, including Netflix, Amazon and even Facebook (for recommending likely friend connections).

This is unsupervised to the extent that the algorithm is not given specific features or traits to regress on but rather determines these generic features itself.
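As a rough illustration of an algorithm "determining the features itself", here is a small Python sketch of one common recommender technique: factorizing a ratings table into hidden user and item features with stochastic gradient descent. It is a toy under stated assumptions, not how any of the companies above actually build their systems. The ratings, the number of hidden features and every function name are made up for illustration.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def squared_error(ratings, user_factors, item_factors):
    """Sum of squared errors over the known (non-None) ratings only."""
    total = 0.0
    for u, row in enumerate(ratings):
        for i, r in enumerate(row):
            if r is not None:
                total += (r - dot(user_factors[u], item_factors[i])) ** 2
    return total


def factorize(ratings, num_features=2, learning_rate=0.01,
              regularization=0.02, iterations=2000, seed=0):
    """Learn hidden features for users and items from the known ratings."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Nobody tells the algorithm what the features mean; they start as noise.
    user_factors = [[rng.random() for _ in range(num_features)] for _ in ratings]
    item_factors = [[rng.random() for _ in range(num_features)] for _ in ratings[0]]
    for _ in range(iterations):
        for u, row in enumerate(ratings):
            for i, r in enumerate(row):
                if r is None:
                    continue  # unknown rating: nothing to learn from
                err = r - dot(user_factors[u], item_factors[i])
                for f in range(num_features):
                    p, q = user_factors[u][f], item_factors[i][f]
                    # Small step that reduces the regularized error for this rating.
                    user_factors[u][f] += learning_rate * (err * q - regularization * p)
                    item_factors[i][f] += learning_rate * (err * p - regularization * q)
    return user_factors, item_factors


# Hypothetical ratings: rows are users, columns are movies, None means "not rated".
ratings = [
    [5, 3, None, 1],
    [4, None, None, 1],
    [1, 1, None, 5],
    [1, None, None, 4],
    [None, 1, 5, 4],
]

user_factors, item_factors = factorize(ratings)
# Predicted rating for user 0 on movie 2, which that user never rated.
prediction = dot(user_factors[0], item_factors[2])
print(f"predicted rating: {prediction:.2f}")
```

The learned features have no labels. They may end up loosely tracking something like genre or tone, but that interpretation is ours, not the algorithm's.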

This is a big step in itself but I think the next step will be even more mind boggling. Let me try to illustrate in a quick bullet list below.

  • Stage Six?
  • Stage Five: Unsupervised Learning (problem solving)
  • Stage Four: Supervised Learning (optimization)
  • Stage Three: Analysis (regression)
  • Stage Two: Execution (algorithms/methods)
  • Stage One: Calculation (mathematics)

Imagine the foundation of computer science gradually being built up in stages, from the simplest (stage one) at the bottom of the structure up through unsupervised learning, which is closer to unstructured problem solving.

My opinion is that there are two more immediate stages that are about to be crossed.

  • Stage Six: Unsupervised Asking (rhetorical learning/bounded action)

Beyond problem solving (like the recommender algo) where the problem is still defined (i.e. determine which movie the user will likely enjoy most), the next stage is for computers to ask and evaluate a series of questions to reach a desired outcome.

I saw signs of this in the Machine Learning course as well as at the Fintech conference. As someone who is a business analyst by training, this possible outcome does not bode well for my job security. Imagine if, in the future, you could ask Siri not only where the closest shoe store is but also how best to grow your company. And, as you mentioned in earlier notes, what if the computer were exactly right? That is a truly awesome and intimidating scenario.

The next stage beyond that is even crazier.

  • Stage Seven: Unsupervised Action (doing/unbounded action)

So if you think of unsupervised asking as solving a complex, open-ended problem, you can still see how it stays within a fixed and known constraint: the bounds of thought and analysis. However, in the future, as robotics, automation and VR/AR pick up speed, machines will no longer be limited to comprehension. They will enter (and have already entered) the realm of the physical.

In this scenario, what truly stands out as the tipping point, where things go either to hell in a handbasket or to utopia, is when machines' actions are no longer within a bounded constraint and expected outcome, i.e. we can no longer fundamentally predict their actions.

This is when humanity has to be really, really careful.