How Machine Learning Can Make Something New

David Vandegrift
6 min read · Oct 31, 2018

We’re entering a new phase of industry adoption of artificial intelligence. The leaps-and-bounds progress of the past few years has made clear to everyone in the business world the potential for radical change in their industries. Accordingly, most technology-conscious professionals have made an effort to learn about this paradigm-shifting technology.

We’re now at the point where people know enough to be dangerous, as I learned firsthand at a recent conference. Two influential attendees (a VC and a corporate development professional) separately expressed the same fundamental misconception about the capabilities of machine learning. In both cases, the individuals stated — with confidence — that machine learning could not use past data to generate novel or unexpected outcomes. In other words, ML could only perpetuate the past, not change the future.

To most people with a passing familiarity with statistics, this flawed conclusion probably feels intuitive. At its core, machine learning is optimization layered on statistics to draw patterns out of the past. And statistics is largely descriptive: sample enough members of a population and you can start drawing conclusions about the group as a whole. This superficial understanding of the field might lead to reasonable-seeming conclusions like the following.

Let’s say you have an opaque jar of hundreds of marbles. You pull out 10 of the marbles, counting 6 blue and 4 green. You can then make some assumptions about the overall contents of the jar, such as the ratio of blue to green marbles. You might even hazard a guess about colors you haven’t drawn, like yellow. But you have no data to suggest the jar contains cookies, so it would be unreasonable to make any assumptions about the distribution of oatmeal versus chocolate chip.
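The jar’s limits are easy to make concrete in a few lines of code. Everything here is invented for illustration: the jar’s true contents, the sample size, and the seed are all assumptions.

```python
import random

# A hypothetical jar whose true contents are hidden from the sampler.
jar = ["blue"] * 60 + ["green"] * 40
random.seed(42)

# Pull 10 marbles and estimate the color ratio from the sample.
sample = random.sample(jar, 10)
blue_ratio = sample.count("blue") / len(sample)
print(f"Estimated share of blue marbles: {blue_ratio:.0%}")

# The estimate only speaks to categories we have actually observed;
# nothing in the data licenses a conclusion about cookies.
assert "oatmeal" not in sample
```

The point of the sketch is the last line: descriptive sampling can only interpolate over categories that appear in the data.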

With a basic understanding, machine learning may feel like the jar. You might say that ML can predict blue and green, but can’t tell you anything worthwhile about cookies.

In actuality, machine learning is more like a pantry than it is like a jar. Let’s say you’ve got flour, sugar, milk, and eggs sitting on a shelf. Let’s also say you’re a Cro-Magnon who has never tasted the joy of baked goods. If you were limited to the basic understanding of statistics (or ML), then your understanding of the possibilities of your shelf would be limited to combinations like sugary milk or strangely goopy flour.

But that’s not how Cro-Magnons, statistics, or machine learning actually work. Instead, you begin trying different things. You mix the flour, milk, and eggs in random proportions. You leave it on a hot rock for a couple hours and get a weird crusty concoction. You try again with sugar this time and an even hotter rock. You try again, and again, and again. You begin to learn properties: more sugar means sweeter, more eggs mean a sturdier structure. Pretty soon, you can begin to make predictions about combinations you’ve never seen before: you know without having to try it that flour and sugar on their own won’t do anything good without some liquids mixed in.
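This trial-and-error loop can be sketched as a toy model. Everything here is hypothetical: the `bake` function stands in for the hidden physics of the hot rock, its coefficients are made up, and the “learning” is just a least-squares slope. Still, it shows the key move: inferring properties from random trials and then predicting a mix that was never tried.

```python
import random

random.seed(0)

# Hypothetical hidden physics the baker never sees directly:
# sweetness depends on sugar, structural sturdiness on eggs.
def bake(sugar, eggs):
    sweetness = 2.0 * sugar + random.gauss(0, 0.1)
    sturdiness = 3.0 * eggs + random.gauss(0, 0.1)
    return sweetness, sturdiness

# Trial and error: mix random proportions and record the outcomes.
trials = [(random.uniform(0, 1), random.uniform(0, 1)) for _ in range(50)]
results = [bake(sugar, eggs) for sugar, eggs in trials]

# Infer each ingredient's effect with simple least squares (no intercept).
def slope(xs, ys):
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

sugar_effect = slope([t[0] for t in trials], [r[0] for r in results])
egg_effect = slope([t[1] for t in trials], [r[1] for r in results])

# Predict a combination never actually baked: lots of sugar and eggs.
print(f"sugar effect ~ {sugar_effect:.2f}, egg effect ~ {egg_effect:.2f}")
print(f"predicted sweetness of an untried mix: {sugar_effect * 0.9:.2f}")
```

The learned effects recover the hidden coefficients closely, which is what lets the model generalize to mixes outside its “training data.”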

Don’t confuse cookies with marbles

This process of trial and error to understand underlying properties is closer to how machine learning works. While basic statistics often feels more like the jar metaphor, the reality of ML is much closer to how humans themselves figure things out. Just as we’re not limited to basic linear combinations of inputs, advanced machine learning models can piece large datasets together into complex, non-linear shapes. They can come up with unintuitive — even novel — conclusions.

Part of the reason for the misconception about ML’s ability to generate novelty is how it’s explained. Experts try to dispel the hype by saying that it’s “just math” or “just fancy statistics”. And while those explanations aren’t wrong, the inferences that non-technical audiences draw from them very well can be.

One of the key differences between basic statistical models and even moderately sophisticated ML algorithms is scope — both of the action space and of the layers of abstraction. The action space in our original jar example was very simple: you could choose blue or green. There were no layers of abstraction: there were just marbles. But even the very simple baking-shelf metaphor quickly becomes more complex: you can choose among four ingredients, with no real limit on the amount of each to mix. Likewise, you can infer properties of the ingredients (a layer of abstraction) rather than just interpolating between the mixes you’ve already tried. Four input variables and a single layer of abstraction would make for a trivially simple ML model; advanced models have tens of thousands of inputs and dozens of abstraction layers.

Another key difference is the role of chaos (or variability and randomness). With the marble jar, the only randomness is in your sampling: sometimes you’ll get 4 green marbles, sometimes you’ll get 5. The baking example introduces quite a bit more: the temperature of the sun, the humidity in the air, and the bacteria in the flour can all have a meaningful impact on the outcomes of your experiments. An unsophisticated approach to statistics might assume that these factors confound your ability to derive meaning from your data, but in actuality these chaos factors can lead to more interesting and novel patterns. Just as the earliest humans had to wait for lightning to create the right conditions for fire, the most complex ML models rely on a degree of chaos and randomness to fully explore the potential of a given domain.
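One standard way to see randomness doing useful work is the epsilon-greedy bandit, a deliberately simplified stand-in for how learning systems trade off exploiting known patterns against exploring new ones. The three “recipes” and their payoffs below are made up for illustration.

```python
import random

random.seed(1)

# Three hypothetical recipes with hidden average payoffs; which is best
# is unknown to the learner at the start.
true_payoffs = [0.2, 0.5, 0.8]
counts = [0, 0, 0]
totals = [0.0, 0.0, 0.0]

def estimate(i):
    # Current average observed payoff for option i (0.0 if untried).
    return totals[i] / counts[i] if counts[i] else 0.0

# Epsilon-greedy: mostly exploit the best-known option, but a dash of
# randomness (the "chaos") keeps probing options that look worse so far.
epsilon = 0.1
for _ in range(2000):
    if random.random() < epsilon:
        choice = random.randrange(3)          # explore at random
    else:
        choice = max(range(3), key=estimate)  # exploit the current best
    reward = random.gauss(true_payoffs[choice], 0.1)
    counts[choice] += 1
    totals[choice] += reward

best = max(range(3), key=estimate)
print(f"Discovered best option: {best}")  # converges to option 2
```

Without the random exploration step, the learner can lock onto the first mediocre recipe it tries and never discover the best one.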

One of the best illustrations of AI’s creative ability and the role of randomness is in the world of visual art. AI has been creating works of art for years now, as I showed in my article on AI exhibiting creativity. More recently, a piece of AI-generated art sold for over $400K at Christie’s. AI art is still largely abstract, but it is undoubtedly novel (thanks to abstraction and randomness!).

To make this practical, take a look at one of the watershed moments in AI’s development: AlphaGo’s victory over Lee Sedol in the game of Go. There was one particular move in the second game by AlphaGo that I believe provides some of the best evidence for the creative potential of AI (and ML more specifically). The conditions for this move had likely never happened in the history of Go and will likely never happen again — there are simply too many combinations on a Go board for late-stage moves to occur frequently. As such, AlphaGo’s training data did not contain the specific move; instead, the software had to make assumptions based on the patterns it had inferred from other situations. And that’s just what it did: AlphaGo made use of more general principles to make a move that shocked the professional Go community. Only in hindsight were commentators and spectators able to understand the “brilliance” of this move, a truly creative and original action.

Even more practically, we use past data to shape a different future every day at 4Degrees. One of our main intelligent features makes recommendations to our users about how often they should be engaging with their connections. It does so based on past behaviors of communication: how many times the user has talked with the connection, how the communication has been clumped, and the general nature of the connection’s relationship with the user. Based on the patterns it finds, our software makes recommendations for how often the user should be staying in touch going forward.
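As a rough illustration only (emphatically not 4Degrees’ actual model), the core of such a feature can be sketched as: measure the gaps between past interactions, take a robust summary of them, and project forward from the last touchpoint. The function name, the sample dates, and the median heuristic are all my own assumptions.

```python
from datetime import date, timedelta
from statistics import median

def recommend_next_touch(interaction_dates):
    """Suggest the next touchpoint from the cadence of past interactions."""
    dates = sorted(interaction_dates)
    gaps = [(b - a).days for a, b in zip(dates, dates[1:])]
    cadence = timedelta(days=median(gaps))  # median is robust to bursts
    return dates[-1] + cadence

# A hypothetical roughly-quarterly relationship.
history = [date(2018, 1, 5), date(2018, 4, 2),
           date(2018, 7, 3), date(2018, 10, 1)]
print(recommend_next_touch(history))  # -> 2018-12-30
```

Using the median rather than the mean is one small way a model avoids naively perpetuating the past: a one-off burst of a dozen messages barely moves the recommended cadence.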

The unsophisticated view of ML (as exhibited at the conference I attended) would assume that our software is simply suggesting to continue the past patterns of engagement. But that doesn’t play out in reality. Our models are intelligent enough to recognize project-based engagement, for instance. A VC user may have gone back and forth with an entrepreneur a dozen times in a short time span, but our model has seen patterns like that enough to know that the VC may not be interested in keeping up a near-daily cadence going forward. Or it may have seen that a quarterly check-in fell off the calendar 6 months ago, and that it’s time to reach back out and revitalize the relationship.

Our software makes these non-linear suggestions all the time; they don’t always seem to make sense at first, but when they land they strike our users as “emergent intelligence” — almost as if the software possessed a level of sophistication that couldn’t be explained as “just statistics”. At the end of the day, it is — of course — just math. But when you’re used to thinking of ML as marbles in a jar, you may be astounded to pull out a snickerdoodle.

Shout out to Heather Noe and Kathryne Dunlap for this article’s gorgeous visuals!


David Vandegrift

Founder/CTO @ 4Degrees, former venture capitalist, D&I advocate, lots of AI