How Startup Culture Saves — or Kills — Open Source and ML

Steve Moore
Inside Machine learning
Jan 15, 2018 · 4 min read

The more data scientists I meet, the more I’m struck by how many come from the world of academia — from fields like physics, chemistry, biology, and information science, where they’ve used machine learning and other techniques to do everything from processing accelerator data to visualizing molecules to modeling ecology.

And when they talk about this work, the operative pronoun is almost always “we”. They describe their own teams and their wider fields of study in language that echoes the long academic tradition of collaboration: learning from one another, trading ideas, and celebrating shared successes. Over the decades, that ethos around data science, born in academia, has permeated projects across industry, non-profits, and government, where experts embrace collaboration to confront the toughest data science problems of our time — from self-driving cars to resource allocation to geo-engineering.

But why exactly did the spirit of collaboration survive and thrive beyond its origins in academia? That wasn’t inevitable. Before we dive into startup culture, let’s take a minute to ask what made collaboration work in the first place.

Three things come to mind:

  • Lowered expectations
  • Continuous experimentation
  • Narrow focus

I know that’s an odd mix of factors, but let me elaborate…

Lowered expectations

Most of us have heard of the many “AI winters” that fostered fears that AI would never achieve its imagined potential. That recurring sense of disappointment meant that, for years, the most adventurous research was often conducted by academics rather than by commercial firms eager to monetize it.

With the rise of bolder entrepreneurs and startups over the last 10 to 15 years, that’s changed. Now, venture culture — as much as academia — has found a role in pushing the edges of AI research.

But for AI researchers at colleges and universities, those long years out of the limelight had unexpected consequences. In particular, the collaborative ethos of academia had a chance to get woven into the bones of how AI work gets done: open-source code, open-source tools, and a general framework for honoring those who contribute their work for the benefit of all. By the time commercial applications became viable, the ethos was established — and teams at firms like Google, Amazon, and IBM simply inherited that culture and fostered it.

Continuous experimentation

Innovation in AI and machine learning has always meant feeling forward through experimentation. We might think of the AI winters as a sequence of failures, but from another angle they’re a testament to decades of tireless exploration.

And now, of course, that exploration and experimentation inform the work outside of academia. Here again, startups and entrepreneurs in particular have embraced the idea of failing fast and often — on their way to insights that cascade into bigger breakthroughs.

Startups and data science share a culture of following hunches, testing theories, and seeing what comes back. Progress and insight rely on trying — and often failing — with small experiments inspired by big questions. That doesn’t mean hunting blindly in the dark. The goals can and should be clear, but how to reach them is rarely obvious.

Narrow focus

Academics are sometimes criticized for working on problems so specific and siloed that they miss the chance to point their efforts toward a common goal. It’s a criticism that’ll sound familiar to passionate entrepreneurs who know how quickly grand goals become paralyzing distractions. In the case of AI, we could say that the grand goal is artificial general intelligence (AGI). As a goal, it’s wildly ambitious, distant, and still fuzzy — and reaching it will mean incredible teamwork.

But does that mean researchers should abandon their narrow focus to work on integration? Actually, no. As we inch toward AGI, we’re seeing a vindication of narrow focus. Earlier, I gave a rosy picture of collaboration in academia. That’s accurate, but it’s also true that AI researchers have tended to organize themselves into discrete camps. In a great post on pwc.com, Alan Morrison and Anand Rao describe five tribes, each with its favored approaches and algorithms: Symbolists, Bayesians, Connectionists, Evolutionaries, and Analogizers.

By focusing narrowly and going deep, each tribe has been able to make progress that might have been impossible if they were each trying to incorporate the full spectrum of other AI research. In an interesting twist, Morrison and Rao point out that the narrow focus is paying surprising dividends now that we’re seeing more collaboration between the tribes. There’s a dawning recognition that intertwining the array of machine learning methods is the only hope of addressing the most complex challenges — from AGI to quantum modeling to neurophysiology.

Not that it’s easy

Collaboration got baked into the AI culture, but that doesn’t make it easy or straightforward. Just as an example, we’ve all witnessed the hassles of open source development, which can feel unwieldy and uneven if contributors aren’t gracefully encouraged and guided. And as with any team effort, disagreements and hurt feelings are inevitable. But happily — despite the occasional frictions — the culture of collaboration is strong.

We seem to be on the edge of great advances in AI and machine learning — and we’re likely to see the ongoing co-evolution of machine learning and consumer products, along with wide-ranging progress in healthcare, education, biology, and beyond. But as exciting new businesses continue to build themselves around AI, we need to recognize that the economics threaten to undermine our common commitment to open source, open platforms, and shared ideas.

I think whether those causes live or die is up to startups and entrepreneurs. They’ll either be the heroes or the villains of open source and machine learning. I’m holding out for the ‘hero’ option — but we’ll only get there if we keep fostering the camaraderie that’s already taken us so far.

