Beating Behemoths

While we like to believe the hagiographic portrayals of individuals single-handedly creating an autonomous vehicle or super-intelligent system, the reality is that there are billions of dollars, thousands of engineers and farms of servers behind each of Google’s machine learning innovations. The behemoths of tech are dominating research and development by any measure. Where’s the room for startups? This article will cover why the behemoths are working so hard in this field and, given those efforts, where there’s room for startups.

Big companies are serious about AI

Big companies acquired over 200 AI companies in the last five years. Google built custom processors for its engineers, Facebook pays AI developers seven-figure salaries, and more than 1,300 people work in Baidu’s AI research lab. We can summarize the AI-specific strategies of the behemoths very briefly, as follows.

Google: put AI in everything. Larry Page said in October 2000 that “artificial intelligence would be the ultimate version of Google”. Sundar Pichai stated that he wants to put AI in everything at Google, for both consumers and developers.

Amazon: create a new operating system and own the core infrastructure. The former is the Echo product and Alexa virtual assistant. Amazon is hoping that its voice interface becomes the basis of a new operating system for consumers, assuming some of the power currently held by Apple’s mobile operating system. The latter is the core AWS infrastructure and machine learning APIs (natural language understanding, automatic speech recognition, visual search, image recognition and text-to-speech).

Apple: invisible intelligence. Apple is using machine learning to improve its consumer-facing products, from Siri to the keyboard.

Microsoft: AI for the enterprise. Microsoft is building out a complete suite of services for enterprise customers, including the Cortana suite (intelligent agents), a chatbot framework (used by 40K developers), face recognition APIs, HoloLens, Pix (photo editing with perception), MileIQ (expenses) and Dynamics (intelligent CRM). This is in addition to the core Azure infrastructure, which offers the leading FPGA cloud, the fastest distributed framework for neural networks (CNTK) and CPU/GPU clouds.

IBM: Watson as a partnership vehicle to accumulate data. The Watson platform and partner ecosystem is used by companies with significant datasets such as Medtronic, Johnson & Johnson and Pfizer. The company hopes to continually improve Watson’s intelligent systems with data from these partnerships.

Facebook: maximize share of user attention with autonomous agents and AR/VR. Facebook sees 300M photo uploads, 4.5B likes, 4.75B shares and almost 1B comments per day. The company is using this trove of data to build image recognition, natural language understanding and collaborative filtering technologies that form the basis of intelligent agents to help you connect with more people around the world.


The financial imperative of big companies is to own the biggest market: cloud infrastructure. $111 billion of computing infrastructure spend will move to the cloud in 2016 and this may increase to $216 billion in 2020. More than $1 trillion in IT spending will be affected by the shift to cloud during the next five years. Machine learning workloads are both data- and compute-intensive, so owning those workloads means owning an inordinate amount of cloud spend on a per-customer basis. Each of the big companies is taking a slightly different approach to owning the machine learning portion of cloud computing.

Google: Kubernetes will reset the playing field. Then, Google can sell the best GPU/TPU cloud, optimized for TensorFlow and coupled with services (Voice API, Text API and Google Cloud ML).

Amazon: Pure platform with maximum flexibility. Host any and all open source, and work with every platform.

Apple: Secure personal cloud

Microsoft: Customized models for Enterprise hosted on Azure

IBM: Set up and manage models as a service for enterprise customers

Facebook: Some developer services but long term plans are relatively unknown

What not to do

This massive shift to the cloud has many implications for the machine learning ecosystem. Machine learning practitioners looking to get their product into the world by starting a company should think about these implications. The most helpful dichotomy when thinking about your potential startup with respect to the behemoths is horizontal versus vertical integration.

Horizontal integration is important if you’re in a low-margin, utility-like business like cloud computing. Be something to everyone, and get a little money from everyone. Tech’s behemoths will thus do the following.

  • Invest billions of dollars in computing infrastructure, then provide access to infrastructure at near-zero marginal cost. Google and Amazon can do this particularly well because they built the computing infrastructure for themselves, amortized it over the last decade, and can now earn revenue without incurring much cost.
  • Accumulate and manage the petabytes of data required to train general machine learning models. Facebook can do this particularly well because it optimized its social network with a multitude of behavioral signals over the last decade.
  • Distribute consumer products with embedded machine learning through high-scale manufacturing and marketing. Apple can do this particularly well because it built an incredibly efficient supply chain and brand over the last few decades.
  • Build the leverage required to sign channel partnerships through which to acquire large volumes of customer data. Microsoft can do this particularly well because it built such channel partnerships to sell Windows in the last few decades.
  • Endure long and expensive regulatory approval cycles. IBM can do this particularly well because it built a base of customers in regulated industries over the last century.

Examples of ‘horizontal’ products include:

  • Cloud computing infrastructure;
  • Basic statistical and visual analysis of the huge volumes of data stored in the cloud;
  • Machine learning as a service;
  • General image recognition, automatic speech recognition and natural language processing;
  • Consumer hardware with onboard machine learning;
  • Broad customer data networks; and
  • Diagnostic tools requiring FDA approval.

Startups are unlikely to have a comparative advantage building these products.

Glorious greenfields for startups

That said, the opportunities for startups are many. Every existing category of enterprise software, from sales & marketing to collaboration, will change with the advent of intelligent systems.

Every industry will see the benefits of systematically learning from huge volumes of data, from agriculture to pharmaceuticals.

Vertical integration is important if you’re in a high-margin business. Provide an excellent service to a few, and they will pay handsomely. Startups can thus do the following.

  • Integrate with disparate sources of private data.
  • Aggregate and clean customers’ private data.
  • Build products with predictive analytics features that solve a specific problem of commercial or industrial significance.
  • Build interfaces that collect specific data points from enterprise users.
  • Sell directly to enterprises in a specific industry, through channels specific to that industry.
  • Provide an excellent, personalized service.

Examples of ‘vertical’ products include:

  • Application Programming Interfaces that link objects and functions from different pieces of cloud infrastructure;
  • Advanced statistical analysis to extract value from private data;
  • Machine learning over private data to build predictive features;
  • Domain-specific image recognition, automatic speech recognition and natural language processing;
  • Enterprise software with workflow-specific data collection; and
  • Niche customer data networks.

The bonus for startups is that they can use the infrastructure of the big companies to move faster once they’ve done the data collection and business development necessary to solve a vertical-specific problem. Another bonus is that, while the behemoths dominate headlines about general artificial intelligence, those companies are not leading the conversation in industry-specific fora. Startups today can become AI thought leaders in industries without one.


The strategic imperative for tech’s behemoths is to win the biggest markets. The biggest market in tech right now is cloud computing, and the biggest cloud workloads are data-intensive machine learning applications. This means that startups have to go high and narrow, up the application stack and deep into industries. The good news is that the aggregate opportunity across all functional categories of enterprise software and all industries is much bigger than the cloud opportunity alone. The only requirement is focus!
