Clausewitz was a Data Strategist

Sarah Catanzaro
6 min read · Jan 26, 2017


(This post was adapted from a talk given at the San Francisco Machine Learning meetup)

Meet Eliza:

Eliza completed her undergraduate CS degree at an Ivy League university before taking a developer position at Google. After one year, she left to pursue a PhD in biomedical informatics. While preparing her PhD research, she developed a tool to more efficiently execute her studies. She saw the tool’s widespread applications and with the blessing of her PhD advisor (a consultant to Facebook), Eliza paused her academic incursion to start a machine intelligence company.

Before departing for Silicon Valley, Eliza recruited a technical co-founder from her PhD program. They moved into a tiny two-bedroom apartment in the Sunset where they developed an initial prototype. Shortly thereafter, Eliza secured seed funding through an AngelList syndicate and from some prominent micro-VCs. They hired a few well-paid data scientists and full-stack engineers and sublet “real” office space. After an ambitious cold-calling campaign and several on-site visits, Eliza won her first PoCs with two Fortune 500 companies.

Eliza sighed…after all her hard work, she had made it.

But let’s fast forward one year…

Eliza is burning $200–300k/month and has 6 months of runway — she needs to raise a Series A. Her pipeline is replete with glossy logos but her PoCs are not converting into multi-year contracts. She hasn’t defined a repeatable sales process and without evidence of product-market fit, Eliza can’t find an investor to lead her round.

Eliza’s story is too common. Although many machine intelligence startups raised seed funding at the peak of the AI hype cycle, without strong recurring revenue or a well-defined sales playbook, they cannot raise additional rounds.

The failure of the land-and-expand strategy is at the crux of this dilemma. While machine intelligence startups expected their PoCs to convert into larger deals, widespread enterprise adoption of ML products hasn’t occurred. To better understand this phenomenon, the Canvas Ventures investment team interviewed executives at several F500 companies, who told us that they couldn’t specify success criteria for ML products, find extensible use cases, or cultivate avid advocates. When push came to shove, they were unable to show clear business impact.

What’s going on here?

Enterprise adoption of machine intelligence products will not accelerate until both vendors and customers implement data strategy. Startups and other machine intelligence vendors can deliver business value if and only if they understand or help define their customers’ data strategies.

Pundits and business journalists have expounded upon the utility of data strategy but haven’t answered: what is data strategy?

While machine intelligence is an offset technology that can disrupt the way enterprises do business, machine intelligence is NOT a data strategy.

To explain the structure, intent, and implementation of data strategy, I will borrow from Carl von Clausewitz, a Prussian general who authored a systematic, philosophical examination of war more than 150 years before Clayton Christensen schooled us on the innovator’s dilemma.

Clausewitz reduced military strategy to three levels (the strategic, operational, and tactical) based on the premise that war is a complex business requiring unity of command. At war (and in enterprise settings), there are multiple units operating in dynamic environments, where direct communication may be limited. The levels of war enable units to make decisions by clarifying the links between strategic objectives (vision) and tactical actions. They also help commanders plot operations, allocate resources, assign tasks, and measure success.


Per Clausewitz, strategic guidance includes the long-term plans and policies by which an organization achieves its vision. Strategic guidance must always contain statements regarding the ends (objectives), ways (concepts), and means (resources) by which vision or policy is attained.

In the past, management consultants have described the ends as “what you want to achieve,” and the ways as “what you want to become.” While the ways/concepts will vary widely across companies, the ends can be summarized by a more limited set of strategic objectives including market expansion, market penetration, product development, product diversification, or operational efficiency.

Stipulating means is equally important: the strategic objectives, pursued under the chosen strategic concept, must be achievable with the forces and resources expected to be available.

To clarify this exposition, I’ll present an example from Mattermark, where I previously led the data team.

At Mattermark, our strategic guidance was to “penetrate the financial research market by organizing the world’s business information through software automation.” In this statement, you see evidence of ends (market penetration), ways (organizing the world’s business information), and means (software automation). Notably, we selected software automation as our means because my team of fewer than 10 people could not perform manual data collection at scale.
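For readers who think in code, the ends/ways/means structure can be sketched as a small record with a completeness check. This is only an illustration: the `StrategicGuidance` class and its `missing_parts` method are hypothetical names invented here, not anything Mattermark used.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class StrategicGuidance:
    """Hypothetical container for the three parts of strategic guidance."""
    ends: str   # objectives: what you want to achieve
    ways: str   # concepts: how you intend to achieve it
    means: str  # resources: what you will achieve it with

    def missing_parts(self) -> list[str]:
        """Return the names of any parts left blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# The Mattermark statement from the text, split into its three parts.
mattermark = StrategicGuidance(
    ends="penetrate the financial research market",
    ways="organizing the world's business information",
    means="software automation",
)

# "Adopt machine intelligence" on its own names a means, but no ends or ways.
incomplete = StrategicGuidance(ends="", ways="", means="machine intelligence")

print(mattermark.missing_parts())  # []
print(incomplete.missing_parts())  # ['ends', 'ways']
```

The second record is the situation the F500 executives described: a technology chosen before anyone had stated what it was for.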


The strategic guidance provides a framework for conducting operations. Operational guidance defines the use of resources to achieve strategic goals through the design, organization, integration and conduct of campaigns (e.g. product roadmaps). When designing operational guidance, you must consider conditions, sequence of actions, and resources.

First, you should evaluate the conditions that must exist to achieve the strategic objectives. Next, determine the sequence of actions most likely to produce those conditions. Finally, assess the resources necessary to accomplish this sequence of actions.

At the operational level, you will make broad decisions about data collection and analysis methodologies. For example, you might choose to use a rule-based approach instead of machine learning. These decisions will determine the personnel and IT infrastructure required to support a project.

Let’s return to the Mattermark example.

What conditions had to exist for us to penetrate the financial research market by organizing the world’s business information? We had to offer the most timely, accurate, and comprehensive data on startup funding. Next, we considered the sequence of actions to get the most timely, accurate, and comprehensive data on startup funding. We decided to use human analysts to review scraped news articles first, and then develop an NLP-based system that mimicked the human analyst workflow. We would not replace the human system with the NLP system until it could exceed human performance. Finally, we agreed that we needed two data analysts, two machine learning developers, and one full-stack engineer to execute this “campaign.”
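The rule of not replacing the human analysts until the NLP system outperformed them can be written as a simple gate. The sketch below is a minimal illustration under stated assumptions: the `Scores` class, the `nlp_ready_to_replace` function, the choice to require a win on every metric, and all of the numbers are hypothetical and are not Mattermark's actual metrics or process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scores:
    """Hypothetical quality scores in [0, 1] for one data collection system."""
    timeliness: float
    accuracy: float
    coverage: float


def nlp_ready_to_replace(human: Scores, nlp: Scores) -> bool:
    """Return True only if the NLP system beats the human baseline on every metric."""
    return (
        nlp.timeliness > human.timeliness
        and nlp.accuracy > human.accuracy
        and nlp.coverage > human.coverage
    )


# Illustrative numbers only: the NLP system is faster and broader but less accurate.
human_baseline = Scores(timeliness=0.80, accuracy=0.95, coverage=0.70)
nlp_candidate = Scores(timeliness=0.99, accuracy=0.93, coverage=0.90)

print(nlp_ready_to_replace(human_baseline, nlp_candidate))  # False
```

The three metrics mirror the operational condition (timely, accurate, comprehensive), so the gate ties a tactical decision about shipping a model back to the condition it is meant to produce.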


The last level described by Clausewitz is the tactical level, which is concerned with the planning and conduct of battle. At the tactical level, you make daily decisions about action items and task management. Tactical success is measured by the contribution of an action to the achievement of operationally significant results.

I cannot summarize Mattermark’s tactics within one paragraph of a blog post, so I’ll direct you here. Notably, as we proceeded through design and implementation, we constantly asked ourselves, “Will this help us achieve the most timely, accurate, and comprehensive data on startup funding?” If the answer was no, we reevaluated our course.

So, in a few words, what is data strategy? Data strategy is the means, ways, and ends by which you will achieve your corporate vision; it’s the set of conditions that must exist to bring about your strategic objectives and the sequence of actions and collection of resources that will bring about those conditions; it’s the tactical decisions that drive both operational and strategic success. Clausewitz famously proffered:

“The political object is the goal, war is the means of reaching it and the means can never be considered in isolation from their purposes.”

Both vendors and customers must stop using machine intelligence in isolation from its purpose.

