Strategic Data Acquisition

Tobias Bohnhoff
Published in shipzero
7 min read · May 17, 2019

Despite the immense economic success of Google, Facebook or Amazon and the hype surrounding AI and machine learning, the topic of strategic data acquisition is usually still underestimated. In this post we provide you with a structured framework on

  • what kind of data acquisition strategies exist,
  • how data can gain exclusivity and value,
  • which methods can be used to estimate data costs, uniqueness and value

in order to find new and clever ways in the constantly intensifying competition for data.

The value of data

Data has value. Now that algorithmic models automatically control processes and recommend how to optimize customer acquisition or customer value, data has become a central value driver and thus a relevant asset. Behavioural data about consumers, in areas such as search or movement, is of particular interest. The pricing of data, however, remains an unsolved problem. This leads to overall market instability and allows extreme monopolization or oligopolization, which is often discussed publicly with reference to GAFA (Google, Amazon, Facebook, Apple).

Adam Smith’s “invisible hand” of the market does not intervene if the market does not recognize that a price is unfair in the first place. This is a methodological failure rather than a reproach to the data monopolies. In addition, regulation in this area is extremely complex, since the framework conditions change almost daily as a result of technological innovation.

How is the fair value of data to be determined? Three basic approaches can be found, but none of them is mature or offers an optimal solution for all applications.

1) Cost-based approach: The value of a data set is determined by the costs necessary for its creation. This approach is strongly limited, because on the one hand it does not reflect the actual value resulting from the analysis and processing of data, and on the other hand it primarily prices infrastructure costs (storage, processing and costs of data measurement) that are neither stable in themselves nor comparable among different data types.

2) Market-based approach: The second approach looks for benchmark figures in the market. Here, too, the use is severely limited. Very few data offerings are comparable one-to-one, and price transparency in a global data market is even rarer. The approach may gain importance in the future, but is currently hardly practicable.

3) Income-based approach: The data value is a projection of the future cash flow derived from the data asset. This approach is suitable for niche applications that already have a proven business model. However, it is by no means a general solution for all data sets that occur in the scientific and business world. In addition, projections of innovative business models are fraught with high uncertainty.
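To make the income-based approach more concrete, the following is a minimal sketch that treats the data value as the present value of projected cash flows. Discounting the cash flows is an assumption added here (a standard discounted-cash-flow calculation, not something the approach above prescribes), and the function name, the figures and the discount rates are purely illustrative.

```python
def income_based_value(cash_flows, discount_rate):
    """Present value of projected yearly cash flows attributed to a data asset.

    cash_flows[0] is assumed to arrive at the end of year 1, and so on.
    """
    return sum(
        cash_flow / (1 + discount_rate) ** year
        for year, cash_flow in enumerate(cash_flows, start=1)
    )


# Hypothetical projection: three years of cash flow from one data asset.
projected = [100_000, 120_000, 140_000]

# A proven business model might justify a moderate discount rate ...
value_proven = income_based_value(projected, 0.10)

# ... while an innovative, uncertain model calls for a much higher one,
# which shrinks the estimated data value considerably.
value_uncertain = income_based_value(projected, 0.25)

print(round(value_proven), round(value_uncertain))
```

The gap between the two results illustrates why the approach works best for proven business models: the more uncertain the projection, the less the estimate says about the data itself.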

What is the conclusion to be drawn from this situation? Exact pricing of one’s own data assets, like determining a fair price for acquiring data, is almost impossible. It is therefore important to closely observe developments and concepts in this area and, above all, to create an indicative framework for one’s own strategy, one that at least provides reference points for future decisions through approximate values and comparisons.

The “Flywheel” as gold standard for data-driven business models

Amazon’s marketplace strategy, more precisely the self-reinforcing effect of supply and demand, is difficult to initialize but extremely stable once it has gained traction. It is often referred to as the Flywheel and is considered the ultimate model for platform-related business models.

[Figure: Amazon’s Flywheel Platform Model]

The unfair advantage arises from a balanced trade of the participants with the platform in an ongoing scaling process. From an initial added value, the core driver “growth” ensures that both customers and suppliers are willing to continuously provide more “assets” (including their data) to the platform, as this gives them even more value in return. These effects reinforce each other and ensure a high level of “lock-in”.

Data-based business models, regardless of whether they are based on machine learning, analytics or pure data processing, usually have a basic structure of:

Input -> “Black Box” -> Output

Input is the acquired data (more on how it can be acquired later). In the “black box”, the magic happens: information is distilled from the input data, and this information has business value as output. The value could come, for example, from greater confidence in decisions, transparency, automation or attribution, which ultimately lets a beneficiary generate additional value.

This extreme simplification is intended to show that it makes a lot of sense, in the very early stages of a data project, to think about what the specific data input is and what goal is later to be achieved with the output. The magic in the “black box” is indeed a very exciting topic, but it is hierarchically subordinate to the goal definition and the strategic data acquisition.

Accordingly, a profitable business model relies on an arbitrage approach: the input is generated or purchased for less than the output can be monetized for, after deducting the “processing costs” in the black box and the corresponding distribution costs.
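The arbitrage condition can be written down as a simple check. The sketch below is only an illustration: the class name, the field names and all figures are hypothetical, and in practice each cost item would itself be an estimate.

```python
from dataclasses import dataclass


@dataclass
class DataBusinessCase:
    """Hypothetical cost and revenue figures for one data-based business model."""

    input_cost: float         # generating or purchasing the input data
    processing_cost: float    # running the "black box"
    distribution_cost: float  # bringing the output to the customer
    output_revenue: float     # what the output can be monetized for

    def margin(self) -> float:
        # Arbitrage margin: output revenue minus all costs along the chain.
        total_cost = self.input_cost + self.processing_cost + self.distribution_cost
        return self.output_revenue - total_cost

    def is_profitable(self) -> bool:
        return self.margin() > 0


# Illustrative numbers only.
case = DataBusinessCase(
    input_cost=40.0,
    processing_cost=25.0,
    distribution_cost=10.0,
    output_revenue=100.0,
)
print(case.margin(), case.is_profitable())
```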

In order to establish sustainable structures, it is also crucial that it is as difficult as possible to copy the model. The more accessible the source data and the more obvious my “black box recipe”, the shorter the phase in which I can operate the arbitrage model profitably. If one now assumes that in times of open source platforms, global data science communities and enormously high transparency through fast communication channels, the intellectual property no longer lies in the algorithms, the logical consequence is that strategic data acquisition will be the key differentiating factor of the future.


13 dimensions for evaluating strategic data acquisition

If strategic data acquisition is the one success criterion for future business models, what does a practical evaluation of the relevant data look like? Whether startup, corporate or mid-sized company, everyone has to think about the problem to be solved, what the technical implementation can look like, which data is needed and whether this will generate sufficient differentiation and knowledge to create a sustainable, profitable business.

The following morphological matrix is a first structural attempt to approach these questions. The ultimate goal is to develop the approach further into a complete and viable evaluation framework that tests a data-based business model (purely functionally) against the three main criteria of strategic data acquisition:

  • Data acquisition costs
  • Uniqueness of the data
  • Resulting data value

Without suggesting any scientific evidence, all three criteria seem to be positively correlated at first glance. In order to build a comprehensive evaluation framework, 13 dimensions are considered below that can have a positive, negative or neutral influence on the above-mentioned criteria.

The dimensions are divided into six core questions:

1) WHY — Business Challenge
For what purpose should the data be used?

2) HOW — Data Science Challenge, Data Engineering Challenge, Data Collection Method, Data Acquisition Model, Data Access
How do I translate the business challenge into a data science problem, how do I collect suitable data, how do I prepare it accordingly and how can I access it?

3) WHAT — Data Objective, Data Typology, Data Format, Data Scope
What kind of data does my concept structurally build on, what format and what scope does the data set have?

4) WHEN — Data Capturing Frequency
How regularly do I receive new and current data?

5) WHERE — Data Origin
Where did the data originally come from?

6) WHO — Data Uniqueness
Who else has the same/comparable data besides me?
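The six core questions and their dimensions can be captured in a small data structure, which also makes it easy to check that a concept has been described along all 13 dimensions. This is only a sketch: the helper `missing_dimensions` and the example profile with its characteristic values are hypothetical and not part of the matrix itself.

```python
# The 13 dimensions, grouped by core question as listed above.
DIMENSIONS = {
    "WHY": ["Business Challenge"],
    "HOW": [
        "Data Science Challenge",
        "Data Engineering Challenge",
        "Data Collection Method",
        "Data Acquisition Model",
        "Data Access",
    ],
    "WHAT": ["Data Objective", "Data Typology", "Data Format", "Data Scope"],
    "WHEN": ["Data Capturing Frequency"],
    "WHERE": ["Data Origin"],
    "WHO": ["Data Uniqueness"],
}

ALL_DIMENSIONS = [name for group in DIMENSIONS.values() for name in group]


def missing_dimensions(profile):
    """Return the dimensions for which a concept profile has no value yet.

    `profile` maps a dimension name to a list of characteristic values,
    since some dimensions allow several values at the same time.
    """
    return [name for name in ALL_DIMENSIONS if not profile.get(name)]


# Hypothetical, incomplete profile of a data-based business concept.
profile = {
    "Business Challenge": ["customer acquisition"],
    "Data Origin": ["own sensors"],
}

print(len(ALL_DIMENSIONS))
print(missing_dimensions(profile))
```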

For each dimension there are different characteristic values, which ideally should not overlap but in sum cover the complete option space. For some dimensions, however, several characteristic values can be effective at the same time. This form of structuring, which was actually developed as a creativity technique, should ultimately help in the evaluation of data-based business models. Not as a panacea, but as a building block to evaluate the topic of strategic data acquisition in a structured way. The result is no monetary comparative value for data sets or business models. However, a comparability of approaches emerges and different strengths and weaknesses become visible.

The future ambition of this approach is to develop a scoring model behind the characteristic values that feeds into the three core criteria and thereby allows a simplified comparison of concepts. Any combination of characteristics would receive an individual score in the dimensions of acquisition costs, data value and uniqueness.
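Since the scoring model does not exist yet, the following is only a sketch of what its mechanics could look like. The characteristic values, the 1 to 5 scale, the individual scores and the simple summation are all assumptions made for illustration; a real model would need calibrated values and probably weights per dimension.

```python
CRITERIA = ("acquisition_cost", "uniqueness", "value")

# Hypothetical scores (1 = low, 5 = high) per (dimension, characteristic value),
# given in the order of CRITERIA. None of these numbers come from the matrix.
SCORES = {
    ("Data Origin", "public source"): (1, 1, 2),
    ("Data Origin", "own sensors"): (4, 5, 4),
    ("Data Capturing Frequency", "one-off"): (1, 2, 1),
    ("Data Capturing Frequency", "real-time"): (4, 4, 5),
}


def score_concept(profile):
    """Sum the scores of all selected characteristic values per criterion.

    `profile` maps a dimension name to a list of characteristic values,
    because several values can apply to one dimension at the same time.
    """
    totals = dict.fromkeys(CRITERIA, 0)
    for dimension, values in profile.items():
        for value in values:
            for criterion, score in zip(CRITERIA, SCORES[(dimension, value)]):
                totals[criterion] += score
    return totals


# Two hypothetical concepts, described along two dimensions only.
sensor_concept = {
    "Data Origin": ["own sensors"],
    "Data Capturing Frequency": ["real-time"],
}
open_data_concept = {
    "Data Origin": ["public source"],
    "Data Capturing Frequency": ["one-off"],
}

sensor_scores = score_concept(sensor_concept)
open_data_scores = score_concept(open_data_concept)
print(sensor_scores)
print(open_data_scores)
```

In this toy comparison the sensor-based concept is more expensive to acquire but also scores higher on uniqueness and value, which matches the positive correlation between the three criteria suspected earlier.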

Next Steps and Conclusion

The great challenge in applying a predefined dimension space and corresponding characteristic values is completeness. Radically innovative ideas in particular can take on extremely unusual characteristic values. The choice and coordination of dimensions must also undergo further practical testing. As the number of cases in which the model can be tested increases, adjustments can be made over time.

It should be noted that the evaluation of data assets is a complicated matter. At the same time, this challenge will become increasingly important for strategy and business development departments in existing companies as well as for entrepreneurs looking for the “next big thing” in the context of AI and data-based business models or complementary services in this field.

If you have any feedback or would be interested in the further development of this approach, feel free to contact me at www.appanion.com

Tobias Bohnhoff
shipzero

Founder at appanion.com. Technology enthusiast and passionate about trends and innovation in artificial intelligence.