The missing piece of the data economy puzzle

Lancelot Salavert
Published in Scalia
Nov 2, 2018 · 5 min read


Data is often described as the oil of the 21st century. The advent of the digital age has enabled data to help save lives, educate our children and push us towards more rational decision-making. Despite some isolated incidents such as the Cambridge Analytica scandal, in the vast majority of cases data helps us shape a fairer and smarter society.

In healthcare, for example, data is being leveraged to prevent epidemics, test new drugs or perfect existing treatments. Governments use data to decide where to open the next school or to map out new bus routes. In the fashion industry, sales data helps predict consumer trends and reduce the number of unsold clothes. All in all, from fighting climate change to empowering the entertainment industry, exploiting data is set to positively impact every aspect of our modern society.

As one would expect, this new data economy is also creating major business opportunities. In 2016, the European Commission estimated that the European data economy represented €300 billion and that this figure was set to more than double by 2020, reaching €739 billion. On a global scale, McKinsey argues that rethinking data usage would unlock $3 to 5 trillion of untapped value. That would translate into a 2 to 5% increase in global GDP!

The aforementioned examples show the potential, role and added value of data in our society. Hence, every impediment slowing it down should be addressed. But what exactly is holding the data economy back from its next big upturn?

In our view, the primary enabler would be making data easily exploitable by any IT system. Data standardisation is the #1 problem of the data economy, and yet nobody is talking about it.

As we define it within Scalia, standardisation has two aspects:

  • A technical one: how is data encoded? In which format? And, within this format, how is it structured?
  • A semantic one: which vocabulary should be used? (Both aspects are illustrated in the sketch below.)
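
To make these two aspects concrete, here is a minimal sketch in Python, with made-up supplier feeds, field names and mappings, of how the same t-shirt can arrive in two technically and semantically different shapes, and what converging on a shared standard involves:

```python
# Supplier A: CSV-style row, colour in French, size spelled out (hypothetical feed)
supplier_a = "tee-shirt;bleu marine;Moyen;19,90 EUR"

# Supplier B: JSON-style dict, colour as a hex code, size as a letter (hypothetical feed)
supplier_b = {"type": "T-Shirt", "color": "#000080", "size": "M", "price_eur": 19.90}

# A shared standard has to settle both questions at once:
# - technical: one encoding and structure (here, a dict with fixed keys)
# - semantic: one controlled vocabulary (here, "navy", "M", prices as floats)
STANDARD_COLOURS = {"bleu marine": "navy", "#000080": "navy"}

def normalise_a(row: str) -> dict:
    """Map supplier A's ad-hoc CSV row onto the shared standard."""
    name, colour, size, price = row.split(";")
    return {
        "type": "t-shirt",
        "colour": STANDARD_COLOURS.get(colour, colour),
        "size": {"Moyen": "M"}.get(size, size),
        "price_eur": float(price.replace(" EUR", "").replace(",", ".")),
    }

print(normalise_a(supplier_a))
# {'type': 't-shirt', 'colour': 'navy', 'size': 'M', 'price_eur': 19.9}
```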

Having a way to exchange data rapidly and efficiently from one system to another, or from one organisation to another, is crucial. The more systems and users can exploit data, the more it can be leveraged. And yet, most sectors and industries haven’t agreed on a standardisation protocol because (i) it’s complicated and (ii) it requires people to see the bigger picture.

This weak liquidity keeps the exploitation of data marginal. It turns out that sharing data is so inefficient, costly and legally complex that most entities have decided not to exploit this source of revenue.

History shows that the best way to address a problem is to find an economic benefit in solving it. But which business model should one go for in order to solve this issue efficiently? Below, we review three potential business models and analyze whether they are appropriate.

Searching for a business model

1. Marketplaces

In a ‘many-to-many’ setting where there are multiple actors on both the demand and supply sides, a common tech solution would be to set up marketplaces organised around overarching data typologies. Data owners and consumers can then trade with one another, subject to market forces. As such, we could envision a virtual marketplace for the price history of a given product, insurance statistics or investment data. An obvious prerequisite for this market to work would be to impose a data standard on all its players.

Unfortunately, these approaches have been unsuccessful. Unlike material goods, data can be copied an infinite number of times at negligible cost, which hinders the law of supply and demand. The shutdown of the Microsoft Azure DataMarket in March 2017 epitomises the clear need for a new economic approach to intangible goods such as data.

2. Data aggregation

Another approach is to offer a data consolidation service. This works well in industries that make poor use of data or have a long history of offline data sources. Regrouping and combining this data generates important benefits for the sector. For example, the art market was very opaque until ArtPrice started compiling art sales data in 1987. It now stands as the largest online database in the industry, bringing more transparency to the market. Similarly, the housing market suffers from liquidity constraints in part because information is not readily accessible. By progressively digitising land registries and sorting data historically and geographically, websites such as Zoopla exchange structured information for personal data (namely user contact details and cookies), which they then sell to relevant actors in the housing market.

In the aforementioned examples, the main players simply resell public data which is already accessible to anyone who has the patience to find it. As such, their added value does not lie in the data they provide but in the way they consolidate and structure public information. By becoming market leaders, such services become the de facto standard towards which everyone converges.

Even if this business model has proven applicable, is it the most suitable one for every sector?

3. Data customization

A third option has appeared more recently. Some sectors don’t need consolidation as a service. Instead, they are hungry for customization. They certainly need to share data around efficiently, but they also need to display it in a unique way to remain consistent with their branding. This is especially true for B2C sectors, where content marketing and SEO have become key components of success.

Any business that needs to display products online has come to realize that collecting content and displaying it in a unique yet standardized way is absolutely critical for ranking and search purposes.
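
As an illustration, search engines already read one such standard for product pages: schema.org markup embedded as JSON-LD. The short Python sketch below, using a made-up product, shows the kind of standardized structure a crawler expects, while the visible page remains free to follow the brand’s own design:

```python
import json

# Hypothetical product values; the structure follows schema.org's "Product" type.
product_jsonld = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Navy cotton t-shirt",
    "sku": "TS-NAVY-M",
    "brand": {"@type": "Brand", "name": "ExampleBrand"},
    "offers": {
        "@type": "Offer",
        "price": "19.90",
        "priceCurrency": "EUR",
        "availability": "https://schema.org/InStock",
    },
}

# Embedded in a page inside a <script type="application/ld+json"> tag, this gives
# crawlers a standardized view while the visible page keeps the brand's own look.
print(json.dumps(product_jsonld, indent=2))
```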

It turns out that there are already middlewares dedicated to streamlining raw inputs into structured data, where each entity can set up its own transformation rules. They are called ETLs, which stands for Extract, Transform and Load. Unfortunately, such middlewares are complex (and expensive) to set up and maintain, and they aim for perfect transformation. One way to offset this complexity is to combine deep-tech suggestions with human training: as more organizations use such an ETL, machine learning and data network effects make the suggestions smarter over time.
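
For readers unfamiliar with the pattern, here is a deliberately minimal Python sketch of the three ETL steps, with hypothetical column names and transformation rules; real ETL middleware layers scheduling, validation, monitoring and learned suggestions on top of this skeleton:

```python
import csv
import io
import sqlite3

# Hypothetical raw supplier feed (semicolon-separated, French column names).
RAW_CSV = """product;couleur;prix
tee-shirt;bleu marine;19,90
pull;rouge;39,00
"""

# Transformation rules an entity might configure for its own feeds.
COLOUR_MAP = {"bleu marine": "navy", "rouge": "red"}

def extract(raw: str) -> list[dict]:
    """Extract: parse the raw, semicolon-separated feed into rows."""
    return list(csv.DictReader(io.StringIO(raw), delimiter=";"))

def transform(rows: list[dict]) -> list[dict]:
    """Transform: apply the configured standardisation rules."""
    out = []
    for row in rows:
        out.append({
            "product": row["product"].strip().lower(),
            # Unknown colours are kept as-is; in practice they would be
            # flagged for a human (or a learned suggestion) to resolve.
            "colour": COLOUR_MAP.get(row["couleur"], row["couleur"]),
            "price_eur": float(row["prix"].replace(",", ".")),
        })
    return out

def load(rows: list[dict]) -> None:
    """Load: write the standardized rows into a target store."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE products (product TEXT, colour TEXT, price_eur REAL)")
    con.executemany("INSERT INTO products VALUES (:product, :colour, :price_eur)", rows)
    print(con.execute("SELECT * FROM products").fetchall())

load(transform(extract(RAW_CSV)))
# [('tee-shirt', 'navy', 19.9), ('pull', 'red', 39.0)]
```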

So far, data standardisation has not been popular because it is not a glamorous business. Yet at Scalia, we are convinced it is one of the greatest opportunities for our society and it also represents a huge market.

Unoriginally, it all starts with finding a viable business model, which could differ from one industry to another.

After months of discussions with potential customers, we’ve decided to explore a whole new approach in order to serve industries with poor data standards. By mixing a long-standing yet robust technology (ETL) with an advanced web UX and deep tech, we are convinced that our platform will push some sectors to the forefront of standardisation.

Thank you to Elliot Mitchell for his support, inputs and corrections in writing this piece.
