There’s a new commodity in this world that could be more valuable than crude oil soon. It’s data.
Data will be the fuel that powers the next economic revolution, the one driven by artificial intelligence. Just as coal and oil powered the engines of the industrial revolution, data will fuel the machine learning algorithms that power the AI revolution.
Data will keep on growing in our era, as sure as ‘death and taxes’. Some of it already exists, just as oil has been sitting under our feet for centuries. The rest is waiting to be grown and harvested, like wheat on a field or pork belly on a farm.
The internet gave everyone the ability to create digital content. With more and more individuals having access to smartphones and broadband networks, we are creating content that needs ever more storage space.
Not only do we have more netizens creating larger content, we also have less of it being deleted. Look around you. How many people delete their email accounts and posts from social media, even if they, well… pass away?
I have friends who keep their late spouses’ Facebook accounts active as a sort of virtual memorial. On each death anniversary, family and friends post notes and pictures in remembrance.
From organic dead matter came crude oil. From digital waste we are going to get…raw data!
Human beings will consciously and unconsciously grow, farm and mine this digital commodity for as long as civilization continues. Unless the world gets destroyed, I can’t see how the amount of data will shrink instead of grow.
Server farms will become a sort of digital agriculture — to farm and grow the next most valuable commodity: DATA.
Data has been around a long time, so why is it becoming a new and important commodity and potential source of wealth only now? Well, oil had been sitting under our feet for the whole of civilization too, but its true usefulness didn’t surface until we invented the internal combustion engine.
AI is driven by big data. In the past, data was analyzed only to help existing businesses optimize their products or operations. If you didn’t own the business, the data created by it would be of no particular monetary value to you. At the very most, it could be valuable to a third-party consulting or research business that sells reports or surveys.
But with the AI era dawning, new applications, products and solutions are being built on AI. Simply put, in the past data was a by-product of business or academic activities. Now, data is the foundation of AI-driven startups and businesses!
Even when new AI methods that rely less on big data to build the initial models become feasible, data will still have to be collected, both as input to the models and to validate their outputs.
So if data is now a digital commodity, how do we price it?
The price of any commodity is based on demand and supply. The stable price point will be determined by where the perceived value by buyers meets the incentive required by producers.
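For readers who like to see the textbook version in numbers, here is a minimal sketch of that meeting point. It assumes straight-line demand and supply curves, which is a simplification of my own; the function name and all the figures are hypothetical and only illustrate the idea.

```python
def equilibrium(demand_intercept, demand_slope, supply_intercept, supply_slope):
    """Return (quantity, price) where two straight-line curves cross.

    Assumed (hypothetical) curves:
      buyers:    price = demand_intercept - demand_slope * quantity
      producers: price = supply_intercept + supply_slope * quantity
    """
    if demand_slope + supply_slope <= 0:
        raise ValueError("the curves never cross with these slopes")
    quantity = (demand_intercept - supply_intercept) / (demand_slope + supply_slope)
    if quantity < 0:
        raise ValueError("no trade: producers need more than buyers will pay")
    price = demand_intercept - demand_slope * quantity
    return quantity, price


# Made-up numbers: buyers value the first unit at 10, producers need at least 2.
quantity, price = equilibrium(10.0, 0.5, 2.0, 0.5)
print(quantity, price)  # 8.0 6.0
```

The arithmetic is the easy part. The hard part is knowing what the buyers' curve looks like in the first place.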
But this is where science gives way to art. Value is often a matter of perception. And because data is a new sort of commodity that is only just beginning to become a ‘raw material’ for a new industry, the amount of profit that AI companies can produce out of ‘input data’ is still uncertain.
This situation is similar to the dot-com era and the more recent social media boom, where new technology created new businesses but no clear business models yet.
During the early days of the dot-com boom, venture investors and even the stock market priced startups based on the number of ‘eyeballs’, or unique visitors, that their websites had. With social media and mobile apps, that eventually stabilized into more realistic models based on Monthly Active Users (MAU) or Average Revenue per User (ARPU). This was because online advertising and e-commerce had become established, and there were precedent companies and statistics from which to forecast the revenue that each active user was likely to generate for the business.
With AI driven startups creating new products and services for both business as well as consumer use cases, it is hard to tell how much each dataset could be worth in all the different business models. But the base unit used for valuation could be ‘per usable data point’, just like unique visitors or active users.
In time to come it will all settle back down to classical methods that price companies based on a multiple of their revenue or profit figures, but for now your guess is as good as mine as to how much a set of data could be worth to a company that needs it to build a business on.
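To make the ‘per usable data point’ idea a little more concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: I treat a record as ‘usable’ if it has no empty fields and is not a duplicate, and the price per data point is a made-up number, not a market rate.

```python
def count_usable(records):
    """Count unique records that have no empty fields.

    This definition of 'usable' is an assumption; a real buyer would
    apply its own quality checks.
    """
    usable = set()
    for record in records:
        if all(field.strip() for field in record):
            usable.add(record)
    return len(usable)


def estimate_value(records, price_per_point):
    """Value a dataset as usable data points times a hypothetical unit price."""
    return count_usable(records) * price_per_point


# Hypothetical sentence pairs, written as (source, translation) tuples.
pairs = [
    ("hello", "halo"),
    ("thank you", "terima kasih"),
    ("thank you", "terima kasih"),  # duplicate, counted once
    ("good morning", ""),           # missing translation, not usable
]
print(count_usable(pairs))          # 2
print(estimate_value(pairs, 0.25))  # 0.5
```

This mirrors how users multiplied by revenue per user became a shorthand for valuing social media companies. The unit price is the part nobody can pin down yet.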
Ownership & Control
Ownership and control are another aspect in which data could be similar to physical commodities, whether ones that already exist, like oil, or crops that have to be grown, like wheat.
Assume a pool of data already exists, like underground oil or earth metals. There is the legal ownership of the land or sea it sits in, and there are the rights to mine and sell the commodity itself. Examples of useful data that already exists are stock prices at an exchange or translated text (called a translation corpus) held by translation companies. The ownership rights to these digital assets can be sold, or exploitation rights can be assigned for ‘mining and selling’.
For data that needs to be ‘grown’ like wheat or pork belly (two heavily traded physical commodities), there could be a separation of ownership between the land it is grown on and the beneficiary of the resulting commodity. For example, a supermarket or retail mall could end up collecting a lot of useful consumer data as part of its business operations and IT systems, but it may not necessarily want to monetize that as a separate business. An AI-driven advertising company, however, would want that data as input to predict what consumers buy and when. The advertising company could then set up a deal to help ‘grow and collect’ that body of data as it becomes available and use it. In return, it would pay the property owner a ‘rent’ or ‘royalty’ for the use of the territory and the digital resource it generates.
Like physical commodities, digital commodities will have to be protected from unauthorized access, viruses and theft through cyber security and encryption measures such as firewalls, anti-virus and blockchain.
So what benefits will data-driven AI bring to mankind? The most obvious ones are convenience and productivity, in both industrial processes and consumers’ lives.
To dig a little deeper, the creation and ever growing amount of data has created its own problem and solution. How so?
The information age has caused information overload and addiction to the internet and social media. AI can help us filter information down to what is most relevant or interesting to us.
AI can also reduce the amount of information processing and decision making we have to do in both our work and our personal lives. Of course, this is a double-edged sword. Anyone performing a routine or repetitive function is likely to find themselves out of a job soon, if they haven’t already been replaced by a machine.
But more on the downsides to mankind later. Let’s continue with the good things first.
Within the tech industry, there are some obvious giants that will benefit from this rising trend in data creation, storage and analysis:
- the companies that sell servers like IBM, HP and Intel
- the companies that provide cloud computing like Microsoft and Amazon
- the companies that sell enterprise data storage like Dell and HPE
- and the folks that sell the network gear that powers the connectivity to move data around, like Cisco and Huawei.
Beneath these big, publicly listed companies are smaller suppliers that manufacture the parts and accessories needed for building servers, cables, routers and so on. I shall leave those lists to the professional investment analysts out there.
One potential private equity or venture capital opportunity lies in companies that have accumulated a lot of useful proprietary data over the years but did not have any real use for it before. These are the ones sitting unaware on ‘oil fields’.
I mentioned translation corpora earlier. They are a good example. The translation industry is one of the most stable and steadily growing industries around, thanks to economic globalization. It is boring to most and overlooked. But in recent years private equity has become increasingly active in it. The fragmented nature of the space, and the potential for machine translation and AI to increase productivity by leaps, are very attractive to these investors.
One particular multi-million dollar Chinese startup, UTH International, has taken on the established big companies by buying up translation corpora to build a business out of data. They claim to have 15 billion+ translation sentence pairs. But when their VP of Business Development was in town a few months ago and had coffee with me, I was surprised to learn that even they had not thought of purchasing data from the smaller Southeast Asian translation firms that had, over the years, accumulated translation corpora for minority languages not covered by the tech giants.
There are always two sides to the coin. Privacy will become the biggest legal issue of our times with big data and AI.
Different countries have different approaches to this issue. In the past, it was generally said that if there was one advantage China had in tech, it was that the country’s citizens were generally more willing to part with some personal information in exchange for convenience and free services like mobile apps. Even that is changing now. One landmark survey shows that 76.3% of Chinese citizens now “see certain forms of AI as a threat to their privacy”.
The other problem that data collection and monetization indirectly creates is much more subtle and long term. Just as fossil fuels and industrial agriculture polluted land, air and sea, data is polluting the airwaves. With much of digital communication now going wireless, electromagnetic radiation is everywhere. How many repercussions it may have for human health in the long run is a huge technical topic in itself. According to an article by the National Center for Health Research in the US, the research done so far points both ways and is inconclusive. It may take another 10–20 years before the long-term effects can be ascertained.
My own unscientific gut feel points towards the negative. Those of you who, like me, get a one-sided headache after holding a mobile phone next to your ear for a while will probably understand.
In any case, with both industrial and consumer Internet of Things (IoT) creating ever more device connections via Wi-Fi, Bluetooth, cellular networks and Low Power Wide Area Networks (LPWAN), we can safely say that modern civilization is soaked in invisible waves. As we grow, accumulate and utilize more and more data, connectivity and bandwidth will grow with it. I hope the scientists claiming these waves are safe are right… Did they say the same thing about pesticides and steroids when industrial agriculture and farming started?
But for now, crude oil is getting harder to extract and there’s no more panning for gold in San Francisco. In our lifetime, data is the precious commodity that could be sitting right under your feet (or on your hard disk, that is…)
Writing is my lifelong passion and my way of organizing thoughts. I also hope to create meaningful discourse in society based on reliable information. So feel free to leave me a response and contribute your knowledge and opinion. I will try my best to reply. However, if it is a business connection you seek, please reach out on LinkedIn.