Big Data: Value

Joanna Tan
Published in CISS AL Big Data
7 min read · Jan 24, 2021

First of all, let’s just all agree that COVID-19 is terrible. Horrendous. Deadly. [Insert negative adjectives here]. For some people, it is merely the new norm of having to wear masks and social distance that is irritating. For (a LOT of) others, it means losing their jobs, losing family members, losing their lives.

“As a result, the U.S. unemployment rate shot up from 3.8% in February — among the lowest on record in the post-World War II era — to 13.0% in May. The rise in the number of unemployed workers due to COVID-19 is substantially greater than the increase due to the Great Recession.”

Pew Research Center

https://www.cnbc.com/2020/09/08/why-the-real-unemployment-rate-is-likely-over-11percent.html

Small businesses, especially ones that require social interaction (e.g. restaurants), took the hardest hit, but it is definitely not easy for most other businesses either.

When your ship is sinking, you should use a lifeboat. When your business is dying, you should use big data.

Take advantage of past company data to simulate cash flow and model scenarios in which you can keep cash flow positive. Look at your sales data and market to target customers based on their profiles, purchase history, and preferences. Collect data about your target customers and see how they might need your service during the pandemic.
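To make the cash flow idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names, the opening balance, the revenue and cost figures, and the three scenarios are all made up, and a real model would start from your own historical data.

```python
def simulate_cash_flow(opening_balance, monthly_revenue, monthly_costs,
                       revenue_change, months=12):
    """Return the month-by-month cash balance for one scenario.

    revenue_change is the assumed month-over-month change in revenue,
    e.g. -0.05 for a 5% decline every month.
    """
    balances = []
    balance = opening_balance
    revenue = monthly_revenue
    for _ in range(months):
        balance += revenue - monthly_costs   # net cash flow for this month
        balances.append(round(balance, 2))
        revenue *= 1 + revenue_change        # revenue for the next month
    return balances


def first_negative_month(balances):
    """Return the 1-based month where the balance first drops below zero."""
    for month, balance in enumerate(balances, start=1):
        if balance < 0:
            return month
    return None  # the balance never went negative


# Hypothetical scenarios: name -> assumed monthly change in revenue.
scenarios = {"steady": 0.0, "mild decline": -0.05, "sharp decline": -0.50}

for name, change in scenarios.items():
    balances = simulate_cash_flow(20_000, 10_000, 9_000, change, months=6)
    print(name, "-> first negative month:", first_negative_month(balances))
```

Running several scenarios side by side is the point: it shows roughly how long your cash lasts under each assumption, so you can see which costs or revenue levels you would need to change to stay positive.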

Photo by Carlos Muza on Unsplash

The problem used to be that we didn’t have enough data. That’s not the case anymore. We have MORE than enough data, but your mindset needs to change. If you don’t store customer profiles, purchase histories, and preferences, you can’t use this data. Recognize the value in data NOW, and store it for use LATER.

“…in the age of big data, all data [should] be regarded as valuable, in and of itself.”

But what’s valuable about real-time GPS coordinates? Old search queries? Clicks on a website? Misspelled words?

There is far more value in data than you might think. Why?

1. Fewer limitations on the collection of data

We can now collect data without much effort, or even awareness, thanks to significant advances in technology and data storage. In fact, storage is so cheap that keeping data is usually a better option than discarding it. As a result, much more data is available at a lower cost than ever before.

Take a look at Google (who is surprised?). In 2016, Google partnered with Banco Bilbao Vizcaya Argentaria (BBVA) to develop a web-based data set built from Google search queries. Using the data set, they were able to predict check-ins and overnight stays of travelers in Spain in real time. What if Google had decided to discard its search queries instead of storing them?

If Google did not understand how valuable data is, it could very well have lost a LOT of data. Google didn’t know how search queries could be useful in the future and it didn’t need to; it stored the data anyway. It was cheap to do so too!

2. Data can be reused

The primary purpose of data is often fairly obvious. Website clicks can help optimize website content, listeners’ favorite music can reveal user preferences, weather data can help forecast future weather, and so on. But wait! Think about it. Will the data you use disappear once you are done analyzing it? No (unless you mess with a database that has no backup, which should NOT happen). Data can be processed over and over again, but its secondary (tertiary, quaternary…) purposes are usually harder to spot.

3. Data is a non-rivalrous good

That’s why we have open data. If your using the data prevented another person from using it, it would be a rivalrous good. Since that is not the case, data is non-rivalrous, meaning that it CAN be used by everyone at the same time (it might not be, but it can).

“Data’s true value is like an iceberg floating in the ocean. Only a tiny part of it is visible at first sight, while much of it is hidden beneath the surface. Innovative companies that understand this can extract that hidden value and reap potentially huge benefits.”

Data can be reused for different purposes as well as recombined with other sets of data. However, the reuse of data can sometimes take different forms — hidden forms.

To better understand what I mean there, let’s dive into our daily life — we all know we can’t live without our dear spell checkers. Spell checkers are just like our personal genies: they are (basically) omnipotent. They correct the mistyped words and even know what you originally intended to write.

Seriously…nobody knows how to spell “entrepreneurship.”

Well, Microsoft and Google both developed their own spell checker systems, but Google had the last laugh. Why is that? (We love Microsoft, please don’t attack!!)

Both companies used dictionaries as the established sources for known words, but Google went further.

“As an incidental outcome of people using the search engine everyday… it reused the misspellings that are typed into the search engine among the 3 billion queries it handles and obtained a free best spell checker.”

This anecdote shows that bad, incorrect, and defective data can still be very useful. But Google didn’t stop there; it even applied the same approach to other services like autocomplete and machine translation.
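To get a feel for how misspelled queries can power a spell checker, here is a toy sketch in Python. It is NOT Google’s actual system; it is a simplified, frequency-based approach that assumes the most commonly typed spelling within one edit of a word is the intended one. The query log and the function names are invented for illustration.

```python
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """Return every string one edit (delete, swap, replace, insert) away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    swaps = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + swaps + replaces + inserts)


def suggest(word, query_counts):
    """Suggest the most frequently logged word within one edit of `word`."""
    candidates = [w for w in edits1(word) | {word} if w in query_counts]
    if not candidates:
        return word  # nothing similar in the log, so leave the word alone
    # Highest count wins; ties are broken alphabetically to stay deterministic.
    return max(candidates, key=lambda w: (query_counts[w], w))


# Hypothetical query log: typos are kept, not thrown away.
query_log = ["weather", "weather", "weather", "wether",
             "whether", "whether", "recipe", "recipie", "recipe"]
query_counts = Counter(query_log)

print(suggest("wether", query_counts))
print(suggest("recipie", query_counts))
```

Notice that the typos in the log are not cleaned out first. The "defective" entries are exactly what lets the counts separate common spellings from rare ones.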

How genie-us, Google. (Haha!)

So, does treating old data as rubbish mean losing an opportunity? The answer is yes (duh!). Companies like Yahoo and Infoseek failed to appreciate the value of misspelled search terms and lost the opportunity (try again next time?).

In contrast, Jennifer Hanawalt, a third-year clinical psychology graduate student at Wayne State University, used data originally collected by the NICHD’s Study of Early Child Care to investigate how pregnant women’s feelings and expectations about parenting correlate with their children’s self-regulatory behavior. Limited in time and money, she relied on data that had been collected years earlier and covered years of information. As a result, she was able to conduct novel research.

On the other hand, data can be obtained from a completely different source as well: data exhaust. We now live in an era where the digital trail people leave in the wake of their online interactions can be used to create value, like improving an existing service or developing new ones.

Let’s take Twitter as an example. Somebody tweets “first ice cream on the day of the first snow.” The username and the tweet alone may not provide much insight, but combined with exhaust data in the form of metadata (location, posting date and time, number of followers, device ID), the data can be used in significant ways! For instance, retail stores can use it when stocking up on winter supplies like winter coats, and media streaming companies can use it when streaming holiday movies.
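Here is a small Python sketch of how that metadata might be put to work. The `Tweet` fields, the sample tweets, and the idea of using the earliest "snow" mention per city as a signal are all assumptions made for illustration; this does not use any real Twitter API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Tweet:
    text: str
    city: str        # hypothetical location metadata
    posted_on: date  # hypothetical posting-date metadata


def first_mention_by_city(tweets, keyword):
    """Map each city to the earliest date a tweet there mentions `keyword`."""
    first_seen = {}
    for tweet in tweets:
        if keyword.lower() not in tweet.text.lower():
            continue  # this tweet is not about the keyword
        earliest = first_seen.get(tweet.city)
        if earliest is None or tweet.posted_on < earliest:
            first_seen[tweet.city] = tweet.posted_on
    return first_seen


# Made-up sample data.
tweets = [
    Tweet("first ice cream on the day of the first snow", "Chicago", date(2020, 11, 24)),
    Tweet("Snow again, where are my gloves", "Chicago", date(2020, 12, 2)),
    Tweet("so much snow outside!", "Boston", date(2020, 10, 30)),
    Tweet("sunny beach day", "Miami", date(2020, 11, 1)),
]

for city, day in first_mention_by_city(tweets, "snow").items():
    print(city, day.isoformat())
```

The tweet text only says that it snowed. It is the location and date riding along with the tweet that turn it into something a retailer could plan around.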

Photo by Marten Bjork on Unsplash

On top of this, a lot more data is now open. And no, open data does not mean cleaner data (urgh!). Governments, which can often compel people to provide information but are ineffective at actually using it, have given the private sector and society in general access to experiment with the data.

France is one of the earliest open data adopters: the city of Issy-Les-Moulineaux published its budget in 2011 to increase transparency. By providing rich descriptive context for its budget data, France allows companies to put that data to practical use, like creating forecasts of income and expenditure.

At this point, you should realize that you are living in a world where data has become extremely valuable and accessible; data can be, and is being, exploited in every possible way.

However, measuring the value of a given data set is still challenging. There is no obvious way to value data, but baby steps are underway, like looking at the different strategies data holders apply to extract value. Other ways to price or measure data are being tried as well. DataMarket, which provides access to free datasets from other sources like the U.N., lets anyone sell the data they happen to have in their databases.

Data has tremendous value, and it inevitably has a remarkable influence on our daily lives. Congrats! You’ve now accessed the treasure map and the lifeboat you need to stop your ship from sinking. Explore on, Captain!

Key Takeaways

  • Recognize the value in data now, and store it for use later.
  • Because data can be collected with fewer limitations, can be reused, and is a non-rivalrous good, there is far more value in data than most people think.
  • Treating old data as rubbish = losing the opportunity.
  • Many companies design their systems so that they can harvest and recycle data exhaust to improve an existing service or develop new ones.
  • Open data: extracting the value of government data by giving the private sector and society in general access to experiment with it.

This article is based on the book Big Data: The Essential Guide to Work, Life and Learning in the Age of Insight.

