This ‘20/80 Rule of Big Data’ has huge implications for IoT tech

Image credit: theartofphoto — stock.adobe.com

By Mike Vladimer

Modern product designers face an overlooked challenge in extracting value from big data. One of the classic heuristics in business is the 80/20 Rule, or Pareto Principle, which posits that 20% of the input produces 80% of the output. I argue that when it comes to big data, the Pareto Principle is inverted. Essentially, there is a 20/80 Rule of Big Data: the first 80% of the analysis captures only 20% of the value.

Image credit: Orange Silicon Valley

The Pareto Principle

The idea behind the 80/20 Rule is that the world is uneven, so strategically focusing your efforts makes you more effective. Vilfredo Pareto first articulated this when he noted that 80% of the peas in his garden came from just 20% of the peapods. For Pareto this meant that he should focus the first 20% of his time in the garden (input) on harvesting the fattest peapods, which held 80% of the peas (output). After that, Pareto stopped working because it would have resulted in very little benefit: 80% more work for just 20% more peas.

Image credit: Orange Silicon Valley

The 80/20 Rule is valuable because it applies so widely. In business, companies often find that 80% of their revenue comes from just 20% of their customers. In economics, roughly 80% of the wealth is owned by 20% of the population. And so on. If you’re trying to achieve an outcome, like increasing revenue for your business, you should focus on the top 20% of your customers. Serving the remaining 80% would be a lot of work for little benefit. Although the 80/20 Rule applies in many cases, it breaks down when applied to big data.
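The customer-revenue version of the rule is easy to check against your own numbers. The Python sketch below is only an illustration: the function name `top_share` and the revenue figures are made up for this example, not taken from any real business.

```python
def top_share(values: list[float], top_fraction: float = 0.2) -> float:
    """Fraction of the total contributed by the largest `top_fraction` of items."""
    if not values:
        raise ValueError("values must not be empty")
    ranked = sorted(values, reverse=True)
    # Always count at least one item, even for very short lists.
    k = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:k]) / sum(ranked)


# Hypothetical revenue per customer: two large accounts and eight small ones.
revenue = [400, 400, 25, 25, 25, 25, 25, 25, 25, 25]
print(top_share(revenue))  # share of revenue from the top 20% of customers
```

If the result is close to 0.8, your customer base follows the classic Pareto pattern.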

Image credit: Orange Silicon Valley

Big data is causing a paradigm shift

It’s the beginning of 2018, and we’re in a paradigm shift around data: People and machines are now generating more data than ever before. Until recently, data was never captured at scale. For instance, it used to be that your friend would write you a letter, and you’d read it and throw it away. Now, machines capture data like that all the time. When you send an email, Google stores that message on its Gmail servers, along with metadata such as when you received it. Part of the idea behind the internet of things (IoT) is that this same phenomenon is happening with devices. Now, every time you turn on the lights or take your temperature, there’s an IoT product ready to record that data.

Consequently, 90% of the world’s data was recorded in the last two years, according to IBM. This means that all of the data recorded before then, from the ancient Egyptian hieroglyphics to Gutenberg’s first printed Bible to the “Happy New Year!” text message you sent right before midnight on Dec. 31, 2015, amounts to less than the data created in just the years 2016 and 2017. This change in the world explains why we often feel overwhelmed with information, most of which has little value. The 20/80 Rule of Big Data offers a framework to gain more control of our data-saturated world.

Image credit: Orange Silicon Valley

The problem with big data is complexity, so the solution is simplification. IoT and other big data products often overwhelm their consumers with information. Take an IoT thermometer as an example. The raw data for that IoT thermometer looks like this:

{time: "11:00am", temperature: 65.0},
{time: "11:01am", temperature: 65.2},

There are about 10,000 minutes in a week, so you can see how the amount of raw data quickly balloons. This creates a challenge for product designers to give their consumers a simple, useful takeaway.
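To make the scale concrete, here is a rough Python sketch of how many readings such a thermometer produces. It assumes one reading per minute, as in the sample above; the function name `readings_per_period` is illustrative and not part of any real product.

```python
MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 24


def readings_per_period(days: int, readings_per_minute: int = 1) -> int:
    """Count raw readings over `days`, assuming a fixed sampling rate."""
    return days * HOURS_PER_DAY * MINUTES_PER_HOUR * readings_per_minute


print(readings_per_period(7))    # one week: 10080
print(readings_per_period(365))  # one year: 525600
```

That is over half a million data points a year from a single sensor, none of which is a takeaway on its own.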

Personally, I’ve seen too many IoT devices where the “takeaway” is just a time-series graph that essentially says to consumers “Here’s some of the data analysis; now you figure out the rest.” That stinks. The first 80% of the data analysis does very little for the consumer: It captures just the first 20% of the value. That’s because consumers buy products to solve problems. When a product gives a consumer a “To do” list, rather than a complete solution, it doesn’t really solve a problem. By contrast, I’ve seen a few products where the data analysis yields a meaningful, actionable takeaway that completely solves the consumer’s problem. That’s how the last 20% of the data analysis captures the remaining 80% of the value.

Let’s look at how the 20/80 Rule of Big Data applies to thermometers. First, consider a conventional thermometer (not IoT) that tells a consumer that it’s 62°F outside right now. It doesn’t capture any data, so the 20/80 Rule doesn’t apply. Next, consider a typical IoT-ified thermometer that continuously records temperature and stores that data in the cloud. That typical IoT thermometer presents the consumer with the current temperature along with a time-series graph showing temperature over the last few days. This is a poor product because it makes the consumer do the work of solving their own problem, so it captures very little value. By contrast, an IoT thermometer that abides by the 20/80 Rule will distill all of the data it has into a concise, meaningful insight: “Yes, you need to wear a jacket today.”
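As a minimal sketch of that last step, the Python below turns a list of readings into a single yes-or-no answer. Everything here is assumed for illustration: the `Reading` type, the 65°F threshold, and the rule of judging by the coldest reading are hypothetical choices, not how any particular product works.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Reading:
    time: str           # e.g. "11:00am"
    temperature: float  # degrees Fahrenheit


# Hypothetical cutoff: below this, suggest a jacket.
JACKET_THRESHOLD_F = 65.0


def jacket_advice(readings: list[Reading],
                  threshold: float = JACKET_THRESHOLD_F) -> str:
    """Distill raw readings into one actionable sentence."""
    if not readings:
        return "Not enough data yet."
    # Judge the day by its coldest reading (an assumed, simplistic rule).
    coldest = min(r.temperature for r in readings)
    if coldest < threshold:
        return "Yes, you need to wear a jacket today."
    return "No jacket needed today."


print(jacket_advice([Reading("11:00am", 62.0), Reading("11:01am", 62.3)]))
```

A real product would weigh forecasts, wind, and rain as well. The point is that the consumer sees one sentence instead of a graph.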

The key to the 20/80 Rule is identifying and solving the consumer’s real problem. When a consumer asks, “What’s the temperature outside?” product designers need to recognize the consumer’s real underlying need. The consumer asks about the temperature because that’s the only question conventional thermometers have ever answered. An IoT thermometer that provides lots of temperature data doesn’t solve the underlying problem. A great product designer recognizes that even though a consumer asks about the temperature, their real concern is “Should I wear a jacket today?”

As an engineer, I can relate to the desire to show all the data to consumers. The idea of only saying “Wear a jacket” feels overly simplistic. But most consumers don’t care about the detailed data, so showing them all the data adds complexity and detracts from the value. Products are about satisfying consumers.

Of course, there are some risks with the 20/80 Rule of Big Data. First, it’s crucial that the product designer identifies the correct underlying problem. If the consumer’s real question is “Is today a good day for the beach?” then presenting a concise answer about wearing a jacket is annoying, not helpful. Product designers can mitigate this issue by carefully studying their consumers and clearly articulating their product’s value proposition. Another challenge is personalization. For some consumers, 62°F is jacket weather, whereas other consumers will wear shorts and a T-shirt. Product designers can mitigate this by building subtlety and feedback into the IoT product: It can soften its advice to “You should probably bring a jacket today” and follow up by asking “Did you wear your jacket yesterday?”
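One way to picture that feedback loop is a per-consumer threshold that shifts a little each time the answer disagrees with the advice. The Python sketch below is an assumption-laden illustration: the class name `JacketPreference`, the 65°F starting point, and the one-degree step are all hypothetical.

```python
class JacketPreference:
    """Track one consumer's jacket threshold and nudge it with feedback."""

    def __init__(self, threshold_f: float = 65.0, step_f: float = 1.0):
        self.threshold_f = threshold_f  # hypothetical starting point
        self.step_f = step_f            # how far one answer moves the threshold

    def advise(self, forecast_low_f: float) -> str:
        if forecast_low_f < self.threshold_f:
            return "You should probably bring a jacket today."
        return "You probably don't need a jacket today."

    def record_feedback(self, yesterday_low_f: float, wore_jacket: bool) -> None:
        """Apply the answer to 'Did you wear your jacket yesterday?'"""
        advised = yesterday_low_f < self.threshold_f
        if advised and not wore_jacket:
            self.threshold_f -= self.step_f  # this consumer runs warm
        elif not advised and wore_jacket:
            self.threshold_f += self.step_f  # this consumer runs cold


prefs = JacketPreference()
print(prefs.advise(62.0))
prefs.record_feedback(62.0, wore_jacket=False)
print(prefs.threshold_f)
```

After a few days of "no" answers at 62°F, this sketch stops recommending a jacket at that temperature, which is the kind of personalization described above.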

Big data products have the unique ability to fully solve problems, and the 20/80 Rule makes sure they live up to their potential. Conventional products provide incomplete solutions because they are limited by what they can measure and how they use data. A conventional thermometer only measures the temperature right now; it can’t store previous measurements or consider additional factors like weather patterns or personal preferences. A big data “clothing assistant” considers all of those factors. Underlying that big data product are new tools, such as artificial intelligence and machine learning, which allow us to answer even deeper questions. Our job as product designers is to identify the right questions and use our tools to provide compelling, complete solutions.

The 80/20 Rule and the 20/80 Rule both show how to allocate effort. In classic, conventional systems, the 80/20 Rule says that you should probably stop working after you’ve put in the first 20% of your effort. With big data, the 20/80 Rule says the opposite: The first 80% of your effort yields only a little value, so keep going to 100%. I hope that I’ve made the case that when it comes to big data, you should carry your data analysis all the way through to a simple, actionable takeaway that delights your customers.

Disclaimer: The views and opinions expressed in this article belong to the author and do not necessarily reflect the position or views of Orange or Orange Silicon Valley.