The Blockchain’s “Bad Data” Problem and Possible Solutions

Randall Mardus
Coinmonks
4 min read · May 15, 2018

Source: https://twitter.com/TheOnion/status/996106661117988865

Have you heard the saying, “Bad data in, bad data out”? Here’s an example: Let’s say you’re a kid who really wants to ride a roller coaster, but you’re only 53 inches tall, not the required 54 inches. You get to the front of the line, stand on your tiptoes, lie, and somehow convince the ticket taker to let you on.

Unfortunately, you’re really not 54 inches tall and, in this case, it really matters, which is what you find out as you rocket through the sky with, as The Onion will later report, “no regrets.”

Bad data in, bad data out.

The edX course, “Blockchain for Business — An Introduction to Hyperledger Technologies”, shows how the blockchain is good for supply chains by following a fish from when it’s caught to when the fish arrives at the appropriate restaurant.

The problem with this example is that, if I’m the restaurant owner, I won’t know whether the fish is the weight and kind described by the people who caught it until it arrives at my restaurant. I have to trust that the data being logged about the fish is accurate. Worse, if the fish that arrives is different from what was logged, how can I change that data on the blockchain, which is immutable? That’s a problem.

This is my fear about the blockchain. Someone mistakenly, or worse, intentionally, misreports a name, a number, or a transaction that goes onto the blockchain, and that bad data becomes permanent because the blockchain is immutable: it doesn’t change and it cannot be changed. Bad data is all it takes for the blockchain to become a liability.

When bad data goes onto a blockchain, the problem is just beginning. That is because there are then several copies of that bad data out in the world, which compounds the problem of knowing what is correct and should be trusted.
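To make the immutability point concrete, here is a minimal sketch of a hash chain in Python. It is a toy model, not how Hyperledger or any particular blockchain is implemented, and the fish record and weights are hypothetical. Each block stores the hash of the block before it, so quietly editing an old record makes the chain fail validation.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, prev_hash):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Append a record; each block commits to the one before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append(
        {
            "index": index,
            "data": data,
            "prev_hash": prev_hash,
            "hash": block_hash(index, data, prev_hash),
        }
    )


def is_valid(chain):
    """Recompute every hash and check every link back to the first block."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i > 0 else GENESIS_HASH
        if block["prev_hash"] != expected_prev:
            return False
        recomputed = block_hash(block["index"], block["data"], block["prev_hash"])
        if block["hash"] != recomputed:
            return False
    return True


chain = []
# Hypothetical bad data: the fish was logged at 40 lb but really weighed 25 lb.
append_block(chain, {"fish": "tuna", "weight_lb": 40})
append_block(chain, {"event": "shipped to restaurant"})

chain_ok_before = is_valid(chain)

# Quietly "fixing" the logged weight breaks validation for anyone who checks.
chain[0]["data"]["weight_lb"] = 25
chain_ok_after = is_valid(chain)

print(chain_ok_before, chain_ok_after)
```

In this sketch the chain validates before the edit and fails after it, which is why a wrong entry cannot simply be corrected in place.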

Possible Solutions

Fortunately, there are two ways to improve the quality of data that goes onto the blockchain. The first comes from Internet of Things (IoT) sensors. The second comes from APIs. Here are examples of how IoT sensors and APIs can improve the data we put onto the blockchain so we can get good data out of the blockchain.

Let’s say a hospital’s neonatal intensive care unit is concerned about the air quality in the rooms where the babies stay. The hospital could have an employee visit the room and test the air throughout the course of the day. Or the hospital could use one of the 85 IoT sensors currently available to test the room’s air however often it wants and record those readings on a blockchain dedicated to a particular hospital room. It could even sound an alarm or send a message to a dedicated staffer if the air quality level in a particular room fell below a pre-determined level. Like these air quality sensors, IoT sensors have the potential to be more accurate, dependable, and unbiased than humans inputting data by hand.
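As a rough sketch of that idea, the Python below records each sensor reading in an append-only log for one room and flags readings that fall below a threshold. Everything named here is an assumption for illustration: the room ID, the 0 to 100 quality score, the threshold, and the readings themselves. A real deployment would read from actual sensor hardware and write to an actual blockchain rather than a Python list.

```python
import hashlib
import json

MIN_AIR_QUALITY = 80.0  # hypothetical threshold on a 0-100 quality score


def record_reading(ledger, room, score, timestamp):
    """Append a reading to the room's ledger; return True if staff should be alerted."""
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    entry = {
        "room": room,
        "score": score,
        "timestamp": timestamp,
        "prev_hash": prev_hash,
    }
    # Hash the entry (including the previous hash) so later edits are detectable.
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode("utf-8")
    ).hexdigest()
    ledger.append(entry)
    return score < MIN_AIR_QUALITY


ledger = []
# Hypothetical readings standing in for what a sensor would report.
readings = [("08:00", 92.5), ("09:00", 88.0), ("10:00", 74.5)]

alerts = []
for timestamp, score in readings:
    if record_reading(ledger, "NICU-3", score, timestamp):
        alerts.append(timestamp)  # stand-in for sounding an alarm or messaging staff

print(alerts)
```

Every reading is logged whether or not it triggers an alert, so the record shows the full history and not only the bad moments.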

As for APIs, let’s say the Arizona Diamondbacks baseball franchise wants to find players who perform best in locations with high temperatures, such as Arizona. The team could fly scouts out to games where the weather forecast is hottest for any given week, but that would cost a lot of money, the weather could change, the player they’re interested in may not even play that day or, worse, the scout might decide not to watch the game because it’s too hot. Instead, the team could build an application that cross-references player statistics (home runs hit, bases stolen, strikeouts by pitchers) from a source such as Baseball-Reference.com with the AccuWeather API to note the temperature, time, and humidity of the place where the players played. And, unlike a scout in the stands who may describe the weather as “really hot”, the AccuWeather API will act as an unbiased third party providing objective data (e.g., 107 degrees Fahrenheit, 87 percent humidity) that the Diamondbacks can trust is reliable and true. All saved to the blockchain.
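Here is a minimal sketch of that cross-referencing step in Python. It does not call Baseball-Reference.com or the AccuWeather API; the game logs, weather observations, player names, and the 100-degree cutoff are all hypothetical stand-ins for what those sources might return. The point is only the join: match each game to the weather at that date and place, then keep the hot ones.

```python
from statistics import mean

# Hypothetical game logs; a real application would pull these from a stats source.
game_logs = [
    {"player": "Player A", "date": "2018-05-01", "city": "Phoenix", "hits": 3},
    {"player": "Player A", "date": "2018-05-02", "city": "Seattle", "hits": 0},
    {"player": "Player B", "date": "2018-05-01", "city": "Phoenix", "hits": 1},
    {"player": "Player B", "date": "2018-05-02", "city": "Seattle", "hits": 2},
    {"player": "Player A", "date": "2018-05-03", "city": "Phoenix", "hits": 2},
]

# Hypothetical weather observations keyed by (date, city), standing in for API responses.
weather = {
    ("2018-05-01", "Phoenix"): {"temp_f": 107, "humidity_pct": 12},
    ("2018-05-02", "Seattle"): {"temp_f": 61, "humidity_pct": 70},
    ("2018-05-03", "Phoenix"): {"temp_f": 102, "humidity_pct": 10},
}

HOT_GAME_TEMP_F = 100  # assumed cutoff for what counts as a hot game


def hot_weather_averages(logs, observations, cutoff_f):
    """Average hits per player, counting only games at or above the cutoff temperature."""
    hits_by_player = {}
    for game in logs:
        conditions = observations.get((game["date"], game["city"]))
        if conditions is None or conditions["temp_f"] < cutoff_f:
            continue  # skip games with no weather data or below the cutoff
        hits_by_player.setdefault(game["player"], []).append(game["hits"])
    return {player: mean(hits) for player, hits in hits_by_player.items()}


hot_averages = hot_weather_averages(game_logs, weather, HOT_GAME_TEMP_F)
print(hot_averages)
```

Because the temperature comes from a recorded number instead of a scout's impression, the same cutoff applies to every game in the same way.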

Those are two ways to keep human error from putting bad data onto the blockchain, where it becomes a permanent record.

Having made these cases for IoT sensors and APIs, I still want to stress that while it may be difficult, if not impossible, to hack the blockchain, it would still be possible to hack the APIs or IoT sensors feeding information to the blockchain. It’s also possible for someone to set up a sensor or API incorrectly, so that it gathers the wrong data. Thorough testing in development is, as usual, highly recommended to help avoid this problem.

Follow for more

For more posts about the Blockchain for Business — An Introduction to Hyperledger edX course, the Hyperledger blockchain, and the blockchain in general, follow me to get the latest.

Randall Mardus
Coinmonks

Blockchain blogger; Upright Citizens Brigade & Second City sketch comedy student; Davidson Wildcat; New Yorker.