How Blockchain Technology is Revolutionizing Data Provenance

Oct 26, 2018 · 13 min read

For most readers, the word “blockchain” brings to mind cryptocurrencies like Bitcoin, Litecoin, and Monero — and the volatile price movements accompanying them. The cryptocurrency boom and bust has certainly taken the spotlight in recent years, but blockchain and Distributed Ledger Technology (DLT) have far more wide-reaching applications than digital currencies. This article will explore one of the most significant real-world applications of blockchain and DLT — data provenance.

We’ll start by defining DLT and blockchain technology. Next, we’ll define data provenance, and then explore how blockchain technology is revolutionizing data provenance systems for the better.

If you’re well versed in blockchain basics and just want to learn about data provenance, skip ahead to “What is Data Provenance?”

What is Distributed Ledger Technology?

A distributed ledger is a database that is collectively maintained and synchronized by a network of computers referred to as “nodes”. Every node individually maintains a full copy of the database. Through a majority voting mechanism known as “consensus”, nodes validate the ledger’s data by comparing it with the copies held by other nodes.

In most distributed ledgers, consensus is achieved when a majority of nodes (more than 50%) agree on the correct version of the data.
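To make the majority rule concrete, here is a minimal sketch in Python. It assumes each node simply reports the ledger state it holds as a string; the function name `reach_consensus` and the sample values are hypothetical, and real networks use far more elaborate protocols.

```python
from collections import Counter

def reach_consensus(node_values):
    """Return the value held by a strict majority of nodes, or None."""
    if not node_values:
        return None
    value, count = Counter(node_values).most_common(1)[0]
    # Strict majority: more than half of all nodes must agree.
    return value if count * 2 > len(node_values) else None

# Five hypothetical nodes report the ledger state they each hold.
reports = ["balance=10", "balance=10", "balance=10", "balance=99", "balance=10"]
agreed = reach_consensus(reports)
print(agreed)
```

The single node reporting a different balance is simply outvoted, while an even split produces no agreed value at all.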

Centralised vs Distributed Ledgers

What is a Blockchain?

A blockchain is a distributed ledger that stores data in lists called blocks, which are tied together as a chronological ‘chain’ of records. Before a new block can be added to the chain, it must be verified by solving a computational puzzle; in the Bitcoin blockchain, this process is referred to as mining. Miners are rewarded with Bitcoin for successfully solving the newest block. Once a miner has solved a block, the block and its transactions are independently validated by other Bitcoin nodes. Once validated, the block is added to the chain of records, and each transaction it contains is treated as more firmly confirmed as further blocks are added on top of it.[i]
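The idea of “solving” a block can be illustrated with a toy proof-of-work loop. This is only a sketch under simplifying assumptions: the block layout, the `DIFFICULTY` constant, and the helper names are invented for illustration and do not match Bitcoin’s actual block format or difficulty rules.

```python
import hashlib
import json

DIFFICULTY = 3  # hypothetical: required number of leading zero hex digits

def block_hash(block):
    """Hash a block's contents deterministically with SHA-256."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def mine_block(index, data, prev_hash):
    """Try nonces until the block's hash starts with DIFFICULTY zeros."""
    nonce = 0
    while True:
        block = {"index": index, "data": data,
                 "prev_hash": prev_hash, "nonce": nonce}
        if block_hash(block).startswith("0" * DIFFICULTY):
            return block
        nonce += 1

# Each new block stores the hash of the block before it, forming the chain.
genesis = mine_block(0, "genesis", "0" * 64)
second = mine_block(1, "Alice pays Bob 5", block_hash(genesis))
print(block_hash(second))
```

Finding a valid nonce takes many attempts, but any other node can check the result with a single hash, which is what makes independent validation cheap.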

Distributed ledgers and blockchains offer a number of fundamental advantages compared to centralized ledgers: immutability, improved security, greater reliability, transparency, privacy, and efficiency.

Immutability:

Blockchains are said to be immutable stores of information. This means that data stored in a blockchain cannot be modified, appended, or erased unless a majority of nodes agree to the changes. This property makes blockchains inherently more resistant to attacks by malicious actors than centralized systems such as bank or company servers, which have a single point of failure.
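One reason tampering is hard to hide is that each block commits to the hash of its predecessor. The sketch below, which leaves out mining and networking entirely, shows how a verifier holding the expected hash of the latest block can detect an edit to any earlier record. All names here (`build_chain`, `verify`, the sample records) are hypothetical.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64

def block_hash(block):
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

def build_chain(records):
    """Link records into blocks; return the chain and the final (head) hash."""
    chain, prev = [], GENESIS_PREV
    for i, record in enumerate(records):
        block = {"index": i, "data": record, "prev_hash": prev}
        chain.append(block)
        prev = block_hash(block)
    return chain, prev

def verify(chain, expected_head):
    """Recompute every link and compare the result with the known head hash."""
    prev = GENESIS_PREV
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        prev = block_hash(block)
    return prev == expected_head

chain, head = build_chain(["Alice pays Bob 5", "Bob pays Carol 2", "Carol pays Dan 1"])

# Copy the chain and quietly alter one historical record.
tampered = [dict(block) for block in chain]
tampered[1]["data"] = "Bob pays Carol 500"

print(verify(chain, head), verify(tampered, head))
```

In a real network, the honest majority of nodes plays the role of the verifier: an altered copy no longer matches what everyone else holds.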

Security & Reliability:

Blockchains are designed to be both resilient to attacks and self-healing. If any individual node in a blockchain network goes offline or becomes unstable, the remainder of connected nodes — which may number in the hundreds of thousands — continue working to make the network available, and no data is lost or compromised.

Since there are no individual points of failure, blockchains are inherently more resilient than centralized systems.

Privacy:

In many ways, information technology and the internet have led to an erosion of personal privacy. Centralized data brokers and social networks like Facebook and Twitter track and store personal information about users, including people who have never signed up for their services. This personal information can then be sold to the highest corporate bidder, or perhaps turned over to government agencies, without the user’s knowledge or consent.

Blockchain technology provides a compelling alternative. Instead of giving personal information directly to private technology companies, user information like name and date of birth could be stored in a blockchain and be made available only temporarily for verification purposes. A blockchain solution for personal information would give users granular control over what information is viewable to outside parties, how long it is available for, and who can access it.

Transparency — Open and Private Ledgers:

Distributed ledgers can be set up openly or privately. In the case of open ledgers such as Bitcoin, every single historical transaction is recorded and can be viewed by anyone without special permission, providing an unparalleled level of accountability and transparency that benefits all stakeholders.

In contrast to open ledgers, private ledgers allow data to be read and modified only by users with the required access permissions. Private ledgers are suitable for enterprises that want to take advantage of the security and privacy benefits offered by DLT, but do not want their data to be made publicly available.

Efficiency:

The Bitcoin network is often criticized for being an inefficient use of energy. On the day of this article’s publication, October 26th, 2018, Bitcoin mining is estimated to use as much electricity as the entire country of Austria, or about 0.3% of global electricity consumption.[ii] Newer consensus mechanisms like Proof of Stake (which Ethereum plans to adopt) and Delegated Proof of Stake (Blockpool) have been designed to solve the energy efficiency problem. For comparison, Ethereum, which still relies on Proof of Work at the time of writing, is estimated to use approximately 0.1% of global electricity consumption.[iii]

Furthermore, blockchain technology can greatly improve organizational efficiencies through automation. Smart Contracts — also known as self-executing contracts — are essentially a set of IF / THEN statements that automatically trigger events when certain conditions are met.[iv]

For example, a smart contract could help a distributor of consumer goods such as Amazon:

IF a client submits payment for goods to a particular wallet address

THEN Amazon automatically gives instructions to their shipping warehouse to send the goods, while simultaneously informing the manufacturer to ship a replacement to their warehouse.
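The IF / THEN rule above can be sketched as ordinary code. Real smart contracts run on a blockchain platform and are typically written in a dedicated contract language; the Python class below is only an illustration of the logic, and the `OrderContract` name, wallet address, and price are made up.

```python
from dataclasses import dataclass, field

@dataclass
class OrderContract:
    wallet_address: str          # hypothetical address the client must pay
    price: int                   # price in the smallest currency unit
    actions: list = field(default_factory=list)
    fulfilled: bool = False

    def on_payment(self, to_address, amount):
        # IF a client submits full payment to the expected wallet address...
        if self.fulfilled or to_address != self.wallet_address or amount < self.price:
            return False
        # ...THEN trigger shipping and restocking automatically.
        self.actions.append("warehouse: ship goods to customer")
        self.actions.append("manufacturer: ship replacement to warehouse")
        self.fulfilled = True
        return True

contract = OrderContract(wallet_address="wallet-123", price=100)
underpaid = contract.on_payment("wallet-123", 50)    # condition not met
paid = contract.on_payment("wallet-123", 100)        # condition met
print(underpaid, paid, contract.actions)
```

Because the conditions and the resulting actions are fixed in code, no employee has to approve or forward the order by hand.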

Now that we’ve covered some basic blockchain concepts, let’s get to the goods.

What is Data Provenance?

The concept of provenance originates in the fine art world where it describes the documented evidence that’s used to prove that a work of art has not been altered, forged, reproduced, or stolen.

Data provenance is a historical record for any piece of data. Data provenance systems track changes that are made to data, where data originates and moves to, and who makes changes to it over time. In other words, data provenance is “showing your work” in a database. This historical record of information can then be trusted for data validation and audit purposes.
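As a rough illustration of what such a historical record might contain, the sketch below logs who did what to a piece of data and when, along with a fingerprint of the data at that moment. The `ProvenanceEntry` and `ProvenanceLog` names, the actors, and the timestamps are all hypothetical, and a plain in-memory list stands in for the ledger.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: an entry cannot be edited once created
class ProvenanceEntry:
    actor: str
    action: str
    timestamp: str
    content_hash: str

class ProvenanceLog:
    """Append-only history of changes made to one piece of data."""

    def __init__(self):
        self._entries = []

    def record(self, actor, action, timestamp, content: bytes):
        digest = hashlib.sha256(content).hexdigest()
        entry = ProvenanceEntry(actor, action, timestamp, digest)
        self._entries.append(entry)
        return entry

    def history(self):
        # Return a tuple so callers cannot modify the stored history.
        return tuple(self._entries)

log = ProvenanceLog()
created = log.record("alice", "created", "2018-10-26T09:00:00Z", b"temperature=20.1")
edited = log.record("bob", "corrected", "2018-10-26T11:30:00Z", b"temperature=20.4")
for entry in log.history():
    print(entry.timestamp, entry.actor, entry.action, entry.content_hash[:12])
```

An auditor reading the history can see that the value changed, who changed it, and in what order the changes happened.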

Data provenance systems are vital for a number of industries and use cases. We’ll examine the following practical applications and explore how blockchain technology can improve upon them:

Brands & Supply Chains, Academic Research, Video Evidence, Identity Verification, and finally, Forced / Child Labor Prevention.

Brands & Supply Chains (luxury goods, apparel, collectibles)

A supply chain is a system of organizations, people, activities, information, and resources involved with transporting a product or service from supplier to customer.

Supply chain managers rely on accurate provenance information to track progress and ensure goods move smoothly through each stage of distribution — from leaving the manufacturing facility, to wholesale distributors, retail outlets, and then finally to the customer.

When it comes to luxury items, brand-name goods, fine art, and collectibles, buyers want to have complete certainty that the items they’re purchasing — especially on resale markets — are completely genuine. With counterfeit goods making up 7% of global trade, this is a major concern for supply chain stakeholders and their customers.[v]

To combat counterfeiting, physical goods can be fitted with tamper-proof RFID tags, holograms, and QR codes that get scanned through each stage of the supply chain. This information is then recorded on a blockchain, providing stakeholders with a transparent, secure, and highly accurate audit trail.
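A simplified sketch of how recorded scans might be audited follows. It assumes every scan is stored as a small record containing a tag ID and a supply chain stage, and that a genuine item passes through the stages in a fixed order; the stage names, tag IDs, and function names are hypothetical.

```python
EXPECTED_STAGES = ["manufacturer", "wholesaler", "retailer", "customer"]

def audit_trail(scans, tag_id):
    """Return the stages at which a given tag was scanned, in recorded order."""
    return [scan["stage"] for scan in scans if scan["tag_id"] == tag_id]

def looks_genuine(scans, tag_id):
    """A tag is plausible if its scans follow the expected stages with no gaps."""
    stages = audit_trail(scans, tag_id)
    return len(stages) > 0 and stages == EXPECTED_STAGES[:len(stages)]

# Scan records as they might be read back from the ledger.
scans = [
    {"tag_id": "TAG-001", "stage": "manufacturer"},
    {"tag_id": "TAG-001", "stage": "wholesaler"},
    {"tag_id": "TAG-002", "stage": "retailer"},   # never seen at the factory
    {"tag_id": "TAG-001", "stage": "retailer"},
]

print(looks_genuine(scans, "TAG-001"), looks_genuine(scans, "TAG-002"))
```

An item that first appears halfway through the chain, like `TAG-002` here, has no record of leaving the manufacturer and would warrant a closer look.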

Blockchain-based provenance systems are a benefit to both buyers and brands.

Buyers benefit by knowing that they’re purchasing authentic goods and getting their money’s worth. When consumers are confident in the authenticity of goods, brand reputation improves and suppliers are able to sell their goods at a higher price. Because blockchains are extremely difficult to attack, stakeholders are provided with greater certainty that data is accurate as compared with centralized ledgers.

Brands also use provenance data to track and improve quality control and auditing throughout their supply chains, leading to greater supply chain efficiencies.

The resilient nature of blockchains means that there is no centralized point of failure, and that provenance data is always available and secured.

Academic research

In the academic world, research is often conducted through collaborative efforts between different organizations. For example, drug trials may be conducted through collaboration between universities, pharmaceutical companies, laboratories, and data analyst teams. This means that data is collected, managed, and analyzed by a number of different individuals, each of whom may have their own priorities, career goals, and financial interests. Any user with administrative access to a database could change or corrupt data to their own benefit.

While we’d like to assume that academic researchers act honestly 100% of the time, the data says otherwise — research data can be fabricated, under-reported, and falsified to match the expected or intended results of a study. In an audit conducted by the National Cancer Institute, incidences of fraud as high as 0.25% were found in the results of clinical cancer trial groups.[vi]

A blockchain-based provenance system for research data could protect against data manipulation by providing a complete, transparent audit trail of all data that is collected, processed, and accessed by researchers. Any modification made to research data would require majority consensus from stakeholders and would be visible to everyone, helping to ensure high data quality and deterring individuals from acting dishonestly.[vii]

Video Evidence

Video evidence can be one of the most powerful pieces of evidence in court proceedings, but with today’s technology, audio and video files can be digitally manipulated to say or show virtually anything.

In some cases, dash cam and body cam footage has been thrown out of court over authenticity concerns: time and date stamps can be altered, and video evidence can be edited in a biased manner.[viii]

A blockchain-based video storage and provenance system could revolutionize the use of video evidence in legal proceedings. By pairing a blockchain with a peer-to-peer, content-addressed file storage system like the InterPlanetary File System (IPFS), video evidence could automatically be uploaded to a peer-to-peer network and appended with provenance information like time and date stamps, GPS locations, vehicle speed, and a complete historical record of changes made to the video. Since blockchain records are immutable, prosecutors, defendants, or judges would not be able to modify the video undetected to serve their own agenda, and all parties would have greater assurance that the video is authentic and unedited.
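The core mechanism can be sketched with a cryptographic fingerprint. In the example below, a plain Python list stands in for the ledger and a short byte string stands in for a video file; the function names and metadata fields are assumptions made for illustration, not part of any particular product.

```python
import hashlib

def fingerprint(video_bytes: bytes) -> str:
    """SHA-256 digest of the raw video; any edit changes it."""
    return hashlib.sha256(video_bytes).hexdigest()

def register_evidence(ledger, video_bytes, metadata):
    """Store the video's fingerprint together with its provenance metadata."""
    record = {"sha256": fingerprint(video_bytes), **metadata}
    ledger.append(record)
    return record

def is_unmodified(ledger, video_bytes):
    """Check whether this exact file was registered earlier."""
    digest = fingerprint(video_bytes)
    return any(record["sha256"] == digest for record in ledger)

ledger = []  # stand-in for an immutable ledger
original = b"\x00\x01 raw dash cam footage \x02\x03"
register_evidence(ledger, original, {
    "recorded_at": "2018-10-26T14:05:00Z",
    "gps": "53.4808,-2.2426",
    "speed_kmh": 48,
})

edited = original.replace(b"dash", b"body")
print(is_unmodified(ledger, original), is_unmodified(ledger, edited))
```

Only the fingerprint and metadata need to live on the ledger; the video itself can be stored elsewhere and checked against the record whenever it is presented.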

Identity Verification

When a user registers for services like bank accounts or credit cards, verification is traditionally done using several pieces of personally identifiable information, which may include photo identification, utility bills, and / or health records. This information is then stored on centralized company computers, where it is viewable by employees with the required permissions. Such centralized stores are attractive targets: in the recent data breach at Equifax, the personal information of roughly 147 million individuals was compromised.[ix]

A blockchain-based repository for personal information could eliminate the risks associated with storing personal information in centralized servers, while giving users ownership and control over their own personal data. Through a permissioned blockchain system, users could control who is able to view their personal information, what information is made available, and for how long.[x] In a blockchain-based identity verification system, users would be able to sell their personal information to advertisers or research firms — rather than give it away for free. Additionally, users would be able to see a full historical record of who has accessed their personal data.
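The kind of control described here (who can view, which fields, and for how long) can be sketched as follows. The `PersonalDataVault` class, the viewer names, and the use of plain integers as timestamps are all simplifying assumptions; a real permissioned blockchain would enforce these rules across the network rather than in a single object.

```python
from dataclasses import dataclass, field

@dataclass
class PersonalDataVault:
    data: dict
    grants: dict = field(default_factory=dict)      # viewer -> (fields, expires_at)
    access_log: list = field(default_factory=list)  # who read what, and when

    def grant(self, viewer, fields, expires_at):
        """Let a viewer see the listed fields until the expiry time."""
        self.grants[viewer] = (set(fields), expires_at)

    def read(self, viewer, now):
        """Return only the fields this viewer may currently see, and log the access."""
        fields, expires_at = self.grants.get(viewer, (set(), 0))
        allowed = {key: value for key, value in self.data.items()
                   if key in fields and now < expires_at}
        self.access_log.append((viewer, now, sorted(allowed)))
        return allowed

vault = PersonalDataVault({"name": "Alice", "dob": "1990-01-01", "address": "1 Main St"})
vault.grant("bank", ["name", "dob"], expires_at=100)

during = vault.read("bank", now=50)        # within the grant window
after = vault.read("bank", now=150)        # grant has expired
stranger = vault.read("advertiser", now=50)  # never granted access
print(during, after, stranger)
```

Every read attempt lands in `access_log`, which corresponds to the full historical record of access that users would be able to inspect.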

Combining a blockchain-based identity verification system with unique RFID tags or implantable NFC chips could enable secure, multi-factor login systems for voting, accessing physical or virtual facilities, verifying ownership of goods, or completing financial transactions.

The unique RFID chip would act like a password (private key) or as a means of two-factor authentication (2FA). Because each chip is unique and difficult to counterfeit, only the appropriate chip will provide access. Unique RFID chips are more secure than many other 2FA methods, as they can be defeated neither by SIM card spoofing nor by stealing a user’s phone to access authenticator codes.
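A chip acting as a key is usually checked with a challenge-response exchange: the verifier sends a fresh random challenge, and only a chip holding the right secret can produce the matching answer. The sketch below uses a shared secret and HMAC from the Python standard library as a stand-in; a production system would more likely use an asymmetric key pair so that the verifier never holds the chip’s private key. All names are hypothetical.

```python
import hashlib
import hmac
import secrets

def chip_respond(chip_secret: bytes, challenge: bytes) -> bytes:
    """What the chip computes when presented with a challenge."""
    return hmac.new(chip_secret, challenge, hashlib.sha256).digest()

def verify_login(registered_secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Recompute the expected answer and compare in constant time."""
    expected = hmac.new(registered_secret, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)

chip_secret = secrets.token_bytes(32)  # provisioned into the chip at enrolment
challenge = secrets.token_bytes(16)    # fresh random value for each login attempt

genuine = chip_respond(chip_secret, challenge)
forged = chip_respond(secrets.token_bytes(32), challenge)
print(verify_login(chip_secret, challenge, genuine),
      verify_login(chip_secret, challenge, forged))
```

Because each login uses a new challenge, a response captured during one session cannot simply be replayed in the next.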

By utilizing blockchain technology, the risk of a centralized server failure is eliminated, and all of the data used to verify a user’s identity is immutable, distributed, redundant, and available from any internet connected device.

Forced / Child Labor Prevention

While slavery was largely abolished in the 19th century, modern-day slavery is still an unfortunate reality throughout the world. According to the International Labour Organization and the Global Slavery Index:

“An estimated 40.3 million men, women, and children were victims of modern slavery on any given day in 2016. [xi] Of these, 24.9 million people were in forced labour and 15.4 million people were living in a forced marriage. Women and girls are vastly over-represented, making up 71 percent of victims. Modern slavery is most prevalent in Africa, followed by the Asia and the Pacific region.”

Many of these people are working under forced labor conditions on fishing boats, construction sites, farms, in factories, or in the sex industry. The products made under forced labor conditions can often end up in commercial channels selling anything from textiles and electronic devices, to groceries.[xii]

An investigative report by the UK’s Sky News showed children as young as 4 years old mining for cobalt with bare hands and feet for a meagre daily wage of roughly 10 cents in the Congo.[xiii] Zhejiang Huayou Cobalt Company is the world’s largest buyer of so-called “artisanal” cobalt from the Congo. Until very recently, Apple and Samsung purchased Zhejiang Huayou’s cobalt to manufacture batteries that go into smartphones, computers, and other consumer gadgets.[xiv]

A blockchain-based provenance system could help take away the temptation to work with unethical subcontractors like Zhejiang Huayou. Blockpool, together with the University of Manchester, UNSEEN, The Hartree Centre, and CDD (a leading compliance and due diligence technology firm), has proposed The Blockchain and the UK Modern Slavery Act (BC4MSA), a project whose goal is to build a blockchain capable of mapping forced labour intelligence nationally and internationally.

BC4MSA is designed as a secure data provenance solution that tracks forced labor incidents as ‘transactions’ on a blockchain. These transactions could store information about where and when the labor violation occurred, who was involved, and the type of violation. This data would then be verified by a UK-based NGO (e.g., UNICEF or Save the Children) or a welfare support officer, and then escalated to a criminal case with law enforcement. Forced labor incidents could also be shared between welfare support teams and investigative units in different countries to help catch offenders throughout the world.
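The workflow described above (report, verify, escalate) can be sketched as a small record that only moves forward through fixed steps. This is an illustration only and does not reflect BC4MSA’s actual data model; the field names, status values, and actors below are assumptions.

```python
from dataclasses import dataclass

# Allowed status transitions: reported -> verified -> escalated.
NEXT_STATUS = {"reported": "verified", "verified": "escalated"}

@dataclass
class IncidentRecord:
    location: str
    date: str
    violation_type: str
    status: str = "reported"
    history: tuple = ()  # (old_status, new_status, actor) for every step taken

    def advance(self, actor):
        """Move the incident to its next stage and note who approved the step."""
        new_status = NEXT_STATUS.get(self.status)
        if new_status is None:
            raise ValueError(f"incident is already {self.status}")
        self.history += ((self.status, new_status, actor),)
        self.status = new_status

incident = IncidentRecord("Factory A", "2018-10-01", "forced labour")
incident.advance("NGO reviewer")       # verified by an NGO or welfare officer
incident.advance("law enforcement")    # escalated to a criminal case
print(incident.status, incident.history)
```

Keeping the actor for each step in the history is what lets teams in different countries see who verified an incident before acting on it.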

Blockchain technology makes BC4MSA a compelling solution for a number of reasons. Firstly, blockchain technology enables a trustless system where the integrity and availability of the network are not reliant on any single party. Secondly, the lightweight nature of blockchain technology enables efficient information sharing, minimizing investment and maintenance costs in the long run. Lastly, because the data is secured using multi-signature technology, there is a lower risk of data breaches and confidentiality issues as compared with centralized databases.[xv]

Conclusion

As we progress through the 21st century, the amount of data we share and consume is increasing exponentially, which raises a question.

Should one have more trust for a centralized information database like a bank that is managed by humans, or for a distributed, automated, rules-based computing system?

In any situation with numerous independent stakeholders like banking, supply chains, and the various use cases outlined above, blockchain technology gives us greater trust in information. Being decentralized, immutable, reliable, and highly secure, blockchains have the potential to profoundly improve the way we manage and share our data.

Acceptance of blockchain-based provenance information is growing. China’s Supreme People’s Court recently ruled that blockchain-based records can be accepted as evidence in the country’s internet courts.[xvi]

IBM recently announced Food Trust, a blockchain-based provenance system that tracks food through the supply chain, which will provide “unprecedented visibility and veracity into the sourcing and certification of fresh produce and proteins.”[xvii]

OpSec Security, a global leader in anti-counterfeit technology and brand protection, has partnered with Blockpool to develop secure blockchain-based data provenance solutions for brands and apparel manufacturers.[xviii]

Looking at not only cryptocurrencies, but blockchain technology as a whole, the future is certainly ripe for new implementations and innovations.


[i] Satoshi Nakamoto, “Bitcoin: A Peer-to-Peer Electronic Cash System”

[ii] Digiconomist, “Bitcoin Energy Consumption Index”

[iii] Digiconomist, “Ethereum Energy Consumption Index”

[iv] Alyssa Hertig, Maria Kuznetsov, “How Do Ethereum Smart Contracts Work?”

[v] United Nations Office on Drugs and Crime, “Counterfeit Products”

[vi] Stephen L George, Marc Buyse, “Data fraud in clinical trials”

[vii] Aravind Ramachandran, Murat Kantarcioglu, “Using Blockchain and smart contracts for secure data provenance management”

[viii] Aida Ashouri, Caleb Bowers, Cherrie Warden, “An Overview of the Use of Digital Evidence in International Criminal Courts”

[ix] Federal Trade Commission, “The Equifax Data Breach”

[x] Saif Rehman, “Password-less Authentication using Elliptic Curve Cryptography on Blockchain”

[xi] The Global Slavery Index, “Global Findings — 2018”

[xii] International Labour Office, “Global Estimates of Modern Slavery”

[xiii] Tom Cheshire (Sky News), “Child Miners: Firm Refuses to Apologise over Cobalt Sourcing”

[xiv] Todd C. Frankel (Washington Post), “Apple Cracks Down Further on Cobalt Supplier in Congo as Child Labor Persists”

[xv] Ser-Huang Poon, Martin Carpenter (University of Manchester), “Blockchain and the UK Modern Slavery Act”

[xvi] Zheping Huang (South China Morning Post), “China accepts blockchain verification for evidence in courtroom”

[xvii] Aaron W. Stanley (Forbes), “Ready to Rumble: IBM Launches Food Trust Blockchain for Commercial Use”

[xviii] “How Blockpool and OpSec Security are using blockchain tech to combat counterfeiting in the apparel sector”


Real World B2B Blockchain Solutions

