Using Blockchain to Archive Digital Data and Evidence Used in Open Source Investigations

By Ro. Published in Geek Culture, Mar 26, 2021. 10 min read.

I’ve always been interested in online investigations and open-source intelligence. There’s something fascinating about crowdsourced investigations and individuals coming together to look into real-world issues. On my journey into this subject years ago, I inevitably stumbled across Bellingcat. I recently finished reading Eliot Higgins’ ‘We are Bellingcat’, which dives into how he started Bellingcat and how it grew into what it is today. It’s a great read and I highly recommend it.

Whilst reading it, I noticed a very interesting point about the use of digital media as evidence in war crimes trials (or any trial, for that matter) and the issues in archiving that evidence. Due to the nature of online investigations, the way evidence such as videos, images and maps is collected and stored can cause problems when that evidence is later needed for a trial or even a story. For example, a video uploaded to YouTube that shows evidence of war crimes or police brutality could be used in an online investigation, but by the time someone needs to download it again, it could have been taken down by the uploader, or by YouTube itself for violating its terms and conditions.

Another issue with digital evidence in a trial is that it can only be used if the chain of custody is documented to a high standard. There can’t be a chance that the evidence was edited or changed after it was uploaded or submitted. This raises the question: how do we know nothing has changed by the time we need to access the footage or evidence again?

In the book, Higgins raises some great questions about accessing such information, such as:
- Will everything that was once posted still be available?
- Will it have been verified?
- Will the clips be findable?

Rights organisations and foundations often archive digital evidence and media that can be used in open source investigations, but because archiving is often not their primary focus, the material can be mishandled or disorganised. It’s also difficult for judges or law enforcement to access this evidence and to be sure it hasn’t been edited or doctored in any way. It needs to be verified, and the chain of custody needs to be clear. Judges and law officials need to be able to access verified evidence and use it effectively.

Blockchain & Decentralisation

A thought that came to mind was to use blockchain or a decentralised structure to archive digital assets or evidence. The assets could be uploaded to a transparent, distributed ledger to be stored and kept safe. When an asset is needed for a story or trial, it could be downloaded with a clear record of when it was uploaded and who uploaded it, along with its verification. There’s clearly a problem around how to validate evidence that is uploaded. As with any investigation, evidence needs to be clearly verified and backed up with further evidence to ensure that it is authentic and is what it claims to be. However, I don’t believe this validation should fall to the network, but to the people or organisation using or publishing the evidence. As with most decentralised networks, a network fee to access or send assets would be an incentive to upload legitimate evidence and would discourage a flood of false information. There could also be private chains that require authentication before data can be uploaded or downloaded for specific cases.
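To make the idea of a "clear record" a little more concrete, here is a minimal sketch in Python of what a single ledger entry might hold. The `ArchiveRecord` type, its field names and the two helper functions are my own assumptions for illustration, not part of any existing network. The SHA-256 fingerprint is what would let someone downloading the file later check that it is byte-for-byte what was originally uploaded.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ArchiveRecord:
    """Hypothetical ledger entry describing one archived file."""
    sha256: str        # fingerprint of the file contents
    uploaded_by: str   # who submitted the file
    uploaded_at: str   # ISO 8601 timestamp (UTC)
    verification: str  # note or pointer to the verification write-up


def make_record(data: bytes, uploaded_by: str, verification: str) -> ArchiveRecord:
    """Build the record at upload time."""
    return ArchiveRecord(
        sha256=hashlib.sha256(data).hexdigest(),
        uploaded_by=uploaded_by,
        uploaded_at=datetime.now(timezone.utc).isoformat(),
        verification=verification,
    )


def matches_record(data: bytes, record: ArchiveRecord) -> bool:
    """True if the downloaded bytes are identical to what was archived."""
    return hashlib.sha256(data).hexdigest() == record.sha256
```

Note that a matching fingerprint only shows the file hasn't changed since upload. It says nothing about whether the footage was authentic in the first place, which is why the verification still has to come from the people using the evidence.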

Filecoin and DSNs

The Filecoin network, made by Protocol Labs, is an existing Decentralised Storage Network (DSN) that allows users to store and retrieve information. A simple explanation, taken from the Filecoin site, is as follows:

‘Filecoin is a peer-to-peer network that stores files on the internet, with built-in economic incentives to ensure files are stored reliably over time. Available storage and pricing is not controlled by any single company. Instead, Filecoin facilitates open markets for storing and retrieving files that anyone can participate in’.

Filecoin was created as an alternative to centralised cloud services, which dictate prices and can cost users extortionate amounts. Users upload data to the network, and that data is given a unique identifier that can later be used to retrieve it. When a user wants to store data, miners compete to win the storage contract, the user selects the winning miner, and the data is stored. To earn Filecoin, miners must prove that they are storing the data properly. They do this with cryptographic proofs: miners submit their storage proofs in new blocks and validate new blocks on the network, and only correct blocks are accepted. This incentivises miners to be honest and to store data properly and efficiently.
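The "unique identifier" idea is easier to see with a toy example. The sketch below is a simplified stand-in, not Filecoin's actual scheme: it uses a plain SHA-256 hex digest as the key and an in-memory dictionary as the "network". The `ContentStore` class is hypothetical. The point is only that when the key is derived from the content, the same file always gets the same key, and anyone retrieving it can check that what they received matches what they asked for.

```python
import hashlib


class ContentStore:
    """Toy in-memory store where the key is derived from the content itself."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        """Store the data and return its content-derived key."""
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        """Fetch data by key, re-hashing to detect corruption or swaps."""
        data = self._blobs[key]
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("stored data does not match its identifier")
        return data
```

Uploading the same bytes twice returns the same key, so the identifier can be shared between investigators without anyone having to trust a file name or URL.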

To retrieve a file, the user looks up miners who may have it and selects the fastest or most affordable one, then pays that miner to retrieve the file. As a file becomes more popular and more people request it, more miners can pick it up and re-host it. The data spreads to where demand is growing, which in turn improves access as the data flows globally.
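As a rough illustration of the "fastest or most affordable" choice, here is a small Python sketch. The `RetrievalOffer` type and its fields are made up for this example and the numbers are arbitrary. A real client would get these quotes from the network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalOffer:
    """Hypothetical quote from one miner holding the requested file."""
    miner_id: str
    price: float       # cost quoted for the retrieval, in arbitrary units
    latency_ms: float  # estimated time to first byte


def cheapest(offers: list[RetrievalOffer]) -> RetrievalOffer:
    """Lowest price wins; latency breaks ties."""
    return min(offers, key=lambda o: (o.price, o.latency_ms))


def fastest(offers: list[RetrievalOffer]) -> RetrievalOffer:
    """Lowest latency wins; price breaks ties."""
    return min(offers, key=lambda o: (o.latency_ms, o.price))


offers = [
    RetrievalOffer("miner-a", price=2.0, latency_ms=50.0),
    RetrievalOffer("miner-b", price=1.0, latency_ms=200.0),
    RetrievalOffer("miner-c", price=1.0, latency_ms=120.0),
]
print(cheapest(offers).miner_id)  # miner-c
print(fastest(offers).miner_id)   # miner-a
```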

Miners range from small operations to full data centres and by working together to host and properly store data, users and hosts both benefit.

From this brief (and not expert) overview of the Filecoin protocol, you can see how a network like this could benefit the storing or archiving of digital evidence for open source investigations. Storing important data on a decentralised network improves its safety: having multiple miners store the data lessens the risk of it being lost, taken down or destroyed. As miners have to prove that they’re storing data properly and safely to earn rewards, the integrity of the data is more likely to be kept intact. Another benefit of decentralising in the same way as the Filecoin model is that retrieval becomes more efficient the more a file is requested, and the file can be accessed from anywhere. If multiple people request the file for other open source investigations or news stories, for example, more miners can host it, and the file becomes easier to access and more widely spread. In turn, this makes the file less likely to be lost.

An analogy would be a video going viral online. The more it spreads, the less likely that video is to be lost, even if it’s taken down from platforms. As copies get taken down, someone else re-uploads it on the same or a different platform. But if that video is removed or blocked from social networks and forums altogether, it could be lost. If that video is uploaded to the Filecoin network and then repeatedly requested by multiple users who know its identifier, the data becomes easier to access as it spreads. The file is then decentralised and less likely to be lost, as it’s hosted by multiple parties.

A dedicated decentralised network

Although Filecoin is a reputable and efficient network, given the nature of evidence used in open source investigations and the importance of some data, I believe a separate network built for the sole purpose of storing and archiving data for open source investigations would be beneficial. There is the potential for the network, or the miners storing data, to be attacked by bad actors, but if files are decentralised and stored by multiple verified miners, the risk can be mitigated. You could also validate or verify miners to make sure they intend to store data securely, efficiently and honestly. This can be done with financial incentives using proof-of-stake or proof-of-work models. Filecoin has a great model for this using Proof-of-Replication and Proof-of-Spacetime, in which storage providers have to convince clients that they stored the data they were paid to store. Storage providers generate proofs of storage that the blockchain network or clients can verify. There are many different protocols that could be used within the network and I’ve not quite figured out what would work best…but we’ll get there.
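To give a feel for what "proving you still hold the data" can mean, here is a deliberately simple challenge-response sketch in Python. To be clear, this is not how Filecoin's Proof-of-Replication or Proof-of-Spacetime work. It is a much cruder stand-in that I'm using purely for illustration, and the function names are my own. Before uploading, the client precomputes a handful of random challenges and their expected answers. Later it sends one nonce to the miner, who can only answer correctly by hashing the nonce together with the full file.

```python
import hashlib
import secrets


def respond(nonce: bytes, stored_data: bytes) -> str:
    """What an honest miner computes when challenged."""
    return hashlib.sha256(nonce + stored_data).hexdigest()


def prepare_challenges(data: bytes, count: int) -> list[tuple[bytes, str]]:
    """Client side: precompute (nonce, expected answer) pairs before upload."""
    challenges = []
    for _ in range(count):
        nonce = secrets.token_bytes(16)
        challenges.append((nonce, respond(nonce, data)))
    return challenges


def audit(challenge: tuple[bytes, str], miner_data: bytes) -> bool:
    """Issue one challenge and check the answer the miner's data produces."""
    nonce, expected = challenge
    return secrets.compare_digest(respond(nonce, miner_data), expected)
```

Each challenge should only be used once, since a miner could otherwise cache the answer and throw the file away. That limitation is one of the reasons real networks use more sophisticated proofs.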

This could easily be done by a company offering secure storage solutions, but by centralising large amounts of archived data such as digital evidence, the risk of a successful attack wiping it out is higher due to it being all in one place, even with backups. Decentralising the data for archiving means that multiple validated miners can store it, so if one is attacked and loses the data, it is still kept safe by other miners or storage facilities. It also means the data can be accessed from multiple sources, which can make retrieval easier.
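The benefit of multiple copies can be put into rough numbers. The sketch below assumes that each miner loses its copy independently with the same probability, which is a simplification (a coordinated attack or a shared software bug would break that assumption). The function name and the 10% figure are illustrative only.

```python
def probability_all_copies_lost(p_single: float, copies: int) -> float:
    """Chance that every copy is lost, assuming independent failures."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return p_single ** copies
```

With a 10% chance of any one miner losing the file, a single copy is lost with probability 0.1, while three independent copies are all lost with probability of about 0.001.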

One use case could be users or miners being open source investigators, NGOs, archiving foundations, universities or anyone else who has an interest in accessing or storing digital data for use in investigations and wants to maintain the integrity of the data.

Chain-of-Custody

There could also be an implementation that proves the chain of custody of digital documents using a public ledger, as the Bitcoin network does. However, where the file was created and where it was collected from would also need to be stored. The data would also have to be validated before being uploaded to prove its authenticity, and this verification would have to be uploaded with the data. Anyone downloading the data would then also have the verification to show the evidence is legitimate. For example, when uploading a video showing war crimes in a country, the verification could be geolocation data or imagery, plus a write-up verifying the location, date, time and what’s seen in the footage. This way, when someone else downloads the data, they can easily verify the evidence or retrace the steps to check it themselves. The data on the block could then carry a verification badge. Data uploaded without supplementary evidence to validate it could carry a mark against it that is displayed to users.

Once the file is uploaded to a public ledger and verified by the miners in the network, it would be kept there and be accessible to anyone, so it could be downloaded and viewed. Because it’s verified by miners on the network, it couldn’t be altered or edited without detection, which supports the chain of custody. For this you could use a similar implementation to the Bitcoin network, but rather than transactional data, the blocks would hold digital data or files. Each new block would contain the hash of the previous block so that every node can verify the chain. Uploading a file that could be used as evidence to a decentralised network, and recording it on a public ledger, means the original is kept safe and is there for all to see. This mitigates the risk of the data being taken down and lost from other platforms.
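The "each block contains the hash of the previous block" idea can be sketched in a few lines of Python. This is a toy with no mining, networking or consensus, and the block fields are my own assumptions. For brevity each block stores a SHA-256 fingerprint of the file rather than the file itself.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's contents in a stable (sorted-key) JSON encoding."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: list[dict], file_bytes: bytes, uploader: str) -> None:
    """Add a block that points back at the current tip of the chain."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({
        "index": len(chain),
        "prev_hash": prev,
        "file_sha256": hashlib.sha256(file_bytes).hexdigest(),
        "uploader": uploader,
    })


def chain_is_valid(chain: list[dict]) -> bool:
    """Every block must reference the hash of the block before it."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True
```

If anyone edits an earlier block, for instance changing the uploader or the file fingerprint, its hash no longer matches the `prev_hash` stored in the following block and `chain_is_valid` returns `False`. In this toy the newest block is only protected once another block is appended after it.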

Implications & Costs

There are obviously issues with this implementation, as there are costs to using the network. Costs come in the form of money and computational power. These are used to incentivise users and to protect the integrity of the network, normally in the form of proof-of-work or proof-of-stake. The Filecoin network uses Proof-of-Storage, Proof-of-Spacetime and Proof-of-Replication. If you’re interested in this, check out the whitepaper.

Depending on the implementation, there may be scenarios where users are not able to upload data or complete data transactions due to the financial or computational cost. This is where rights organisations or archiving foundations could come in to help. It could also be worked around by the network using governance tokens. The network could reward users with its own governance tokens, which could then be used for transaction costs, similar to what Rarible does with RARI tokens. Active users who upload or store information would be rewarded each week with a share of the revenue created on the network. There could then be an option for users to donate or pay forward transaction fees using the governance token, which would allow those who cannot pay the fee to upload or download digital data such as video or photographic evidence. This would benefit those who are unable to access digital funds, or who live in oppressive societies where the government controls the internet or restricts financial freedom.

Conclusion

In my mind, the nodes or users that store and access information could be open source investigators, investigative reporters, NGOs, open source foundations or people who are interested in that area. The network would then be made up of like-minded users who can safely and efficiently store and access data for use in open source investigations, trials or simply news stories that benefit the population by sharing information. What that actually looks like, I don’t know, but I believe that there is a benefit to and a need for a network such as this.

This is simply an initial concept for an effective way to archive digital data that can be used in open source investigations, trials or news stories. I am by no means an expert (as you can probably tell by this article) but I wanted to share the concept to open a discussion around the topic. If anyone wants to discuss this further or even look at implementing a network such as this, get in touch!

If there are any incorrect details or explanations in this, please also let me know.

Find me on Twitter
