Are Decentralized File Vaults the Future?

Andy Hodgson
Nationwide Technology
Sep 8, 2021

Combining future tech and lessons from the past

History teaches us that nothing stands still. Technology evolves at an ever-increasing rate, and Nationwide is constantly evaluating both the benefits and the threats that evolution brings. But it’s not just technology that changes: regimes change and geopolitical alliances shift, sometimes rapidly and with far-reaching consequences for individuals and organizations.

In a world of cloud storage, global infrastructures and third-party companies storing our data, how do we face into this and ensure that, whatever the future holds, our members can be sure their information is safe and secure?

Data security in the age of cloud

Building Societies and Banks are not typically the organizations that come to mind when talking about new technologies and approaches. But look behind the scenes and there are large numbers of people, teams, projects and programmes tasked with understanding the opportunities and threats the future will bring. Nationwide is no exception, and amongst these resources is the Engineering Proof of Concept Team: software engineers dedicated to trialing new methods, architectures and technologies. These kinds of teams are crucial to ensuring customer experience is continuously improved and data security constantly evolves.

Recently, this team has turned its focus to the challenge of data security in the age of Cloud and how Nationwide could maintain a secure, robust vault of its data. The outcome sought by the project has been to prove that cloud-based storage can be implemented such that it is resistant to state-actor-level attack and does not rely on the stability of a single Cloud provider’s enterprise.

Embracing decentralization

One answer to how we embed resilience into data storage is to consider the “Don’t put your eggs in one basket” idiom. We have come to expect that the major players in Cloud Services offer incredibly resilient platforms in which to store our information. And indeed, they do: a well-architected solution can leverage extraordinary availability and recovery capabilities from such vendors. We expect the technology to be resilient, and we expect the platforms and the data within them to be available when we need them. We expect the data stored to be managed as per our implemented design and not used for any other purpose. And we expect this data never to be shared or utilized without our permission.

But expectations depend on a status quo: an understood and accepted set of circumstances and parameters. Nationwide wanted the Proof-of-Concept team to challenge those and combine that thinking with more traditional approaches to security. We wanted to consider what would happen if a regime shift affected the availability or trustworthiness of these services, or if a vendor were somehow compromised.

A logical area of focus, when looking for more baskets in which to place our proverbial eggs, was decentralization. Utilizing multiple storage nodes with no dependency on any one vendor or platform appeared to fit the brief, but it came with its own set of questions and concerns, not least around how it would be securely implemented.

Patented security design

The key to addressing security concerns for this kind of decentralized approach was a patented security model created within Nationwide’s Application Security team. By incorporating this model into the core of the prototype, the engineers were able to build a capability that decomposes data and scrambles it at sub-block level. This is then combined with powerful encryption and distributed using blockchain-inspired techniques. As a result, the data payloads sent to the decentralized nodes are incredibly resilient to compromise or attack.
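The patented model itself is not described in this article, so the following is only a rough sketch of the general idea: cut data into sub-blocks, encrypt each one, and link the pieces with a hash chain in a loosely blockchain-inspired way. The `Fragment` type, the function names and the choice of AES-GCM and SHA-256 are all assumptions made for illustration, and the scrambling step is not reproduced.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
)

// Fragment is a hypothetical unit of distribution: one encrypted sub-block
// plus the hash of the fragment before it, forming a simple hash chain.
type Fragment struct {
	Index    int
	Nonce    []byte
	Cipher   []byte
	PrevHash [32]byte
}

// hashFragment returns a SHA-256 digest covering every field of the fragment.
func hashFragment(f Fragment) [32]byte {
	h := sha256.New()
	fmt.Fprintf(h, "%d", f.Index)
	h.Write(f.Nonce)
	h.Write(f.Cipher)
	h.Write(f.PrevHash[:])
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

// newGCM builds an AES-GCM cipher from a 16, 24 or 32 byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// SplitAndSeal cuts data into blockSize-byte sub-blocks, encrypts each one
// with a fresh random nonce, and chains the fragments together by hash.
func SplitAndSeal(data, key []byte, blockSize int) ([]Fragment, error) {
	if blockSize <= 0 {
		return nil, errors.New("blockSize must be positive")
	}
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	var frags []Fragment
	var prev [32]byte
	for i, off := 0, 0; off < len(data); i, off = i+1, off+blockSize {
		end := off + blockSize
		if end > len(data) {
			end = len(data)
		}
		nonce := make([]byte, gcm.NonceSize())
		if _, err := rand.Read(nonce); err != nil {
			return nil, err
		}
		f := Fragment{Index: i, Nonce: nonce, PrevHash: prev}
		f.Cipher = gcm.Seal(nil, nonce, data[off:end], nil)
		prev = hashFragment(f)
		frags = append(frags, f)
	}
	return frags, nil
}

// OpenAndJoin checks the hash chain, decrypts each fragment and reassembles
// the original data. Any altered or reordered fragment causes an error.
func OpenAndJoin(frags []Fragment, key []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	var out []byte
	var prev [32]byte
	for i, f := range frags {
		if f.Index != i || f.PrevHash != prev {
			return nil, errors.New("fragment chain is broken")
		}
		plain, err := gcm.Open(nil, f.Nonce, f.Cipher, nil)
		if err != nil {
			return nil, err
		}
		out = append(out, plain...)
		prev = hashFragment(f)
	}
	return out, nil
}

func main() {
	key := make([]byte, 32) // AES-256 key, generated here purely for the demo
	if _, err := rand.Read(key); err != nil {
		panic(err)
	}
	frags, err := SplitAndSeal([]byte("example member record"), key, 8)
	if err != nil {
		panic(err)
	}
	plain, err := OpenAndJoin(frags, key)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d fragments, recovered %q\n", len(frags), plain)
}
```

In a sketch like this, each fragment could be sent to a different node, and the hash chain lets the owner notice if a returned fragment has been swapped or modified.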

This amalgamation of techniques allowed secure distribution of data across multiple nodes that included major cloud service providers as well as community cloud assets such as employee laptops. This provided flexibility in where the data was hosted but the design also needed to take into account the challenges that come with a decentralized model.

To address these, innovative solutions were embedded into the application to monitor the health of each node and self-heal the network if a node is identified as retired or faulty. In addition, techniques are used to identify a node that has, with high confidence, been compromised, and to take action to remove it from the network.
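How the prototype monitors and heals its network is not detailed here, but the general pattern can be sketched under some assumptions: each node reports a heartbeat, a node whose heartbeat is too old is dropped, and any data it held is re-replicated to healthy nodes. The `Network` and `Node` types, the heartbeat timeout and the replica count below are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Node is a hypothetical storage node that reports periodic heartbeats.
type Node struct {
	Name     string
	LastSeen int64 // Unix seconds of the most recent heartbeat
}

// Network tracks nodes and which nodes hold a copy of each fragment.
type Network struct {
	Nodes     map[string]*Node
	Placement map[string]map[string]bool // fragment ID -> set of node names
	Replicas  int                        // desired copies of every fragment
	Timeout   int64                      // seconds of silence before a node is dropped
}

// healthy returns the sorted names of nodes with a recent heartbeat.
func (n *Network) healthy(now int64) []string {
	var names []string
	for name, node := range n.Nodes {
		if now-node.LastSeen <= n.Timeout {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

// Heal removes stale nodes and tops each fragment back up to the desired
// replica count. It returns the number of new copies that were scheduled.
func (n *Network) Heal(now int64) int {
	healthy := n.healthy(now)
	alive := make(map[string]bool, len(healthy))
	for _, name := range healthy {
		alive[name] = true
	}
	for name := range n.Nodes {
		if !alive[name] {
			delete(n.Nodes, name)
		}
	}
	repaired := 0
	for _, holders := range n.Placement {
		for name := range holders {
			if !alive[name] {
				delete(holders, name)
			}
		}
		if len(holders) == 0 {
			continue // no surviving copy to replicate from
		}
		for _, name := range healthy {
			if len(holders) >= n.Replicas {
				break
			}
			if !holders[name] {
				holders[name] = true // a real system would copy the fragment here
				repaired++
			}
		}
	}
	return repaired
}

func main() {
	now := time.Now().Unix()
	nw := &Network{
		Nodes: map[string]*Node{
			"aws":    {Name: "aws", LastSeen: now},
			"azure":  {Name: "azure", LastSeen: now - 5},
			"laptop": {Name: "laptop", LastSeen: now - 600},
		},
		Placement: map[string]map[string]bool{
			"fragment-1": {"aws": true, "laptop": true},
		},
		Replicas: 2,
		Timeout:  60,
	}
	fmt.Println("copies scheduled:", nw.Heal(now))
	fmt.Println("placement:", nw.Placement)
}
```

Removing a node judged to be compromised could reuse the same path: drop it from `Nodes` and let the next healing pass restore the replica count elsewhere.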

Crucially, with this type of design we must remember that nodes are outside of our physical control. So, what would happen if one such node were compromised? Yes, the application has the ability to detect this and remove the node from the network, but what if a bad actor were in possession of that node? They would then have unlimited time to attempt to break the encryption in place.

With the prototype, this scenario was baked into the design from the outset. The way in which data is deconstructed, distributed and reconstructed means that no one node is capable of reconstructing any meaningful data, even if the encryption layer is broken. Defence against the potential collusion of multiple “rogue” nodes is also addressed. Further, sophisticated sampling techniques are employed to ensure regular verification of the data stored within the nodes.
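The sampling techniques used in the prototype are not specified, so the sketch below shows only one simple way such verification might work: the owner keeps a digest of every block it distributes, then periodically picks a random sample, fetches those blocks back and compares digests. `BuildManifest`, `PickSample`, `Audit` and the `fetch` callback are hypothetical names, not part of the prototype.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/rand"
)

// BuildManifest records a SHA-256 digest for every block before it is
// handed to a storage node. The manifest stays with the data owner.
func BuildManifest(blocks map[string][]byte) map[string][32]byte {
	m := make(map[string][32]byte, len(blocks))
	for id, b := range blocks {
		m[id] = sha256.Sum256(b)
	}
	return m
}

// PickSample returns up to k block IDs chosen at random from the manifest.
func PickSample(manifest map[string][32]byte, k int) []string {
	ids := make([]string, 0, len(manifest))
	for id := range manifest {
		ids = append(ids, id)
	}
	rand.Shuffle(len(ids), func(i, j int) { ids[i], ids[j] = ids[j], ids[i] })
	if k < 0 {
		k = 0
	}
	if k > len(ids) {
		k = len(ids)
	}
	return ids[:k]
}

// Audit fetches each sampled block from a node and returns the IDs of
// blocks that are unknown, missing, or no longer match their digest.
func Audit(manifest map[string][32]byte, sample []string, fetch func(id string) ([]byte, bool)) []string {
	var failed []string
	for _, id := range sample {
		want, known := manifest[id]
		data, ok := fetch(id)
		if !known || !ok || sha256.Sum256(data) != want {
			failed = append(failed, id)
		}
	}
	return failed
}

func main() {
	blocks := map[string][]byte{
		"b1": []byte("first block"),
		"b2": []byte("second block"),
		"b3": []byte("third block"),
	}
	manifest := BuildManifest(blocks)

	// Simulated node storage in which one block has been silently altered.
	node := map[string][]byte{
		"b1": []byte("first block"),
		"b2": []byte("sec0nd block"),
		"b3": []byte("third block"),
	}
	fetch := func(id string) ([]byte, bool) {
		data, ok := node[id]
		return data, ok
	}

	sample := PickSample(manifest, 2)
	fmt.Println("sampled:", sample, "failed:", Audit(manifest, sample, fetch))
}
```

Sampling trades certainty for cost: each audit touches only a few blocks, but repeated audits make it increasingly unlikely that a missing or altered block goes unnoticed.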

This combination of tools and techniques provides storage that is not just flexible, economical and resilient to technical failure, but also resilient to geographical outages or compromise and able to withstand state-actor-grade attacks. Additionally, it allows for data storage across multiple cloud service providers, none of which ever holds enough information to reconstruct the original data.
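To make concrete the property that no single holder has enough information to reconstruct the data, here is a minimal sketch of XOR secret splitting. This is a standard textbook technique substituted for illustration; it is not the Nationwide design, and the `Split` and `Combine` names are assumptions. With this scheme, every share is needed to recover the data, and any smaller set of shares is indistinguishable from random bytes.

```go
package main

import (
	"crypto/rand"
	"errors"
	"fmt"
)

// Split divides secret into n shares. The first n-1 shares are random and
// the last is the secret XORed with all of them, so all n are required.
func Split(secret []byte, n int) ([][]byte, error) {
	if n < 2 {
		return nil, errors.New("need at least two shares")
	}
	shares := make([][]byte, n)
	last := make([]byte, len(secret))
	copy(last, secret)
	for i := 0; i < n-1; i++ {
		shares[i] = make([]byte, len(secret))
		if _, err := rand.Read(shares[i]); err != nil {
			return nil, err
		}
		for j := range last {
			last[j] ^= shares[i][j]
		}
	}
	shares[n-1] = last
	return shares, nil
}

// Combine XORs all shares together to recover the original secret.
func Combine(shares [][]byte) ([]byte, error) {
	if len(shares) == 0 {
		return nil, errors.New("no shares supplied")
	}
	out := make([]byte, len(shares[0]))
	for _, s := range shares {
		if len(s) != len(out) {
			return nil, errors.New("share length mismatch")
		}
		for j := range out {
			out[j] ^= s[j]
		}
	}
	return out, nil
}

func main() {
	// One share per provider; the provider names are illustrative only.
	providers := []string{"aws", "azure", "gcp"}
	shares, err := Split([]byte("example payload"), len(providers))
	if err != nil {
		panic(err)
	}
	for i, p := range providers {
		fmt.Printf("%s holds %x\n", p, shares[i])
	}
	recovered, err := Combine(shares)
	if err != nil {
		panic(err)
	}
	fmt.Printf("recovered: %s\n", recovered)
}
```

A production design would also need to tolerate the loss of a provider, for example by replicating shares or by using a threshold scheme in which only some of the shares are required.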

Progress so far

The project is now in its second phase. The first phase successfully demonstrated that the core components could implement the design, distributing and reconstructing the encrypted blocks of data across multiple nodes made up of Windows 10 devices. Performance testing showed that overhead on the devices themselves was minimal, but the overall time taken to complete data distribution was disappointing and would not scale in its current form to a production-grade capability. Whilst a number of variables affected this performance, the primary contributing factor was found to be the underlying Blockchain framework. This open-source framework provided a good amount of the distribution “plumbing” and allowed for rapid progress and proving of core concepts, but it became obvious that performance at scale was not achievable.

Phase two is re-engineering the Blockchain elements and has shifted focus from community cloud assets (i.e., the Windows 10 devices) to storing data exclusively in Cloud platforms, specifically AWS and Azure. As a third provider would be utilized within a productionised version to protect against provider collusion, a Google Cloud Platform instance is mimicked as a separate instance on the AWS platform.

Go has been adopted as the primary development language with which an event-driven, microservice architecture is being implemented. MongoDB instances and Kafka queues are utilized for transient storage and processing of the source data being transformed, encrypted, and distributed. For the purposes of the proof-of-concept, processing for the data transformation and distribution is provisioned by EC2 instances within the AWS environment, with resulting data blocks being stored across resilient AWS S3 and Azure Blob instances.
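The shape of an event-driven pipeline like this can be sketched in plain Go, with channels standing in for Kafka topics and the MongoDB persistence left out entirely. The `Event` type, the `Stage`, `Run` and `RoundRobin` helpers and the target names are hypothetical, and the transformation shown is only a placeholder for the real transform and encrypt steps.

```go
package main

import (
	"fmt"
	"strings"
)

// Event is a hypothetical message passed between pipeline stages.
type Event struct {
	ID      string
	Payload string
	Target  string // storage destination chosen by the routing stage
}

// Stage consumes events from in, applies fn to each one and publishes the
// results on a new channel, much as a service reads one topic and writes another.
func Stage(in <-chan Event, fn func(Event) Event) <-chan Event {
	out := make(chan Event)
	go func() {
		defer close(out)
		for e := range in {
			out <- fn(e)
		}
	}()
	return out
}

// Run feeds inputs through the given stages in order and collects the output.
func Run(inputs []Event, stages ...func(Event) Event) []Event {
	src := make(chan Event)
	go func() {
		defer close(src)
		for _, e := range inputs {
			src <- e
		}
	}()
	var ch <-chan Event = src
	for _, fn := range stages {
		ch = Stage(ch, fn)
	}
	var out []Event
	for e := range ch {
		out = append(out, e)
	}
	return out
}

// RoundRobin returns a stage that assigns storage targets in turn.
func RoundRobin(targets []string) func(Event) Event {
	next := 0
	return func(e Event) Event {
		if len(targets) == 0 {
			return e
		}
		e.Target = targets[next%len(targets)]
		next++
		return e
	}
}

func main() {
	inputs := []Event{
		{ID: "1", Payload: "alpha"},
		{ID: "2", Payload: "bravo"},
		{ID: "3", Payload: "charlie"},
	}
	transform := func(e Event) Event {
		e.Payload = strings.ToUpper(e.Payload) // placeholder for the real transformation
		return e
	}
	for _, e := range Run(inputs, transform, RoundRobin([]string{"aws-s3", "azure-blob"})) {
		fmt.Printf("%s -> %s (%s)\n", e.ID, e.Target, e.Payload)
	}
}
```

Keeping each stage behind its own channel mirrors the appeal of the microservice approach: a stage can be replaced or scaled without the others needing to change.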

Phase two distribution and retrieval is currently scheduled to complete around early December this year.

What does the future hold?

2020 was the busiest year on record for cyber-attacks against UK firms, with 2021 looking likely to be busier still. Hacking attempts have surged by twenty percent, with bad actors taking advantage of factors such as COVID-19 and remote working. It is more important now than ever to ensure constant improvement and evolution of the tools and techniques that protect our society’s data.

The work the Proof-of-Concept team is doing is just one of many initiatives, both ongoing and planned, that will keep Nationwide’s members safe from an ever-increasing threat of cybercrime: a threat that, sadly, we can say with some conviction won’t be going away anytime soon.
