Architecting a Decentralized GitHub Backup

Storj Labs
Oct 18 · 2 min read

GitBackup is a tool that backs up and archives GitHub repositories. It is in the process of backing up the entirety of public GitHub, an estimated 1–2 PB of data, onto the Storj network. As of today, October 18, 2019, the tool has snapshotted 815,200 repositories across more than 150,000 users.

GitHub is the largest store of open source code in the world, with 20 million users and more than 28 million public repositories as of April 2017.

We believe that this reservoir of free and open source code acts as a digital version of a public good, similar to a developers’ library — a library that empowers software engineers to access the collective knowledge around open source code, development patterns, and free software.

While GitHub is a wonderful service, it’s owned by an agenda-driven global corporation and is a single point of failure, which leaves it prone to downtime, blocking, and censorship. For example, Microsoft’s acquisition of LinkedIn shows how user content can be gradually taken away (by means of paywalls and login walls).

Furthermore, on July 25, 2019, a developer based in Iran wrote on Medium about how GitHub blocked his private repositories and prohibited access to GitHub Pages. Soon after, GitHub confirmed that it was blocking developers in Iran, Crimea, Cuba, North Korea, and Syria from accessing private repositories.

If we want to guarantee the preservation of the work of hundreds of thousands of open source developers, we need to act now!

Let’s download it all!

We’re currently using gharchive.org to get a list of GitHub usernames that have had a public action since 2015. So far, the 815,200 repositories we’ve backed up constitute about 80 TB of data. We anticipate that the entirety of public GitHub repos is about 1–2 PB, so we still have a way to go.
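
As a rough illustration of the username-collection step, here is a minimal Python sketch. It is not GitBackup’s actual code. It assumes that GH Archive publishes gzip-compressed files with one JSON event per line, and that each event carries the acting user under `actor.login`. The function names `extract_usernames` and `usernames_from_archive` are hypothetical, chosen for this example.

```python
import gzip
import json
from typing import Iterable, Set


def extract_usernames(lines: Iterable[str]) -> Set[str]:
    """Collect the distinct actor logins from event lines.

    Assumption: each line is one JSON event shaped like
    {"type": "...", "actor": {"login": "..."}, ...}.
    """
    usernames: Set[str] = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than abort the scan
        if not isinstance(event, dict):
            continue
        actor = event.get("actor")
        # Ignore events whose actor is missing or is not an object.
        login = actor.get("login") if isinstance(actor, dict) else None
        if login:
            usernames.add(login)
    return usernames


def usernames_from_archive(path: str) -> Set[str]:
    """Read one locally downloaded, gzip-compressed archive file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return extract_usernames(handle)
```

Running `usernames_from_archive` over every archive file since 2015 and taking the union of the results would give the list of accounts whose public repositories are candidates for backup. A set is used because the same user typically appears in many events.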

If you want to back up your own repositories (or all of GitHub) to the decentralized cloud, check out the tool here:

http://gitbackup.org/


GitBackup was built by Shawn Wilkinson in collaboration with a number of Storj Labs engineers and community members. The tool was demonstrated on October 11 at Devcon V (Osaka, Japan).

By Kevin Leffew on Community

Originally published at https://storj.io on October 18, 2019

Storj (pronounced storage) is an open source decentralized cloud storage platform. Learn more at https://storj.io.
