In 2013, I discovered Camlistore and began feverishly hacking on it. I really loved the vision: a personal, automatically-organized archive of all your data, forever. But as I was working on Camlistore, I kept thinking that the project really needed a different kind of database — one with versioning, deduplication, and synchronization built right into the core.
I started tinkering with these ideas in my spare time and was quickly consumed by them. I had been really inspired by the elegance and power of Git for years, and also by the functional programming style of tools like ReactJS. It seemed like a database that combined the decentralization, versioning, and synchronization of Git with strong typing and the feel of functional programming would make it much easier to store, move, and track all kinds of changing data.
One day after work, I shared what I was up to with my long-time friend and co-conspirator, Rafael Weinstein. Rafael had a lot of experience with synchronization from a previous life, and like me was deeply inspired by Git’s solution to this historically thorny corner of software design.
Soon, we had formed Attic Labs and pulled together an amazing founding team of close friends we’d known from previous adventures.
But as we got to work, we realized that the essence of what we were building was even simpler and broader than we’d first thought.
Our world is simply saturated with data. Data connects every person, organization, application, and service. Yet even given the gigantic amount of effort that has been expended over decades to improve the way data is stored and queried, we are still banging rocks together when it comes to moving, sharing, and collaborating on data.
One of the most common ways to share data today is to post CSV files on a website. But because a CSV file has no history, it’s impossible to know how that dataset came to be. Who created it? Has it been modified? By whom? Why?
Because CSV is untyped, it is famously dirty and difficult to consume. Because CSV is not itself a database, you can’t query just a subset of it — you have to pull it into something else first. Finally, you can’t fork a gig of CSV files and make a pull request, or `pull` just the changes since last week.
Git took over the software world virtually overnight because its decentralized nature enabled source code to move fluidly between computers, organizations, and people, and because this in turn directly enabled much richer collaboration.
We think that the world needs a way to fluidly share and collaborate on data. We think that a content-addressed, decentralized, synchronizing database is the natural, inevitable way to do this.
Today, we are presenting Noms: a new database that makes it easy to store, move, and collaborate on large-scale structured data.