Archiving your Google Sheets Data with Qri

Chris Whong
qri.io
Published in
3 min readOct 29, 2020

Over the past 15 years, Google Sheets have become a popular way to manage human-scale data for individuals and teams. While spreadsheets aren’t always the best approach to enforcing data quality standards, collaborative editing in the cloud beats emailing files back and forth, and teams everywhere are firing up google sheets and setting up data tables that can be easily exported to CSV.

Qri (pronounced “query”) is a version control system for datasets. Our software is designed to make datasets more robust, documented, portable, and trustworthy, in ways that file-based data and traditional databases cannot. When you create a Qri Dataset, each version is signed, hashed, timestamped, and immutable… a snapshot of the data that carries crucial metadata with it and is easy to move around.

So Google Sheets is great for editing data, and Qri is great for archiving and sharing data… I think maybe they should talk!

The Setup

I found a Mashed article about the best pizza place in each U.S. state and figured it would make a wonderful 50-row dataset in Qri, but I wanted the convenience of Google Sheets to manually modify the data. I copied the state and pizza place name from the text on the website and created a simple table with two columns: state and pizza_joint . I then added a website column that would need to be manually filled in. Here’s the sheet. If you want to help fill in the data, request editor access!

For demonstration purposes, I’ve written a node.js script that downloads the sheet as a CSV, then uses that CSV to run qri save on the Qri dataset chriswhong/best-pizza-place-in-every-state. Finally, the Qri dataset is published to qri.cloud where anyone can see the version history and pull/download the latest version of the data!

A simple node script pulls a CSV from google sheets, then commits a new version of the Qri Dataset

The dataset now has a home on the web at https://qri.cloud/chriswhong/best-pizza-place-in-every-state. Qri.cloud is not just hosting my data, it’s making it discoverable in search engines and providing access to the readme, metadata, and schema information that lives along with my table.

qri.cloud gives each dataset a landing page, history, integration shortcuts, issue queues, etc. It’s like your own personal data publishing portal!

This dataset, and all of the different incremental versions are now available for pulling and downloading on qri.cloud. You can even follow the dataset to get an email notification each time I update it. If the dataset becomes more popular, I’ll have an incentive to make sure it’s more complete, and maybe add columns that will be helpful to other users.

The full history of each dataset can be inspected so users can see what changed and when. Each version is immutable, signed, and timestamped.

Towards a Google Sheets Connector

This example is a proof of concept for moving data between Google Sheets and Qri, and lays some of the groundwork for a proper connector between the two platforms. What could this look like? One idea is to use Qri Desktop as the jumping off point, allowing the user to quickly create a new dataset linked to a Google Sheet, and commit new versions with a click.

Another idea is to have the integration UI live in the Google Sheets UI (or as a browser plugin) which watches the sheet for changes and allows for a quick commit of a new Qri version.

What would you want in an Google Sheets/ Qri integration? Give us a yell on twitter or drop into our Discord and let us know what features you’d find helpful. Thanks for reading!

--

--

Chris Whong
qri.io
Editor for

Urbanist, Technologist, Mapmaker. Developer Relations @Mapbox