Automate ArchiveBox with Google Spreadsheet to Back Up Your Internet

Manfred Touron
Published in Passion & Madness
Mar 10, 2019

Introduction

I just discovered ArchiveBox on my GitHub feed.

ArchiveBox lets you store snapshots of webpages as they were at a specific point in time.

It is still new to me, but from what I see, my workflow will be something like this:

  • to store copies of interesting webpages that I may want to read again later, i.e., my bookmarks, and then use these archives:
      • as a backup link when the original page is outdated
      • as a way of comparing how a webpage has changed over time (diff)
      • to list my interesting links
  • to periodically monitor changes to webpages I want to follow over time, such as my public social profiles or this website

To make it easier for me to maintain, I want to update a Google Spreadsheet and never touch a shell anymore.

My setup

First, write some links in a Google Spreadsheet document.

Then, publish the document in CSV format.

And finally, create a script that will fetch the links in CSV and run the archiver against those URLs.
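To make the fetch-and-archive step concrete, here is a minimal sketch in Python. It is not the script I actually run, only an illustration under a few assumptions: the published CSV contains the links somewhere in its cells, the function names (extract_urls, fetch_csv, archive) are made up for this example, and the archiver accepts URLs on standard input through a command like archivebox add (the ArchiveBox CLI has changed between versions, so adjust the command to your install).

```python
import csv
import io
import subprocess
import urllib.request


def extract_urls(csv_text):
    """Return the unique http(s) URLs found in any cell, in first-seen order."""
    seen = set()
    urls = []
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            value = cell.strip()
            # Skip headers, notes, and links that were already collected.
            if value.startswith(("http://", "https://")) and value not in seen:
                seen.add(value)
                urls.append(value)
    return urls


def fetch_csv(csv_url):
    """Download the published spreadsheet and return its content as text."""
    with urllib.request.urlopen(csv_url) as response:
        return response.read().decode("utf-8")


def archive(urls, command=("archivebox", "add")):
    """Pipe the URLs, one per line, to the archiver command on stdin.

    The default command is an assumption; replace it with whatever
    invocation your ArchiveBox version expects.
    """
    if not urls:
        return
    subprocess.run(list(command), input="\n".join(urls) + "\n", text=True, check=True)
```

With the link of the published CSV at hand, one run would be archive(extract_urls(fetch_csv(csv_url))). Filtering on the http prefix means the sheet can keep a header row or a notes column without confusing the archiver.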

My custom Makefile and docker-compose.yml files:

Run make loop inside tmux, or with any other method of keeping a process running in the background.
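As a rough idea of what such a loop boils down to, here is a small Python sketch. It does not reproduce the Makefile target; run_loop and its parameters are hypothetical names chosen for illustration, and the interval is whatever suits you.

```python
import time


def run_loop(job, interval_seconds, max_runs=None, sleep=time.sleep):
    """Call job() repeatedly, pausing between runs; return the number of runs.

    max_runs=None loops forever, which is the mode meant for tmux.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            job()
        except Exception as error:
            # One failed run (network hiccup, bad link) should not stop the loop.
            print(f"run failed: {error}")
        runs += 1
        # Only pause when another run is going to follow.
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs
```

Passing a function that fetches the CSV and calls the archiver as job, with an interval of a few minutes, gives the same "add a link and forget about it" behavior.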

Result

I now only need to add new links to a Google Spreadsheet and let my script do the rest.

Originally published at manfred.life.
