Automate ArchiveBox with a Google Spreadsheet to Back Up Your Internet
Introduction
I just discovered ArchiveBox on my GitHub feed.
ArchiveBox allows you to store snapshots of webpages as they were at a specific time.
It is still new to me, but from what I see, my workflow will be something like this:
- to store copies of interesting webpages that I may want to read again later, i.e., my bookmarks, and then use these archives:
  - as a backup link when the main page is outdated
  - as a way to compare how a webpage has changed over time (diff)
  - as a list of my interesting links
- to periodically monitor changes to webpages I want to follow over time, e.g., my public social profiles or this website
To keep this easy to maintain, I want to only update a Google Spreadsheet and never touch a shell again.
My setup
First, write some links in a Google Spreadsheet document.
Then, publish the document in CSV format.
And finally, create a script that fetches the links as CSV and runs the archiver against those URLs.
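As a rough illustration of these three steps, here is a minimal Python sketch (not the Makefile itself). It assumes the spreadsheet is published as a CSV link, that any cell starting with http:// or https:// is a link to archive, and that ArchiveBox is reachable as `archivebox add` and accepts URLs on standard input. The script name `archive_links.py`, the function names, and the command are illustrative assumptions; adjust them to your own installation.

```python
import csv
import io
import subprocess
import sys
import urllib.request

# Assumption: ArchiveBox is installed so that this command accepts URLs on
# standard input. With docker-compose the command would differ.
ARCHIVE_COMMAND = ["archivebox", "add"]


def extract_urls(csv_text):
    """Return the unique http(s) URLs found in any cell, in first-seen order."""
    seen = set()
    urls = []
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            value = cell.strip()
            if value.startswith(("http://", "https://")) and value not in seen:
                seen.add(value)
                urls.append(value)
    return urls


def fetch_csv(url):
    """Download the published spreadsheet and decode it as UTF-8 text."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def archive(urls, command=ARCHIVE_COMMAND):
    """Feed the URLs, one per line, to the archiver command on stdin."""
    if not urls:
        return
    subprocess.run(command, input="\n".join(urls) + "\n", text=True, check=True)


if __name__ == "__main__":
    if len(sys.argv) == 2:
        # The only argument is the CSV link of the published spreadsheet.
        archive(extract_urls(fetch_csv(sys.argv[1])))
    else:
        print("usage: python archive_links.py <published-csv-url>")
```

Scanning every cell rather than a fixed column means the spreadsheet can keep extra columns (titles, notes) next to the links without breaking the script.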
I use a custom Makefile and a docker-compose.yml file.
Run make loop in tmux or with another process-backgrounding method.
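The idea behind a loop target is presumably to repeat the fetch-and-archive step at a fixed interval. A minimal Python sketch of such a loop, assuming a failed run should be reported and skipped rather than stop everything, could look like this. The function name `run_periodically` and its parameters are hypothetical.

```python
import time


def run_periodically(task, interval_seconds, max_runs=None, sleep=time.sleep):
    """Call task() repeatedly, pausing interval_seconds between runs.

    A failing run is reported and skipped so the loop keeps going.
    max_runs=None loops forever; a number stops after that many runs.
    Returns the number of runs that were attempted.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        try:
            task()
        except Exception as error:  # keep looping after a failed run
            print(f"run failed: {error}")
        runs += 1
        # Only pause when another run is going to follow.
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs
```

For example, `run_periodically(sync, 3600)` would call a hypothetical `sync` function once an hour until the process is stopped.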
Result
I now only need to add new links to a Google Spreadsheet and let my script do the rest.
Originally published at manfred.life.