Fast Node Sync with Tezos Tarballs

James Orcutt
Published in The Aleph
Feb 15, 2022
University of Illinois Library, CC BY 2.0 https://creativecommons.org/licenses/by/2.0, via Wikimedia Commons

Full and rolling Tezos snapshots are great, but it can still take hours to get a Tezos node synced and ready to use after importing one.

What if we strategically copied a filesystem off of a synced node, validated it, and packaged it?

We call this a Tezos tarball.

TLDR

Get Tezos tarballs at xtz-shots.io.

Why should I use a tarball?

Syncing takes time! With traditional snapshots, every block is validated one by one. Depending on how old a Tezos snapshot (.rolling or .full) is, syncing could take hours or even weeks.

A 12-hour-old Tezos tarball from a Tezos archive node will give you a fully synced archive node in 40 minutes.

In comparison, a 12-hour-old Tezos full snapshot could take hours to get your Tezos node running, even with an optimal hardware configuration, and perhaps days or even weeks with less capable hardware.

A 20-minute-old Tezos tarball from a Tezos node in rolling history mode will have you fully synced in 9 minutes. Even with some of the most recent rolling Tezos snapshots on the internet, this process could take hours.

As soon as one Tezos tarball is created, we start building the next one. This provides the highest frequency of tarballs and therefore the shortest sync times possible.

An interesting benefit to using a tarball.

The Octez node in rolling storage is known for not garbage-collecting its state. This is being worked on. But if you run Tezos nodes in rolling storage today, their performance will degrade over time due to accumulation of state.

At Oxhead Alpha we run our RPC nodes in tezos-k8s. By restoring from a rolling tarball each time a pod restarts, we avoid the performance degradation associated with old garbage data building up.

OK, but what really is a Tezos Tarball?

A tarball is an old term from the tape backup days of the Unix world. It’s simply a single file that contains the file structure and contents of a directory. A tarball can be expanded to reproduce the original directory structure as it was captured.

This is how we make a Tezos tarball:

  • We take an EBS volume snapshot of a running Tezos node and restore this snapshot to a new filesystem for artifact creation.
  • On this newly restored filesystem, we capture the node directory, excluding sensitive files like peers.json and identity.json.
  • We then tar this directory and compress it with LZ4 at the lowest compression level. This saves a great deal of space and adds only a trivial amount of time when expanding, compared with expanding an uncompressed tarball. This is the final step for an archive tarball.
$ lz4 -d tezos-ithacanet-archive-tarball-88232.lz4 | tar -x
$ tree node
node
└── data
    ├── context
    │   ├── index
    │   │   ├── data
    │   │   ├── lock
    │   │   └── log
    │   ├── store.branches
    │   ├── store.dict
    │   └── store.pack
    ├── store
    │   ├── chain_NetXnHfVqm9ie
    │   │   ├── alternate_heads
    │   │   ├── caboose
    │   │   ├── cemented
    │   │   │   ├── 0_1
    │   │   │   ├── 12290_16385
    │   │   │   ├── 16386_20481
    │   │   │   ├── ...
    │   │   │   ├── hash_index
    │   │   │   │   └── index
    │   │   │   │       ├── data
    │   │   │   │       ├── lock
    │   │   │   │       └── log
    │   │   │   ├── level_index
    │   │   │   │   └── index
    │   │   │   │       ├── data
    │   │   │   │       ├── lock
    │   │   │   │       └── log
    │   │   │   └── metadata
    │   │   │       ├── 0_1.zip
    │   │   │       ├── 12290_16385.zip
    │   │   │       ├── 16386_20481.zip
    │   │   │       ...
    │   │   ├── cementing_highwatermark
    │   │   ├── checkpoint
    │   │   ├── config.json
    │   │   ├── current_head
    │   │   ├── forked_chains
    │   │   ├── genesis
    │   │   ├── invalid_blocks
    │   │   ├── lock
    │   │   ├── protocol_levels
    │   │   ├── ro_floating
    │   │   │   ├── blocks
    │   │   │   └── index
    │   │   │       └── index
    │   │   │           ├── data
    │   │   │           ├── lock
    │   │   │           └── log
    │   │   ├── rw_floating
    │   │   │   ├── blocks
    │   │   │   └── index
    │   │   │       └── index
    │   │   │           ├── lock
    │   │   │           └── log
    │   │   ├── savepoint
    │   │   ├── status
    │   │   └── target
    │   └── protocols
    │       ├── Ps9mPmXaRzmzk35gbAYNCAw6UXdE2qoABTHbN2oEEc1qM7CwT9P
    │       ├── PsBABY5HQTSkA4297zNHfsZNKtxULfL18y95qb3m53QJiXGmrbU
    │       ...
    └── version.json
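
The packaging steps above can be sketched roughly as follows. This is a minimal illustration, not our actual pipeline: make_tarball is a hypothetical helper, it assumes bash, tar, and lz4 are installed, it skips the EBS snapshot and restore step entirely, and it excludes only the two sensitive files named above.

```shell
#!/usr/bin/env bash
# Sketch only: make_tarball is a hypothetical helper, not the xtz-shots.io pipeline.

# Usage: make_tarball <parent-dir-containing-node> <output-file.lz4>
make_tarball() (
  set -euo pipefail
  parent_dir="$1"
  out_file="$2"

  # Archive the "node" directory, skipping the sensitive files named in the
  # article. Unanchored --exclude patterns match at any depth in the tree.
  tar -C "$parent_dir" \
      --exclude='identity.json' \
      --exclude='peers.json' \
      -cf - node |
    lz4 -1 -c > "$out_file"   # -1 is LZ4's fastest (lowest) compression level
)
```

For example, `make_tarball /mnt/restored my-archive-tarball.lz4` would package /mnt/restored/node (both names are placeholders). Streaming tar straight into lz4 avoids writing an uncompressed intermediate file to disk.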

Rolling tarballs are created in a similar way.

  • We export a traditional Tezos snapshot (.rolling file) from a Tezos node with rolling history mode.
  • Then we restore this Tezos rolling snapshot to a new volume, capture the node folder, tar it, and compress the tarball with LZ4, just as we do for an archive tarball. We provide both the .rolling snapshot used and the rolling tarball.
$ lz4 -d tezos-ithacanet-rolling-tarball-88348.lz4 | tar -x
$ tree node
node
└── data
    ├── context
    │   ├── index
    │   │   ├── lock
    │   │   └── log
    │   ├── store.branches
    │   ├── store.dict
    │   └── store.pack
    ├── store
    │   ├── chain_NetXnHfVqm9ie
    │   │   ├── alternate_heads
    │   │   ├── caboose
    │   │   ├── cemented
    │   │   │   ├── hash_index
    │   │   │   │   └── index
    │   │   │   │       ├── lock
    │   │   │   │       └── log
    │   │   │   └── level_index
    │   │   │       └── index
    │   │   │           ├── lock
    │   │   │           └── log
    │   │   ├── cementing_highwatermark
    │   │   ├── checkpoint
    │   │   ├── config.json
    │   │   ├── current_head
    │   │   ├── forked_chains
    │   │   ├── genesis
    │   │   ├── invalid_blocks
    │   │   ├── protocol_levels
    │   │   ├── ro_floating
    │   │   │   ├── blocks
    │   │   │   └── index
    │   │   │       └── index
    │   │   │           ├── lock
    │   │   │           └── log
    │   │   ├── rw_floating
    │   │   │   ├── blocks
    │   │   │   └── index
    │   │   │       └── index
    │   │   │           ├── lock
    │   │   │           └── log
    │   │   ├── savepoint
    │   │   ├── status
    │   │   └── target
    │   └── protocols
    │       └── Psithaca2MLRFYargivpo7YvUr7wUDqyxrdhC5CQq78mRvimz6A
    └── version.json

We then provide these artifacts at xtz-shots.io for public use.

We’re able to provide mainnet archive artifacts every 12 hours, mainnet rolling artifacts every 3 hours, and hangzhounet and ithacanet artifacts every 20 minutes.

Note: These intervals will lengthen as the chains grow! They may already be longer than they were at the time of writing due to the increase in block data.

For those of you that aren’t ready for Tezos Tarballs we still provide .rolling Tezos snapshots of our nodes.

Archive snapshots do not exist. If you want a mainnet archive node you can either sync from scratch, which will take weeks, or use our archive tarballs.

For consistency checks we provide metadata for each artifact produced. An example can be seen here for the latest mainnet tarball at the time of writing.

{
  "block_hash": "BMLDR6seD9RQpDfji5WC4ixHRmECXoUZVDq9i9PqwbURJYZd6cD",
  "block_height": "2106524",
  "block_timestamp": "2022-02-10T19:31:24Z",
  "archive_tarball_filename": "tezos-mainnet-archive-tarball-2106524.lz4",
  "filesize": "282G ",
  "sha256": "9f9f23f567f02912fbbdabb7d1d4b01b5e53d637e516335a51a034e2e816260e",
  "tezos_version": "0e7a0e9a (2021-11-15 11:05:56 +0100) (11.0)",
  "chain_name": "mainnet",
  "history_mode": "archive",
  "artifact_type": "tarball"
}

We hope this is also a start to the standardization of Tezos artifact metadata.
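
As an illustration of how this metadata could be used for a consistency check, here is a minimal sketch that compares a downloaded artifact against the sha256 field. Both function names are hypothetical, and the sketch assumes that the sha256 value is the checksum of the compressed .lz4 file, that the artifact and its metadata have already been saved locally, and that the metadata has one field per line as in the example above.

```shell
#!/usr/bin/env bash
# Sketch only: both helpers are hypothetical and not part of any Tezos tooling.

# Print the "sha256" value from a metadata JSON file.
# Uses sed so no JSON tooling is needed; assumes one field per line.
metadata_sha256() {
  sed -n 's/.*"sha256"[[:space:]]*:[[:space:]]*"\([0-9a-f]*\)".*/\1/p' "$1"
}

# Usage: verify_artifact <artifact-file> <metadata-json-file>
# Returns 0 when the checksums match, 1 otherwise.
verify_artifact() {
  local artifact="$1" metadata="$2"
  local expected actual
  expected="$(metadata_sha256 "$metadata")"
  actual="$(sha256sum "$artifact" | cut -d' ' -f1)"
  if [ -n "$expected" ] && [ "$expected" = "$actual" ]; then
    echo "OK: checksum matches"
  else
    echo "MISMATCH: expected '$expected', got '$actual'" >&2
    return 1
  fi
}
```

Note that verifying a checksum requires saving the artifact to disk first, so it needs more free space than streaming the download directly into tar.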

Usage

We provide the following convenient permalinks for the latest version of each type of artifact. For example, here are the links to the latest mainnet artifacts. There are also links for ithacanet and hangzhounet, and we will provide tarballs for new chains going forward.

We also provide permalinks for the metadata if you just want to know information about our latest artifacts. For example here are the links for our latest mainnet artifact’s metadata.

To download and expand a tarball, you’ll need to make sure you’re on a system that can handle tarball expansion (or install tar) and install LZ4.

Here is an example of how to download and expand the latest mainnet archive tarball to /var/tezos. You can change this to whatever directory you want the node folder to live in.

curl -LfsS "https://mainnet.xtz-shots.io/archive-tarball" | lz4 -d | tar -x -C "/var/tezos"
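
For scripted use, the same one-liner can be wrapped with some basic error handling. The sketch below is one possible approach, assuming bash; restore_tarball is a hypothetical name. It creates the target directory if needed and enables pipefail so that a failed download is not silently ignored.

```shell
#!/usr/bin/env bash
# Sketch only: restore_tarball is a hypothetical wrapper around the one-liner above.

# Usage: restore_tarball <tarball-url> <target-dir>
restore_tarball() (
  set -euo pipefail          # pipefail: a failed download fails the whole pipeline
  url="$1"
  target_dir="$2"

  mkdir -p "$target_dir"     # tar -C requires the directory to exist
  curl -LfsS "$url" | lz4 -d | tar -x -C "$target_dir"
)
```

For example: `restore_tarball "https://mainnet.xtz-shots.io/archive-tarball" /var/tezos`. If the function returns a non-zero status, the target directory may contain a partial extraction and should be cleaned up before retrying.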

I’m sold! Where can I get them?

Check out xtz-shots.io for all of your Tezos tarball and snapshot needs or talk to us in the Tezos Dev Slack!

Disclaimer about trust.

By using our artifacts you’re trusting us as the source of your Tezos data.

Tezos snapshots are structured packages of Tezos data. When importing a Tezos snapshot, Octez rebuilds the context and verifies its soundness block by block. Tarballs in contrast are a raw representation of the node storage.

Restoring a node from a tarball is much faster than importing a Tezos snapshot, but this comes at the expense of safety. You must trust us as a reliable tarball source. Please consider your risk profile when opting for one or the other.
