The TezEdge Explorer — Visualizing the flow of data in a Tezos node

Juraj Selep
Published in TezEdge
4 min read · Jun 11, 2020

All software can run into problems. This is particularly true for complex systems that provide an active service. In order to quickly troubleshoot issues and maintain the service, we want to know whether the software is doing something it shouldn’t be doing and how serious the problem is. We also want to know when the issue happened as well as its location within the code.

When a Tezos node is up and running, there is a large amount of data flowing between the node and the rest of the network. Although we can record the data, browsing through it is difficult due to its sheer volume.

We want to be able to quickly find certain events or items within the data. For example, when the node exchanges messages with other peers via the P2P network, we want to know which peers sent a connection message to our node, as well as the details contained in the message’s metadata.

Utilizing filters, we can examine a particular section of the data from various angles in order to accelerate the debugging process. However, using a conventional database system would significantly hinder the node’s performance, so we must consider faster, albeit simpler, storage solutions.

In order to store such a large amount of data, we chose RocksDB, a high-performance key-value store. By key-value, we mean that there is no high-level structure to the data (such as tables in relational databases or structures in NoSQL databases). A key-value store simply associates a byte array representing the key with another byte array representing the value.

How we utilize indexes for filtering

To create a high-performing filter, we need to build our own indexes. Conveniently for us, RocksDB keeps its data sorted by key, which provides the functionality necessary for fast data filtering.

RocksDB contains one more powerful mechanism called column families, which allows us to group data of the same type into their own “named columns”. This way, we can separate data indexes, allowing us to efficiently look up data with specific properties from the database itself.

At its core, the index is a very simple idea: combine part of the data with its associated key to carefully create an “index key” that is inserted and sorted into its own column family. For the actual data, we use sequentially generated IDs, meaning that each message is assigned a number representing the order in which it was received.
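As a concrete illustration, such an index key could be built like this (a minimal Rust sketch of the idea, not the actual TezEdge schema; the one-byte `tag` and the exact key layout are our assumptions):

```rust
// Hypothetical index key: a one-byte tag for the indexed property
// (e.g. the message type), followed by the message's sequential ID
// in big-endian order, so byte-wise key sorting matches ID order.
fn index_key(tag: u8, id: u64) -> Vec<u8> {
    let mut key = Vec::with_capacity(9);
    key.push(tag);                            // property prefix
    key.extend_from_slice(&id.to_be_bytes()); // big-endian preserves order
    key
}

fn main() {
    // Keys with the same tag sort by ID; different tags group separately.
    assert!(index_key(1, 2) < index_key(1, 3));
    assert!(index_key(1, u64::MAX) < index_key(2, 0));
    println!("{:?}", index_key(1, 2));
}
```

Big-endian encoding matters here: RocksDB compares keys byte by byte, and only the big-endian form of an integer sorts the same way as its numeric value.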

By combining, we mean prefixing the ID of the data with the binary representation of the property we want to sort by, such as a particular type of P2P message. Thanks to RocksDB’s sorting, all P2P messages of the same type are grouped together (and correctly ordered). To retrieve messages of a specific type, we utilize the “prefix iterator”, a special type of iterator that returns only the values whose keys start with a specified prefix.
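To show the idea without a real RocksDB instance, a sorted map can stand in for a column family. In this sketch, Rust’s `BTreeMap` (which, like RocksDB, keeps keys in sorted byte order) plays the role of the store, and `prefix_iter` is our illustrative stand-in for a prefix iterator:

```rust
use std::collections::BTreeMap;

// Returns the IDs of all entries whose index key starts with `prefix`.
// Seek to the first key >= prefix, then stop once the prefix no longer matches.
fn prefix_iter<'a>(
    db: &'a BTreeMap<Vec<u8>, u64>,
    prefix: &'a [u8],
) -> impl Iterator<Item = u64> + 'a {
    db.range(prefix.to_vec()..)
        .take_while(move |(k, _)| k.starts_with(prefix))
        .map(|(_, id)| *id)
}

fn main() {
    let mut db = BTreeMap::new();
    // Index keys: [message-type tag] ++ [big-endian ID]; value is the ID.
    for (tag, id) in [(1u8, 3u64), (1, 7), (2, 5)] {
        let mut key = vec![tag];
        key.extend_from_slice(&id.to_be_bytes());
        db.insert(key, id);
    }
    // Only messages with tag 1 are returned, already in ID order.
    let ids: Vec<u64> = prefix_iter(&db, &[1]).collect();
    println!("{:?}", ids);
}
```

Because the keys are sorted, the iterator never scans unrelated entries: it seeks directly to the first matching key and stops at the first non-matching one.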

More advanced filters and queries require an algorithmic approach, for example finding messages that are either connection messages (used to establish an encrypted connection between nodes) or metadata messages (containing details about nodes after the encrypted connection is established). Fortunately, in such a situation we can simply use the merge algorithm (the one used in merge sort) to join multiple prefix iterators into a single, correctly sorted stream.
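The merge step can be sketched as follows, assuming two already-sorted ID lists such as two prefix iterators would produce (`merge_sorted` is an illustrative name, not a TezEdge API):

```rust
// Merge two sorted ID lists into one sorted list, as in merge sort:
// repeatedly take the smaller head element, then append the leftover tail.
fn merge_sorted(a: &[u64], b: &[u64]) -> Vec<u64> {
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::with_capacity(a.len() + b.len());
    while i < a.len() && j < b.len() {
        if a[i] <= b[j] {
            out.push(a[i]);
            i += 1;
        } else {
            out.push(b[j]);
            j += 1;
        }
    }
    out.extend_from_slice(&a[i..]); // one side is exhausted;
    out.extend_from_slice(&b[j..]); // append whatever remains of the other
    out
}

fn main() {
    // IDs from two hypothetical prefix iterators (e.g. connection
    // messages and metadata messages), each already sorted.
    let union = merge_sorted(&[2, 5, 9], &[3, 8]);
    println!("{:?}", union);
}
```

Since each message has exactly one type, the two input streams never share an ID, so the merged result is simply the sorted union of both filters.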

To search by multiple properties, such as finding messages that are connection messages AND are incoming, consider how the data is structured in the database. All keys are unique and sorted, and we constructed the indexes so that keys sharing a property are grouped together. This allows us to treat prefix iterators as sorted sets of IDs and perform complex filters by computing a set intersection (finding the IDs that are present in both iterators).
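Such an intersection over two sorted ID lists can be sketched like this (again an illustration of the technique, not the actual TezEdge code):

```rust
// Intersect two sorted ID lists, e.g. "connection message" AND "incoming":
// advance whichever side is behind, and emit an ID only on a match.
fn intersect_sorted(a: &[u64], b: &[u64]) -> Vec<u64> {
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::new();
    while i < a.len() && j < b.len() {
        if a[i] == b[j] {
            out.push(a[i]); // ID present in both indexes
            i += 1;
            j += 1;
        } else if a[i] < b[j] {
            i += 1;
        } else {
            j += 1;
        }
    }
    out
}

fn main() {
    // IDs from two hypothetical indexes: by message type and by direction.
    let both = intersect_sorted(&[1, 4, 6, 9], &[2, 4, 9, 12]);
    println!("{:?}", both);
}
```

Like the merge, this runs in a single linear pass over both iterators, which is what makes combining several indexes cheap.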

After we find the desired IDs in our indexes, all that remains is to load the complete data from the primary column families and present it to the user.

You can browse through the P2P network data using a variety of filters.

Visualizing log files from the OCaml and Rust Tezos nodes

The logs are records of the node’s actions. Some of them are error logs, which are of particular interest to developers as they directly affect the node’s operation. They inform us about the severity of the problem, its approximate location within the code, and the time when it happened.

When you have a Tezos node (either OCaml- or Rust-based) up and running, its logs are written to the terminal that runs the node. In case of an issue, we want to be able to quickly browse through the logs and find the root cause. The TezEdge debugger processes the logs into a more manageable format, stores them in our database and builds indexes on top of them.

The TezEdge Explorer allows us to easily switch and compare data between multiple Tezos nodes.

In the lower left corner of the interface, you can select from a drop-down menu of servers (one Rust node and two OCaml nodes) and examine the logs, network and RPCs of each node.

To try out the aforementioned features, visit the TezEdge.com website.

We thank you for your time and hope that you have enjoyed reading this article. To read more about Tezos and the TezEdge node, please visit our documentation, subscribe to our Medium, follow us on Twitter or visit our GitHub.
