Running a full node with Geth (BSC)

Pieter Callewaert
Sep 23, 2021

Update January 2022

My project ended some time ago, and the server went with it, so this information may already be outdated. In its last weeks I noticed that the disks were getting too small. I think you need at least 1.5 TB today, and if I had to start over I would choose 2 × 1 TB in RAID 0.

Also, you no longer have to start the sync from scratch. Binance now releases snapshots of a full node almost daily. You can find them at https://github.com/binance-chain/bsc-snapshots

I wanted to set up my own full node for the Binance Smart Chain. But during this process I was confused about what was going on during the fast sync, and I found out I was not the only one.

In this article we focus on Geth for Binance Smart Chain, but most of the information also applies to the Ethereum network, since the BSC tools were originally forked from the Ethereum project.

To set up a full node for the Binance Smart Chain, you start by downloading Geth and the configuration file for the network you want. This can be mainnet or testnet, but in this article I’ll be talking about mainnet.

Hardware

First of all, you need a beefy machine. The disks especially need to be fast, not in sequential throughput but in random I/O operations. The consensus today is that you need NVMe drives. With the current version of Geth (1.1.2 for BSC) the data takes about 580 GB, but this grows rapidly. In my case I used 2 drives of 512 GB and placed them in RAID 0. The CPU is a quad-core Intel Xeon E3-1275 v5, with 64 GB of RAM. The RAM seems overkill as Geth does not use a lot, but at least Linux can use it for file cache. While syncing I noticed the network peaked at 300 Mbit/s up and down, so if you are thinking about hosting at home, take this into consideration.

I was lucky and found this server at the Hetzner Auction for pretty cheap. I was not sure it would be fast enough, as these are mostly older servers, but I tried it anyway and it was alright. And if the server had not been fast enough, I could have cancelled it within the first 14 days and gotten my money back.

Installation

First I had to install an operating system, so I installed Ubuntu 20.04 LTS, my preferred OS for servers. I did a pretty basic install, but I did change the partition layout. I used software RAID to stripe the 2 NVMe drives and then made 3 partitions: /boot with 512 MB and an ext3 filesystem, / with 32 GB and ext4, and a /data partition with xfs on the rest of the disk space. In my past experience xfs has always given me the best performance for large databases, so I was hoping this would also work for Geth. No hard numbers, unfortunately.

Setting up Geth is pretty basic, and well explained in the README, but here is how I did it:
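Since the data directory grows quickly, it can help to check the free space on the data partition before starting a sync. The following is a minimal sketch, not part of Geth: has_free_space is a hypothetical helper name, and the /data mount point and the 1500 GB threshold in the usage line are assumptions taken from the figures mentioned in this article.

```shell
# has_free_space PATH MIN_GB
# Succeeds if the filesystem holding PATH has at least MIN_GB gigabytes free.
# Hypothetical helper for illustration; it only reads the output of df.
has_free_space() {
    local path="$1"
    local min_gb="$2"
    local avail_kb

    # df -P prints one line per filesystem; -k reports sizes in 1024-byte blocks.
    # The fourth column of the second line is the available space.
    avail_kb=$(df -Pk "$path" | awk 'NR==2 {print $4}')

    # Treat a failed or empty df result as "not enough space".
    [ -n "$avail_kb" ] || return 1

    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}
```

For example, `has_free_space /data 1500 || echo "less than 1500 GB free on /data"` prints a warning when the partition is smaller than the size suggested in the update at the top.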

You can find all the needed files on https://github.com/binance-chain/bsc/releases. You will need the binary for your operating system and the configuration files for the network. In the example below I assume the user is ubuntu.

sudo mkdir /data/bsc
sudo chown ubuntu:ubuntu /data/bsc
cd /data/bsc
wget https://github.com/binance-chain/bsc/releases/download/v1.1.2/geth_linux
wget https://github.com/binance-chain/bsc/releases/download/v1.1.2/mainnet.zip
unzip mainnet.zip
mv geth_linux geth
chmod +x geth
./geth --datadir node init genesis.json

Et voilà, that’s most of the installation work done. We created a directory on the data partition, downloaded the files we need from GitHub, unzipped them, and renamed the binary. Finally, we ran the init command to initialize the local file structure.

To make my life easier, I configured Geth to run under systemd. You can also run it in screen or tmux if you are more familiar with those, but systemd is probably the best option in the long run.

I’ve created this .service file at /etc/systemd/system/geth.service:

[Unit]
Description=Ethereum go client
After=syslog.target network.target
[Service]
User=ubuntu
Group=ubuntu
Type=simple
ExecStart=/data/bsc/geth --config /data/bsc/config.toml --datadir /data/bsc/node --nat extip:<external IP>
KillMode=process
KillSignal=SIGINT
TimeoutStopSec=90
Restart=on-failure
RestartSec=10s
[Install]
WantedBy=multi-user.target

I like to pass the external IP as a flag. I find that when Geth knows the external IP of the server, it finds peers to sync with faster. Also, don’t forget to open port 30311 for both TCP and UDP from the internet if you have a firewall (and you should have one).

Now you can start Geth as a service. You can also configure it to start up automatically when your server reboots.

sudo systemctl daemon-reload # Reload systemd so the new service file is loaded
sudo systemctl enable geth # To automatically start when rebooted
sudo systemctl start geth

You can verify that it started successfully by running systemctl status geth.

Green is OK!

Geth also writes log files, which can be found in the configured datadir. It rotates them every hour, but bsc.log is always the current log file.

One thing we have not touched yet is the configuration file config.toml. It was packaged in mainnet.zip and contains the configuration for Geth.

You can find a good description of the fields by running ./geth --help, but one important setting is MaxPeers in the Node.P2P block. My advice is to keep this low during the initial sync. Between 10 and 25 seems OK, but stay on the low side. If this number is high, the node has to process too much data at once, much of it duplicate, and that slows down the synchronization process. Check how your server is handling the load, and also check the bandwidth being used.
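Because MaxPeers gets changed several times over the course of a sync, a small helper can make the edit less error-prone. This is only a sketch: set_max_peers is a hypothetical function name, and it assumes the setting appears in config.toml on its own line in the form `MaxPeers = <number>`. It keeps a backup copy next to the file.

```shell
# set_max_peers FILE N
# Rewrites the "MaxPeers = <number>" line in a Geth TOML config file.
# Hypothetical helper; assumes the key sits on a line of its own.
set_max_peers() {
    local file="$1"
    local peers="$2"

    # Only accept a non-negative integer as the new value.
    case "$peers" in
        ''|*[!0-9]*) echo "MaxPeers must be a number" >&2; return 1 ;;
    esac

    # Refuse to continue if the file has no MaxPeers line to change.
    grep -Eq '^[[:space:]]*MaxPeers[[:space:]]*=' "$file" || {
        echo "no MaxPeers line found in $file" >&2
        return 1
    }

    # Replace only the number; the original file is kept as FILE.bak.
    sed -i.bak -E "s/^([[:space:]]*MaxPeers[[:space:]]*=[[:space:]]*)[0-9]+/\1${peers}/" "$file"
}
```

With the paths used in this article, `set_max_peers /data/bsc/config.toml 10 && sudo systemctl restart geth` would apply a low value and restart the service so it takes effect.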

By the way, a pretty nice tool to check your bandwidth is speedometer. You can run it with speedometer -t <network interface> -r <network interface>

Not much going on now

The synchronization process

Well, this is the part where most of the confusion happens. A very detailed explanation by karalabe on GitHub can be found here: https://github.com/ethereum/go-ethereum/issues/16251#issuecomment-371449572

But this is how I experienced the fast sync process. As I see it, it goes in 3 phases:

Phase 1: Download all the block data. This is the part where you see a lot of network bandwidth being used and the disk usage increasing rapidly. For me this part took around 12 hours. In this phase you will mostly see the Imported new block receipts log lines.

When this phase is done, you will see a massive drop in network bandwidth, but you will notice that the full node is lagging behind the network. The current block is consistently around 50–100 blocks behind. You can check this by attaching a Geth console to the Geth server using ./geth attach node/geth.ipc. When you execute eth.syncing you’ll see the latest block on the network and the current block your node is at.

Phase 2: The lag is there because the synchronization process is not done yet. Your server has most of the data, but as far as I could tell it now has to validate it: it reads the headers and validates the block files (according to the explanation linked above, this is also when the state trie is downloaded). You don’t see a lot happening on the server, but this takes a lot of time. In the logs you will mostly see Imported new state entries. On my server I had to wait another 12 hours for this to complete.

Phase 3: The node is in sync now and will function properly, but it will be slow. You will also constantly see Aborting state snapshot generation and Resuming state snapshot generation in the logs. Geth is now creating a snapshot (which is recommended) that can be used for faster syncs to other nodes, or when you want to prune your local installation (to lower the disk space). However, the snapshot process is interrupted every time a new block is received.
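The eth.syncing check mentioned under Phase 1 can also be scripted, so you do not have to open a console each time. The sketch below is an assumption-laden example: sync_lag is a hypothetical helper, and it assumes the console prints eth.syncing either as an object with `currentBlock: <n>,` and `highestBlock: <n>,` lines or as `false` when the node does not consider itself syncing.

```shell
# sync_lag
# Reads the printed result of eth.syncing on stdin and prints how many
# blocks the node is behind, or "not syncing" if the result was false.
# Hypothetical helper; the input format is an assumption (see above).
sync_lag() {
    awk -F'[:,]' '
        /currentBlock/ { cur = $2 + 0 }    # block the node is at
        /highestBlock/ { high = $2 + 0 }   # latest block seen on the network
        /^false/       { synced = 1 }      # eth.syncing returned false
        END {
            if (synced) { print "not syncing"; exit }
            print high - cur
        }'
}
```

Run from /data/bsc, a call such as `./geth attach --exec 'eth.syncing' node/geth.ipc | sync_lag` should then print the current lag in blocks, assuming the attach command accepts --exec in this position on your version.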

At some point I found that the node was not usable like this, and I just wanted the snapshot process to complete. I set MaxPeers in the config file to 0 and restarted Geth with systemctl restart geth. From then on you only see Generating state snapshot in the logs.

With no peers the node stops receiving new blocks (so it falls out of sync), but it means the snapshot process is no longer interrupted. It still took 50 hours to complete the snapshot. Finally this message appeared in the logs:

msg="Generated state snapshot" accounts=61,032,861 slots=603,991,878 storage="47.45 GiB" elapsed=50h17m39.469s

Snapshot generation pulls almost 4k reads per second
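To see at a glance which of the phases above the node is in, you can count the characteristic log messages in the tail of the current log file. This is a rough sketch: sync_phase_summary is a hypothetical helper, the message strings are the ones quoted in this article, and the default of 1000 lines is an arbitrary choice.

```shell
# sync_phase_summary LOGFILE [LINES]
# Counts the phase-marker messages in the last LINES lines of a Geth log.
# Hypothetical helper; message strings are taken from the article text.
sync_phase_summary() {
    local log="$1"
    local lines="${2:-1000}"
    local msg count

    for msg in "Imported new block receipts" \
               "Imported new state entries" \
               "Aborting state snapshot generation" \
               "Generating state snapshot" \
               "Generated state snapshot"; do
        # grep -c prints 0 when nothing matches, so count is always a number.
        count=$(tail -n "$lines" "$log" | grep -c "$msg")
        printf '%6d  %s\n' "$count" "$msg"
    done
}
```

For the setup described here the call would be something like `sync_phase_summary /data/bsc/node/bsc.log`, assuming bsc.log lives directly in the datadir. The message with the highest count indicates the phase the node is most likely in.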

Now I had a full node with its snapshot ready, but it was 2 days out of sync. I restarted Geth with MaxPeers set to 10 and waited several hours for it to catch up on the roughly 48 hours of blocks it had missed.

Once it was back in sync, I set MaxPeers to a sensible value and restarted Geth one last time. Now our node is finally synced and running!

TL;DR

You need a lot of patience when setting up a new full node. Sync can take several days. Disk IOPS is very important. Use the MaxPeers option to tune the process. In the end, it’s a cheap way to have “unlimited” calls to the BSC network.
