Pro Tips: How to Run a Lotus Node on Filecoin

Protofire.io

Published in Protofire Blog

Aug 4, 2023

Here’s all you need to know!

Intro

Running nodes is no easy task — it takes a lot of resources, constant monitoring and maintenance. We at Glif, the dev community of Protofire and Filecoin, have been running Filecoin RPC nodes since 2020, and during this time, we’ve had to overcome various challenges which gave us the experience and know-how we have today.

Now, we want to share this knowledge to help other Filecoin node runners become more efficient and make their life easier.

Let’s start with a brief overview of Filecoin’s node clients!

Filecoin node clients overview

As of June 2023, three clients implement the Filecoin Protocol Specification:

Lotus is the oldest Golang implementation, maintained by the Protocol Labs team. The lotus daemon is available as a client for users to store data, the miner daemon is for storage providers, and the worker daemon allows storage providers to split tasks across multiple computers. Additionally, it supports the Filecoin virtual machine (FVM), a runtime environment for smart contracts on the Filecoin network.
Venus is a Golang implementation focused on easy and robust setup for storage providers.
Forest is an implementation of Filecoin written in Rust by ChainSafe. The implementation will take a modular approach to build a full Filecoin node based on the Filecoin Protocol Specification and will include a virtual machine, blockchain, and node system modules.

In this piece, we focus on the Lotus implementation because it is the most mature of the three and combines FVM and Filecoin-native RPC calls. You can check the full list of calls via the OpenRPC Playground.

Let’s quickly observe the main modes of the Lotus node.

Lotus node modes

Nodes differ by ledger-state usage:

Light nodes do not store any ledger or state and mainly store block header data. They are easy to spin up. However, they should be connected to a full node. The light node is very undemanding — one CPU and around 150 MB of RAM.
Full nodes store only a snapshot of the current state needed to validate new transactions. Blocks that are no longer needed can be pruned continuously. On AWS, one of your best options would be r6gd.xlarge instances with 4 CPUs and 32 GB of RAM.
- In Ethereum, this includes the last 128 blocks, which occupy around 650 GB at the beginning. Constant pruning brings the total storage back down to the original 650 GB.
- In Filecoin, this includes the last 2000 epochs or the last 16.67 hours of chain data.
Archive nodes store all data starting from a given point and store the history of all blockchain states. These huge beasts require a lot of resources to run. We use r6g.12xlarge — 48 CPU, 384GB RAM, and RAID0 array made of multiple EBS disks.
- A full archive node normally stores all data beginning with the chain’s genesis, including full chain history and entire block trace.
- A partial archive node is also possible, starting from a particular date / block / epoch.
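To make the full-node lookback figure concrete, here is a minimal sketch that converts epochs into wall-clock time. It assumes the Filecoin mainnet epoch duration of 30 seconds; the function name is ours, purely for illustration.

```python
from datetime import timedelta

# Assumption: one Filecoin mainnet epoch lasts 30 seconds.
EPOCH_SECONDS = 30


def epochs_to_duration(epochs: int) -> timedelta:
    """Convert a number of epochs into wall-clock time."""
    return timedelta(seconds=epochs * EPOCH_SECONDS)


lookback = epochs_to_duration(2000)
hours = round(lookback.total_seconds() / 3600, 2)
print(hours)  # 16.67
```

The same helper is handy when you need to translate a partial archive node's starting epoch into an approximate age.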

How to turn on FVM support

You need to enable ETH methods on RPC via an environment variable or the config file. The main variable is LOTUS_FEVM_ENABLEETHRPC, but there are many other variables for fine-tuning the setup. To keep this tip short, we refer you to the official documentation for the rest.
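Once ETH methods are enabled, a quick way to check them is to send a standard Ethereum JSON-RPC call to the node. The sketch below is only an illustration: the URL assumes a local Lotus daemon on its default API port, and the helper names are ours. The network call is wrapped in a function so that nothing is sent until you call it.

```python
import json
import urllib.request

# Assumption: a local Lotus daemon reachable on its default API port.
LOTUS_RPC_URL = "http://127.0.0.1:1234/rpc/v1"


def build_request(method: str, params: list, request_id: int = 1) -> bytes:
    """Encode a JSON-RPC 2.0 request body."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    return json.dumps(body).encode("utf-8")


def parse_block_number(response_body: bytes) -> int:
    """Decode the hex-encoded result of eth_blockNumber."""
    reply = json.loads(response_body)
    if reply.get("error"):
        raise RuntimeError(f"RPC error: {reply['error']}")
    return int(reply["result"], 16)


def eth_block_number(url: str = LOTUS_RPC_URL) -> int:
    """Ask the node for its latest block number (performs a network call)."""
    request = urllib.request.Request(
        url,
        data=build_request("eth_blockNumber", []),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return parse_block_number(response.read())
```

If ETH methods are not enabled, we would expect the node to answer with a JSON-RPC error instead of a result, which `parse_block_number` surfaces as an exception.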

Seven Pro Tips for Running Lotus Nodes

We have already spoken about resource usage. Now let’s go deeper into Lotus maintenance.

Tip #1 | Use Prometheus metrics

Like most modern applications, Lotus has a Prometheus exporter under the /debug/metrics endpoint, so you can use the Grafana + Prometheus stack. The exporter provides useful metrics for request counts, latency, and overall node health. Check our Grafana dashboards and plug them into your installation. By the way, here’s a new insight from the Lotus team: standardized dashboards for Lotus are one of the goals for Q3 of 2023.
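If you want to peek at the exporter output before wiring up Prometheus, the text format is simple enough to read with a few lines of code. This is a minimal sketch, not a full parser: it ignores optional timestamps, and the metric names in the sample are hypothetical placeholders rather than real Lotus metrics.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse Prometheus text exposition lines into {series: value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Hypothetical sample; a real node exposes its own metric names.
SAMPLE = """\
# HELP example_api_request_total Hypothetical request counter
# TYPE example_api_request_total counter
example_api_request_total{method="ChainHead"} 42
example_chain_height 3050000
"""

metrics = parse_metrics(SAMPLE)
for series, value in metrics.items():
    print(series, value)
```

In production you would let Prometheus scrape the endpoint directly. A snippet like this is mostly useful for smoke tests.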

Tip #2 | Aggregating logs

First of all, why is aggregating logs useful? Because it will make your life easier. Suppose something goes wrong on one of several instances of your app. Just imagine how much time you would spend going through the logs manually. Aggregation can help you with that.

You can use our recommended setup: K8S Fluentbit -> OpenSearch -> OpenSearch Dashboards via our helm chart, or use your own setup.

The easiest way to start aggregating logs is to enable JSON-formatted logs in Lotus via the GOLOG_LOG_FMT=json environment variable and start the aggregation using your favourite tool.

For example, we use the stack above, and its most powerful feature is Fluentbit’s built-in JSON parsing support. In this case, simply point it at your JSON-formatted log file to get fully parsed, structured, and searchable data loaded into OpenSearch without any manual overhead.

Another tip: take a look at the Logstash_Prefix_Key and Logstash_DateFormat values in the OpenSearch upload config. They will help you separate the logs of your Lotus installations by name and date. We use the following:

Logstash_Prefix_Key kubernetes['pod_name']
Logstash_DateFormat %Y.%m
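To illustrate what these two settings achieve, the sketch below parses a JSON log line and derives the index name that a prefix plus a monthly date format would produce. It assumes the prefix and date are joined with a hyphen. The log line, its field names, and the pod name are hypothetical examples, not captured Lotus output.

```python
import json
from datetime import datetime, timezone


def index_name(pod_name: str, when: datetime, date_format: str = "%Y.%m") -> str:
    """Build an index name of the form <prefix>-<formatted date>."""
    return f"{pod_name}-{when.strftime(date_format)}"


# Hypothetical JSON log line; real field names may differ.
line = '{"level":"info","ts":"2023-08-04T10:15:00.000Z","logger":"chain","msg":"new head"}'
record = json.loads(line)
print(record["level"], record["msg"])

# Hypothetical pod name standing in for kubernetes['pod_name'].
index = index_name("lotus-mainnet-0", datetime(2023, 8, 4, tzinfo=timezone.utc))
print(index)  # lotus-mainnet-0-2023.08
```

With one index per pod per month, you can drop or archive old logs for a single installation without touching the others.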

Tip #3 | Use the Lotus auth model

When you set up a local Lotus full node or Lotus lite, you will inevitably run into the Lotus auth model.

There is a great page from the Lotus team about it, and we will do just a small recap here:

The auth model is based on JWT tokens.
There are 4 types of operations: read, write, sign, and admin.
Public endpoints don’t require authorization.
Public endpoints expose only read operations plus a single write operation, MpoolPush.
You can ‘unlock’ write and sign operations locally via the Lotus-lite node. Here’s the tip from the official documentation.
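To show how a permission-carrying JWT works in principle, here is a minimal sketch that signs and verifies an HS256 token using only the standard library. The `Allow` claim name reflects our understanding of Lotus tokens and should be treated as an assumption. The secret and helper names are made up for the example, and real tokens should be created with the Lotus tooling rather than by hand.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def make_token(secret: bytes, permissions: list[str]) -> str:
    """Create an HS256 JWT whose payload lists the allowed permissions."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    # Assumption: permissions live under an "Allow" claim.
    payload = _b64url(json.dumps({"Allow": permissions}).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256)
    return f"{header}.{payload}.{_b64url(signature.digest())}"


def allowed(token: str, secret: bytes) -> list[str]:
    """Verify the signature and return the permissions in the token."""
    header, payload, signature = token.split(".")
    expected = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256)
    if not hmac.compare_digest(expected.digest(), _b64url_decode(signature)):
        raise ValueError("bad signature")
    return json.loads(_b64url_decode(payload))["Allow"]


secret = b"demo-secret-not-a-real-key"  # hypothetical secret for the demo
token = make_token(secret, ["read", "write"])
print(allowed(token, secret))  # ['read', 'write']
```

A client then sends the token in an `Authorization: Bearer <token>` header, and the node decides per method whether the listed permissions are sufficient.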

Tip #4 | Set up a proxy for your node with Lotus Gateway

Lotus Gateway is a superb and undervalued binary. Why? Because it is easy to set up, can be built with Lotus, and works as a proxy to a Lotus node.

It supports only read operations, MpoolPush, and all Ethereum operations, which is useful for RPC runners.

Gateway has built-in tools for limiting the following:

Maximum API request size accepted by the JSON RPC server
Maximum duration allowable for tipset lookback
Maximum number of blocks to search back through for message inclusion
Rate-limit API calls
Rate-limit API calls per each connection
The maximum time to wait for the rate limit before returning an error to clients
The number of incoming connections to accept from a single IP per minute
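To give a feel for how a rate limit and a maximum wait time interact, here is a generic token-bucket sketch. It is not the Gateway's actual implementation, and every name in it is ours. The clock is injectable so that the behaviour can be followed step by step.

```python
import time


class TokenBucket:
    """Generic token bucket: `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate_per_second: float, capacity: int, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def _refill(self) -> None:
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def reserve(self, max_wait: float = 0.0):
        """Reserve one token.

        Returns the seconds the caller must wait before proceeding,
        or None if the wait would exceed max_wait (reject the call).
        """
        self._refill()
        delay = max(0.0, (1 - self.tokens) / self.rate)
        if delay > max_wait:
            return None
        self.tokens -= 1  # may go negative: the token is borrowed from the future
        return delay


# Demo with a fake clock so the result does not depend on timing.
fake_now = [0.0]
bucket = TokenBucket(rate_per_second=1.0, capacity=2, clock=lambda: fake_now[0])
print(bucket.reserve())              # 0.0, first token is available
print(bucket.reserve())              # 0.0, second token is available
print(bucket.reserve())              # None, bucket is empty and no wait is allowed
print(bucket.reserve(max_wait=1.0))  # 1.0, allowed after waiting one second
```

One bucket per connection gives per-connection limits, while a single shared bucket gives a global limit. The `max_wait` argument plays the role of the maximum time to wait before returning an error to the client.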

Tip #5 | Full node and the roots of 2k blocks requirement

According to the Filecoin specification, a node needs the latest 900 blocks for Filecoin finality. However, one of those latest blocks could be faulty, and to report it to the blockchain, the node needs the 900 blocks that precede it. In the edge case, that adds up to 900 + 900 = 1800 blocks, which is why all lightweight snapshots include around 2000 blocks.

Tip #6 | Docker images observation

Right now, there are two docker implementations:

1. Official Docker image from the Lotus team with two final targets:

Lotus binary: suitable for basic usage
Lotus binary with all add-ons (wallet, miner, etc.): a good choice for local testing of all binaries together

2. Glif image, which is customized specifically for the RPC needs:

Run Lotus binary with Lotus gateway or without it
Spin up Lotus with a custom home folder
Spin up Lotus with a persistent node id
Start Lotus in Lite-mode

Tip #7 | Join Lava: Become an RPC Provider on Lava

Lava is a decentralized network of RPC & API providers through which FVM developers can fetch and send blockchain data in a private & reliable way. RPC Node Runners (aka RPC Providers) join Lava to support FVM developers who must send relays to the Filecoin blockchain. RPC Providers are automatically paired with consumers based on parameters like geolocation and QoS.

Why become an RPC Provider?
- Might be eligible for future participation in Lava’s incentive program
- Get early access to features & tools
- Join an active provider community & DAO

If you’re already running a Lotus node, you can easily become an RPC Provider on Lava by following this setup guide.

Conclusion

If you wish to run a node on Filecoin, we recommend you do it via Lotus because, in our experience, the Lotus implementation has proven to be a convenient go-to tool.

Set up Prometheus to collect metrics, aggregate your logs, set up a proxy for your node with Lotus Gateway, and become an RPC Provider on Lava. You can also spin up Lotus-lite locally and start experimenting with your ideas.

This is it for today. Stay tuned for more posts about blockchain networks.

Authors: Uladzislau Muraveika — Lead DevOps Engineer @ Protofire