How to Keep Your IPFS Nodes Connected to Ensure Fast Content Discovery
An IPFS Tutorial
- This guide assumes the reader already has multiple IPFS nodes running on Ubuntu 16 systems.
This guide is written for any of the following people:
- People who want to make their IPFS nodes discover content faster from their other nodes.
- People who run IPFS gateways alongside their IPFS nodes.
- People who have IPFS nodes that are repeatedly disconnecting from or “forgetting” about each other.
IPFS currently doesn’t provide mechanisms that allow hosts to keep all nodes they control perpetually connected to each other. This creates scenarios where, even if we bootstrap the nodes together on startup, they can eventually “forget” about each other and become disconnected. This can lead to slower content discovery, or even prevent content from being discovered at all.
This can be a problem when a product hosts its own gateway and points users to it for retrieving content hosted on the product’s own nodes. If the gateway isn’t directly connected to the product’s nodes, then users may find themselves waiting quite a while for their content to load. Not a great user experience.
So how can we fix this?
Step 1 — Acquire your node multiAddresses
Open up the command line on each machine hosting an IPFS node and run:
ipfs id
Your response should be a JSON object describing your node.
We primarily care about the “Addresses” array. It contains the “multiAddress” values that outside IPFS nodes can use to connect to your node. You might see some duplicate results, and that’s okay. The key thing here is to take note of the entries containing your external IP addresses, and NOT the entries with your local IP addresses in them. If you have IPv6 enabled on ALL of your nodes, copy the IPv6 multiAddress. If you don’t, copy the IPv4 multiAddress.
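If the “Addresses” array is long, picking out the external entries by eye gets tedious. Here is a minimal sketch of a filter that could do it for you. The function name filter_public_addrs is made up for this example, and the list of "local" ranges it drops (loopback, RFC 1918 private ranges, link-local and unique-local IPv6) is my assumption about what counts as a local address on a typical setup.

```shell
# Keep only multiaddresses that other machines could plausibly reach.
# Reads one multiaddress per line on stdin and prints the survivors,
# sorted and de-duplicated, on stdout.
# Assumption: loopback, private (RFC 1918), and link-local ranges are "local".
filter_public_addrs() {
  grep -E '^/ip[46]/' \
    | grep -Ev '^/ip4/(127\.|10\.|169\.254\.|192\.168\.|172\.(1[6-9]|2[0-9]|3[01])\.)' \
    | grep -Eiv '^/ip6/(::1/|fe80:|f[cd][0-9a-f]{2}:)' \
    | sort -u
}
```

One way to feed it is to pull the multiAddresses out of the JSON first, for example: ipfs id | grep -oE '/ip[46]/[^"]+' | filter_public_addrs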
Step 2 — Connect your IPFS nodes
Now that we have our multiAddresses, let’s connect our nodes together.
For this example, we’ll pretend we have Node A and Node B. In reality, this works for as many nodes as you like.
From the command line in Node A, run:
ipfs swarm connect /ip4/BBB.BBB.BBB.BBB/tcp/4001/ipfs/NodeBID
(replacing the above example multiAddress with Node B’s IPv4 multiAddress)
or if you’re connecting via IPv6, run:
ipfs swarm connect /ip6/BBBB:BBBB:BBBB:BBBB:BBBB:BBBB:BBBB:BBBB/tcp/4001/ipfs/NodeBID
(replacing the above example multiAddress with Node B’s IPv6 multiAddress)
We should receive the following results in the terminal:
connect NodeBID success
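With more than two nodes, typing each multiAddress by hand gets old quickly. The sketch below shows one way the connect step could be looped over a list of nodes. The function names build_multiaddr and connect_all, the "IP peerID" input format, and the IPFS_BIN variable (used only so the loop can be dry-run with echo) are all assumptions made for this example, not part of IPFS.

```shell
# Build a swarm multiaddress from an IP address and a peer ID.
# An address containing a colon is treated as IPv6, anything else as IPv4.
# The port defaults to 4001, the swarm port used throughout this guide.
build_multiaddr() {
  local ip="$1" peer_id="$2" port="${3:-4001}"
  case "$ip" in
    *:*) printf '/ip6/%s/tcp/%s/ipfs/%s\n' "$ip" "$port" "$peer_id" ;;
    *)   printf '/ip4/%s/tcp/%s/ipfs/%s\n' "$ip" "$port" "$peer_id" ;;
  esac
}

# Run "ipfs swarm connect" once for every "IP peerID" line read from stdin.
# IPFS_BIN is a hypothetical override: set it to "echo" to print the
# commands instead of running them.
connect_all() {
  local ip peer_id
  while read -r ip peer_id; do
    [ -n "$ip" ] || continue
    # </dev/null keeps the command from consuming the remaining input lines.
    "${IPFS_BIN:-ipfs}" swarm connect "$(build_multiaddr "$ip" "$peer_id")" </dev/null
  done
}
```

For a dry run, something like printf '%s\n' 'BBB.BBB.BBB.BBB NodeBID' | IPFS_BIN=echo connect_all prints the command that would be executed. Drop the IPFS_BIN=echo part to actually connect.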
You can verify this worked by running the following on each node:
ipfs swarm peers
On Node A, you should see Node B’s multiAddress in the returned list, and on Node B, you should see Node A’s multiAddress in the returned list.
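If you would rather not scan the peer list by eye, the check can be scripted. This is a rough sketch: peer_is_connected is a name I made up, and it assumes the peer list has one multiAddress per line ending in the peer ID. It accepts both /ipfs/ and /p2p/ before the ID, since newer IPFS releases print the latter.

```shell
# Exit with status 0 when the given peer ID appears in a peer list.
# Reads "ipfs swarm peers" style output (one multiaddress per line) on stdin.
# Matching on the trailing "/ipfs/<id>" or "/p2p/<id>" avoids false hits
# on IDs that merely share a prefix.
peer_is_connected() {
  local peer_id="$1"
  grep -qE "/(ipfs|p2p)/${peer_id}\$"
}
```

On Node A this could be used as: ipfs swarm peers | peer_is_connected NodeBID && echo "Node B is connected"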
Now that our two nodes are connected, content discovery should be near instant. Instead of having to go through multiple layers of nodes before Node A discovers that Node B has the content it’s looking for, Node B will be one of the first nodes asked for the content and can instantly start providing it (the same works in reverse).
Step 3 — Let’s automate things
Connecting our nodes manually via the command line is cool, but how do we automate this process so our nodes can stay connected on their own?
Using systemd services and timers on Linux, we can do just that!
(For these examples, let’s pretend we have a gateway node, and that we want our content hosting nodes automatically connected to the gateway node)
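For every node that you want to connect to the gateway, add the following two files, gateway-connector.service and gateway-connector.timer (systemd unit files normally live in /etc/systemd/system/):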
Description=Job that periodically connects this IPFS node to the gateway node
ExecStart=/home/yourUserName/go/bin/ipfs swarm connect /ip4/GGG.GGG.GGG.GGG/tcp/4001/ipfs/gatewayID
*Note* — In the above file, you’ll need to input custom values for:
- Under ExecStart, your ipfs executable path will depend on where you installed Go / the ipfs executable.
- Under ExecStart, the multiAddress will be one of the values retrieved from running “ipfs id” on your gateway node. You can also use an IPv6 multiAddress here.
- Under Environment, you’ll need to replace the value with the location of your IPFS repo. You can get this value by running “ipfs repo stat” in the command line and copying the “RepoPath” value.
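To show how the three custom values fit together, here is a sketch of a small generator for the service file. The function name render_connector_service is hypothetical, and two things in the output are my assumptions rather than something this guide confirms: the Environment line sets IPFS_PATH (the variable ipfs reads to locate its repo), and Type=oneshot is used because the command runs once and exits.

```shell
# Print a gateway-connector.service unit file to stdout.
# Arguments: path to the ipfs binary, gateway multiaddress, IPFS repo path.
# Assumptions: Type=oneshot and the IPFS_PATH variable name are choices
# made for this sketch.
render_connector_service() {
  local ipfs_bin="$1" gateway_addr="$2" repo_path="$3"
  cat <<EOF
[Unit]
Description=Job that periodically connects this IPFS node to the gateway node

[Service]
Type=oneshot
Environment="IPFS_PATH=${repo_path}"
ExecStart=${ipfs_bin} swarm connect ${gateway_addr}
EOF
}
```

For example, render_connector_service /home/yourUserName/go/bin/ipfs /ip4/GGG.GGG.GGG.GGG/tcp/4001/ipfs/gatewayID /home/yourUserName/.ipfs | sudo tee /etc/systemd/system/gateway-connector.service would write the file in place. The repo path shown is only the usual default, so use your own “RepoPath” value.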
Description=Timer that periodically triggers gateway-connector.service
In the above file, “OnBootSec” is the amount of time your machine waits to start the timer after it boots up, and “OnUnitActiveSec” is the amount of time between each execution of gateway-connector.service.
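For reference, a timer file using those two settings could be generated along these lines. This is only a sketch: render_connector_timer is a made-up name, and the default values of 2min and 1min are placeholders I picked for illustration, so tune them to your own setup. The [Install] section is there so that "systemctl enable" has a target to hook the timer into, and a timer named gateway-connector.timer triggers gateway-connector.service by default because the names match.

```shell
# Print a gateway-connector.timer unit file to stdout.
# Arguments (both optional): delay after boot, interval between runs.
# Assumption: the 2min / 1min defaults are illustrative placeholders.
render_connector_timer() {
  local on_boot="${1:-2min}" interval="${2:-1min}"
  cat <<EOF
[Unit]
Description=Timer that periodically triggers gateway-connector.service

[Timer]
OnBootSec=${on_boot}
OnUnitActiveSec=${interval}

[Install]
WantedBy=timers.target
EOF
}
```

Running render_connector_timer 5min 30s | sudo tee /etc/systemd/system/gateway-connector.timer would, for instance, wait five minutes after boot and then reconnect every thirty seconds.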
Step 4 — Let’s run our automation
Now that we’ve created our automation files, let’s enable / run them.
From the command line on each of the nodes we added our automation files to, run:
sudo systemctl enable gateway-connector.timer
sudo systemctl start gateway-connector.timer
To double-check that this worked, run:
systemctl list-timers
You should see an entry for your gateway connector service. You can also check the status of its last execution attempt by running:
systemctl status gateway-connector
Now, you should have everything up and running. From this point on, when requesting content from nodes in your network (from one of your other nodes), everything should go much faster. That’s the magic of keeping your nodes connected via swarm!
As always, feel free to reach out to me at email@example.com if you have any questions on this post or IPFS in general!