Single-Host Instance Orchestration, Part 1: Deploying and Managing Instances

Building Nexus, an orchestrator for operating arbitrary numbers of IPFS private networks

Robert Lin
Oct 29

This is part 1 of a series of articles based on a post available on my website, IPFS Private Network Node Orchestration, which I wrote a while back but never got around to publishing. This series briefly covers my work on Nexus during my time working at RTrade Technologies, with an emphasis on:

  • using Docker to deploy instances and manage them
  • implementing access control to instances (namely their endpoints)
  • exposing an API for using this functionality
  • testing a Golang service

These articles won’t go very deep into IPFS, which for context stands for the InterPlanetary File System, though there will be references to some of its functionality. Part 1 covers deploying and managing instances (which I refer to as “nodes”, in the context of IPFS); the rest of the series covers the remaining topics above: access control, the API, and testing.

Note that Medium code blocks are kind of bad — they are formatted much better on my original post.

Context

Nexus is an open-source service that handles on-demand deployment, resource management, metadata persistence, and fine-grained access control for arbitrary private IPFS networks running within Docker containers on RTrade infrastructure. RTrade wanted to explore offering a service that would provide a set of IPFS nodes, hosted on our end, that customers could use to bootstrap their private networks — groups of IPFS nodes that only talk to each other. It’s a bit of an underdocumented feature (a quick search for “ipfs private networks” only surfaces blog posts from individuals about how to manually deploy such a network), but it seemed like it had its use cases — for example, a business could leverage a private network that used RTrade-hosted nodes as backup nodes of sorts.

Deploying Nodes

Deploying nodes within containers was the most obvious choice — the tech is kind of designed for situations like this, and I’ve had some experience working directly with the Docker API through my work on Inertia.

This functionality is neatly encapsulated in package Nexus/ipfs within an interface, ipfs.NodeClient, which exposes some fairly self-explanatory CRUD functions to manipulate nodes directly:

type NodeClient interface {
    Nodes(ctx context.Context) (nodes []*NodeInfo, err error)
    CreateNode(ctx context.Context, n *NodeInfo, opts NodeOpts) (err error)
    UpdateNode(ctx context.Context, n *NodeInfo) (err error)
    StopNode(ctx context.Context, n *NodeInfo) (err error)
    RemoveNode(ctx context.Context, network string) (err error)
    NodeStats(ctx context.Context, n *NodeInfo) (stats NodeStats, err error)
    Watch(ctx context.Context) (<-chan Event, <-chan error)
}

The intention of this API is purely to handle the “how” of node deployment, not the business logic that determines the when and where of deployment. Structures like NodeInfo and NodeOpts expose node configuration that can be used by upper layers:

type NodeInfo struct {
    NetworkID string        `json:"network_id"`
    JobID     string        `json:"job_id"`
    Ports     NodePorts     `json:"ports"`
    Resources NodeResources `json:"resources"`

    // Metadata set by node client:

    // DockerID is the ID of the node's Docker container
    DockerID string `json:"docker_id"`
    // ContainerName is the name of the node's Docker container
    ContainerName string `json:"container_id"`
    // DataDir is the path to the directory holding all data relevant to this
    // IPFS node
    DataDir string `json:"data_dir"`
    // BootstrapPeers lists the peers this node was bootstrapped onto upon init
    BootstrapPeers []string `json:"bootstrap_peers"`
}
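To give a sense of how an upper layer might drive this interface, here is a rough sketch that creates a node and then consumes Watch events. Only the interface and the NodeInfo fields shown above come from Nexus — the function, the NodeOpts contents, and the logging are illustrative assumptions, not actual call sites:

// Illustrative only: a sketch of driving the NodeClient interface shown above.
// The NodeOpts contents and logging are assumptions for this example.
package example

import (
    "context"
    "fmt"
    "log"

    "github.com/RTradeLtd/Nexus/ipfs"
)

func deployAndWatch(ctx context.Context, c ipfs.NodeClient, network string) error {
    node := &ipfs.NodeInfo{
        NetworkID: network,
        // Ports, Resources, etc. would be filled in by the caller
    }
    if err := c.CreateNode(ctx, node, ipfs.NodeOpts{ /* swarm key, peers, ... */ }); err != nil {
        return fmt.Errorf("failed to create node: %w", err)
    }

    // Watch surfaces node events (crashes, stops, ...) until the context expires.
    events, errs := c.Watch(ctx)
    for {
        select {
        case <-ctx.Done():
            return ctx.Err()
        case err := <-errs:
            log.Printf("watch error: %v", err)
        case ev := <-events:
            log.Printf("node event: %+v", ev)
        }
    }
}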

The node creation process goes roughly as follows:

  1. Initialize node assets on the filesystem — most notably this includes:
  • writing the given “swarm key” (used for identifying a private network) to disk for the node (a sketch of this follows the list below)
  • generating an entrypoint script that caps resources as required

  2. Set up configuration, create the container, and get the container running — this part primarily imitates your standard docker container create, etc. commands using *docker/client.Client, edited for brevity:

resp, err := c.d.ContainerCreate(ctx, containerConfig, containerHostConfig, nil, n.ContainerName)
if err != nil { /* ... */ }
l.Infow("container created", "build.duration", time.Since(start), "container.id", resp.ID)

if err := c.d.ContainerStart(ctx, n.DockerID, types.ContainerStartOptions{}); err != nil {
    go c.d.ContainerRemove(ctx, n.ContainerName, types.ContainerRemoveOptions{Force: true})
    return fmt.Errorf("failed to start ipfs node: %s", err.Error())
}

// waitForNode scans container output for readiness indicator, and errors on
// context expiry. See https://github.com/RTradeLtd/Nexus/blob/master/ipfs/client_utils.go#L22:18
if err := c.waitForNode(ctx, n.DockerID); err != nil { /* ... */ }

// run post-startup commands in the container (in this case, bootstrap peers)
// containerExec is a wrapper around ContainerExecCreate and ContainerExecStart
// See https://github.com/RTradeLtd/Nexus/blob/master/ipfs/client_utils.go#L141:18
c.containerExec(ctx, dockerID, []string{"ipfs", "bootstrap", "rm", "--all"})
c.containerExec(ctx, dockerID, append([]string{"ipfs", "bootstrap", "add"}, peers...))

  3. Once the node daemon is ready (detected by scanning the output), bootstrap the node against existing peers, if any are configured.
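To make step 1 a bit more concrete, here is roughly what writing a swarm key to disk could look like. The on-disk format (“/key/swarm/psk/1.0.0/” followed by a base16-encoded 32-byte key) is the standard go-ipfs private-network format; the helper names and directory layout below are my own illustration, not Nexus’s actual code:

// Illustrative: generate and persist a private-network swarm key for a node.
// The on-disk format is go-ipfs's standard pre-shared key format; the helper
// names and directory layout are assumptions for this sketch.
package nodeassets

import (
    "crypto/rand"
    "encoding/hex"
    "fmt"
    "os"
    "path/filepath"
)

// generateSwarmKey produces a new 32-byte pre-shared key in the format go-ipfs expects.
// In Nexus the key is provided per network, but this shows what one looks like.
func generateSwarmKey() (string, error) {
    key := make([]byte, 32)
    if _, err := rand.Read(key); err != nil {
        return "", err
    }
    return fmt.Sprintf("/key/swarm/psk/1.0.0/\n/base16/\n%s\n", hex.EncodeToString(key)), nil
}

// writeSwarmKey writes the given key into the node's data directory, where the
// IPFS daemon will pick it up as swarm.key.
func writeSwarmKey(dataDir, key string) error {
    if err := os.MkdirAll(dataDir, 0700); err != nil {
        return err
    }
    return os.WriteFile(filepath.Join(dataDir, "swarm.key"), []byte(key), 0600)
}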

Some node configuration is embedded into the container metadata, which makes it possible to recover the configuration from a running container. This allows the orchestrator to bootstrap itself after a restart, and is used by NodeClient::Watch() to log and act upon node events (for example, if a node crashes).
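The natural place for this kind of metadata is Docker labels, which can be read back by listing containers. A minimal sketch of that recovery pattern, assuming hypothetical label keys (the keys Nexus actually uses may differ):

// Illustrative: rebuild node metadata from running containers by reading their
// labels. The label keys ("nexus.network", "nexus.data_dir") are hypothetical;
// only the general pattern is the point here.
package recovery

import (
    "context"
    "strings"

    "github.com/docker/docker/api/types"
    "github.com/docker/docker/client"
)

type recoveredNode struct {
    NetworkID     string
    DataDir       string
    DockerID      string
    ContainerName string
}

func recoverNodes(ctx context.Context, d *client.Client) ([]recoveredNode, error) {
    containers, err := d.ContainerList(ctx, types.ContainerListOptions{All: true})
    if err != nil {
        return nil, err
    }
    var nodes []recoveredNode
    for _, c := range containers {
        network, ok := c.Labels["nexus.network"] // hypothetical label key
        if !ok {
            continue // not one of our nodes
        }
        nodes = append(nodes, recoveredNode{
            NetworkID:     network,
            DataDir:       c.Labels["nexus.data_dir"], // hypothetical label key
            DockerID:      c.ID,
            ContainerName: strings.TrimPrefix(c.Names[0], "/"), // Docker prefixes names with "/"
        })
    }
    return nodes, nil
}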

This interface neatly abstracts away the gnarly work and makes it very easy to generate a mock for testing, which I’ll talk about more later on in this series. This particular example is from TestOrchestrator_NetworkUp, edited for brevity:

client := &mock.FakeNodeClient{}
o := &Orchestrator{
    Registry: registry.New(l, tt.fields.regPorts),
    client:   client,
    address:  "127.0.0.1",
}
if tt.createErr {
    client.CreateNodeReturns(errors.New("oh no"))
}
if _, err := o.NetworkUp(context.Background(), tt.args.network); (err != nil) != tt.wantErr {
    t.Errorf("Orchestrator.NetworkUp() error = %v, wantErr %v", err, tt.wantErr)
}
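The CreateNodeReturns-style setters suggest FakeNodeClient is a generated fake, in the style produced by tools like counterfeiter from the NodeClient interface. If that is the setup, regenerating the mock after an interface change is a single go generate away; a hypothetical directive (not necessarily the exact one in Nexus) would look like:

// Hypothetical go:generate directive for regenerating the fake with counterfeiter
//go:generate counterfeiter -o mock/node_client.mock.go . NodeClient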

Orchestrating Nodes

The core part of Nexus is the predictably named orchestrator.Orchestrator, which exposes an interface very similar to that of ipfs.NodeClient, except it operates on more high-level “networks”. A bit more work goes on in the orchestrator - for example, since ipfs.NodeClient does very straightforward node creation given a set of parameters, port allocation and database management are left to the orchestrator. To help with this, two registries are managed in memory that cache the state of the IPFS networks deployed on the server:

  • registry.Registry, which basically provides cached information about active containers for faster access than constantly querying dockerd. It is treated as the live state, and is particularly important for access control, which needs to query container data very often (more on that later).
  • network.Registry, which accepts a set of ports from configuration that the orchestrator can allocate, and scans them on request to provide an available one. This is used several times during node creation - each node requires a few free ports for the APIs and endpoints it exposes.
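The port-scanning side of this can be as simple as trying to bind each candidate port and handing back the first one that succeeds. A minimal sketch of the idea (the real network.Registry presumably also tracks what it has already allocated; that bookkeeping is omitted here):

// Illustrative: find a free port from a configured set by attempting to bind it.
package ports

import (
    "errors"
    "net"
)

func firstAvailable(host string, candidates []string) (string, error) {
    for _, port := range candidates {
        l, err := net.Listen("tcp", net.JoinHostPort(host, port))
        if err != nil {
            continue // port is in use or otherwise unavailable
        }
        l.Close()
        return port, nil
    }
    return "", errors.New("no available port in the configured set")
}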

The orchestrator also has access to the RTrade database, which handles all the normal making-sure-a-customer-has-sufficient-currency work and so on, and the orchestrator syncs the state of deployed networks back to it. It also does things like bootstrapping networks on startup that should be online but aren’t. Overall it is fairly straightforward — most of the work is encapsulated within other components, particularly ipfs.NodeClient.
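That startup behaviour is essentially a reconciliation pass: compare the networks the database says should be online against what is actually running, and bring up anything missing. A minimal sketch of the shape of that logic (the function and its inputs are stand-ins, not the orchestrator’s actual code):

// Illustrative reconciliation pass at startup. "expected" would come from the
// RTrade database, "running" from the live registry / NodeClient.Nodes(), and
// networkUp would be something like Orchestrator.NetworkUp. All stand-ins.
package startup

import "context"

func reconcile(
    ctx context.Context,
    expected []string,       // networks the database says should be online
    running map[string]bool, // NetworkIDs of nodes currently deployed
    networkUp func(ctx context.Context, network string) error,
) error {
    for _, network := range expected {
        if running[network] {
            continue // already online, nothing to do
        }
        if err := networkUp(ctx, network); err != nil {
            return err
        }
    }
    return nil
}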

The functionality of the orchestrator is exposed by a gRPC API, which I talk about a bit more in Exposing an API.


Part 2 will cover providing and controlling access to these nodes/instances.

Robert Lin · 📊 more posts and other stuff at bobheadxi.dev
