Published in iss-lab

Ethereum on DC/OS

Automating Private Blockchain Deployment

Automation of blockchain infrastructure is an evolving field. We have created a framework for DC/OS to deploy an Ethereum blockchain on your own cluster.

Who We Are

ISS (Institutional Shareholder Services) is the world’s leading provider of corporate governance and responsible investment solutions. We take pride in helping others by identifying ES&G (environmental, social, and governance) risks. Our team within ISS is the Technology Innovation Lab, and our mission is to enable innovation for our colleagues by augmenting their tools and infrastructure.

Why We Chose DC/OS

Apache Mesos enables a fault-tolerant and scalable infrastructure behind the applications and workloads that run on top of it. One of the core benefits is that it can reduce energy and environmental impacts by scheduling complementary workloads on fewer machines than traditional deployment methodologies. This fits nicely with the ES&G principles we have at ISS. Mesosphere’s DC/OS (Datacenter Operating System) leverages Mesos to create a feature-rich platform for us to build our next-generation solutions better and faster.

Investing in Blockchain

There is sometimes a stigma associated with the term “blockchain.” The technology has elicited divisive reactions, mainly because of cryptocurrencies. There are many examples of people finding clandestine Bitcoin mining tasks on their servers, prompting infrastructure teams to go on high alert. Given problems like these, it’s easy to dismiss blockchain and empathize with its critics.

Despite this storied past, many developers, including us, believe that there is a beautiful uniqueness at the core of the technology that will enable us to do new and exciting things. For this reason, we are willing to take on the uphill battle of convincing our employers and consumers that this technology is worth investing in.

Primer on Ethereum

Before diving into deploying and operating a private Ethereum blockchain network on DC/OS, I’ll first introduce some important terms and concepts.

Ethereum and Geth

For those unfamiliar, Ethereum is a decentralized platform for building applications using blockchain. Geth (a.k.a. go-ethereum) is an official Ethereum client that implements the Ethereum protocol.

Geth can be used to configure and run an Ethereum node in various modes of operation such as:

  • Full Ethereum MainNet
  • TestNet Network
  • Rinkeby Network
  • Private Network

For information about manually running Geth nodes, see the Geth documentation.

Consensus Protocols

The consensus algorithm or protocol used by a blockchain defines the mechanism used to determine what the next block in the chain will be. There are several commonly used protocols (“A (Short) Guide to Blockchain Consensus Protocols”) including “proof-of-work” and “proof-of-stake.” In Ethereum, the two most commonly used are:

  • Ethash: Ethereum’s implementation of the “proof-of-work” protocol in which miners race to complete difficult cryptographic puzzles.
  • Clique: Ethereum’s “proof-of-authority,” essentially an optimized version of the more widely known “proof-of-stake” (“What is Proof of Authority Consensus?”) consensus protocol in which a group of pre-defined validators agree on which transactions and blocks are added to the chain.

For our purposes of running a private network disconnected from the larger Ethereum ecosystem, Clique makes the most sense because it does not require the raw computational and energy resources that are required for “proof-of-work.”
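
To make the idea of pre-defined validators a little more concrete, here is a minimal Python sketch of the rule Clique (as described in EIP-225) uses to decide whose turn it is to seal a block: the authorized signers are sorted by address, and the signer at index (block number mod signer count) is "in turn." The addresses and function names below are placeholders we made up for illustration. This is not Geth's implementation, which also handles signer voting and out-of-turn sealing delays.

```python
# Sketch of Clique's in-turn signer rule (EIP-225); not Geth's actual code.
# The signer addresses are hypothetical placeholders.

DIFF_INTURN = 2  # difficulty of a block sealed by the in-turn signer
DIFF_NOTURN = 1  # difficulty of a block sealed out of turn


def in_turn_signer(signers, block_number):
    """Return the signer whose turn it is to seal the given block."""
    ordered = sorted(s.lower() for s in signers)
    return ordered[block_number % len(ordered)]


def block_difficulty(signers, block_number, signer):
    """Difficulty a block would carry if `signer` sealed it."""
    if signer.lower() == in_turn_signer(signers, block_number):
        return DIFF_INTURN
    return DIFF_NOTURN


def signer_limit(signer_count):
    """A signer may seal only one block out of this many consecutive blocks."""
    return signer_count // 2 + 1


SIGNERS = [
    "0x" + "cc" * 20,
    "0x" + "aa" * 20,
    "0x" + "bb" * 20,
]

if __name__ == "__main__":
    for number in range(1, 5):
        print(number, in_turn_signer(SIGNERS, number))
```

With three sealers, each one is in turn for every third block, and no sealer may seal more than one block in any two consecutive blocks. No cryptographic puzzle is involved, which is why this engine is so much cheaper to run than proof-of-work.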

Sealers, Signers, Clients, and Transaction Nodes

There are two main ways to run a Geth node:

  • Sealer: Also called “signer” and “miner” nodes.
  • Client: Also called “transaction” or “tx” nodes.

Sealer is the term we use for nodes that are configured to mine, or “seal,” blocks into their database. Practically speaking, this means running the geth executable with the --mine argument.

Client nodes connect to all other nodes in a cluster and replicate all of their data. The main responsibility of these nodes is simply to expose RPC API endpoints and handle the load of responding to requests so that sealer node performance is not impacted.

Automating a Private Ethereum Network Deployment

Over the last few months, we have been working on a DC/OS framework that can configure and run a cluster of Geth nodes. It is now open source under the MIT License, and you can find the source code and documentation at github.com/iss-lab/dcos-ethereum. The framework is designed to run on DC/OS, and it is based on the dcos-commons framework SDK.

Cluster Components

A DC/OS framework generally consists of a scheduler and a set of tasks that are started and managed by that scheduler. In dcos-ethereum, the primary tasks are sealer and client Geth nodes, as well as a few others. The diagram below shows the typical topology for a cluster of three sealers and two clients.

Framework Components

Here are the components in more detail:

  • Bootnode: A very minimal geth node which only implements the peer-to-peer communication functionality. Its responsibility is simply to allow other geth nodes to connect to it and inform them of other nodes in the cluster so that they can communicate directly.
  • Ethstats: A visual interface for tracking Ethereum network metrics and statistics. This is designed to be accessed via a web browser. More information can be found at github.com/cubedro/eth-netstats.
  • Sealers: These nodes connect to the bootnode to get peer information about other sealers and clients; they also send metrics to ethstats if configured. Once a sealer is connected to other sealers and clients, blocks from its database are replicated to all other nodes.
  • Clients: These generally behave exactly as sealers do, except without the mining functionality. They additionally expose RPC / API endpoints for external users and applications to consume.

Configuring and Deploying the Cluster

The latest installation instructions for dcos-ethereum can be found in the documentation, but the TL;DR is:

dcos package install ethereum

There are many configuration options, but we’ll only discuss the flags needed to create something like the above component diagram as a private network using the clique consensus algorithm. Following is an options.json example that will produce our desired Ethereum network.

{
  "service": {
    "name": "ethereum"
  },
  "geth": {
    "network_id": 18,
    "syncmode": "full",
    "consensus_engine": "clique",
    "clique_period": 15
  },
  "boot": {
    "count": 1
  },
  "ethstats": {
    "count": 1
  },
  "sealer": {
    "count": 3,
    "args": "--mine --minerthreads=1",
    "rpcapi": "",
    "create_accounts": true
  },
  "client": {
    "count": 2,
    "args": "--metrics",
    "rpcapi": "debug,db,personal,eth,network,web3,net,miner",
    "create_accounts": true
  }
}

Here is a description for the less obvious fields:

  • service.name: Name of the scheduler task running in Marathon on DC/OS.
  • geth.network_id: Chain or Network ID, used in the generated genesis.json as well as the geth executable arguments.
  • geth.syncmode: A sync mode of full means that the entire state will be processed by each node.
  • geth.consensus_engine: Used in the generated genesis.json; see the Consensus Protocols section above for details.
  • geth.clique_period: The number of seconds between blocks getting sealed.
  • sealer.create_accounts / client.create_accounts: Upon startup, the scheduler will create and distribute Ethereum accounts to each client and sealer node.
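
The framework generates genesis.json for you, so you should not need to write one. Still, it can help to see roughly where geth.network_id and geth.clique_period end up. The Python below is a hand-written sketch of a minimal Clique genesis for Geth, not the framework's actual output: the signer addresses, the gas limit, and the function names are placeholders we chose for illustration, and a real file usually also sets fork activation blocks.

```python
import json

# Hand-written sketch of a Clique genesis file for Geth; the framework's
# generated genesis.json may differ. Signer addresses are placeholders.

VANITY_BYTES = 32  # free-form prefix in extraData
SEAL_BYTES = 65    # space for the signature, zeroed in the genesis block


def clique_extra_data(signers):
    """Build extraData: 32 zero bytes + signer addresses + 65 zero bytes."""
    body = ""
    for address in sorted(s.lower() for s in signers):
        hex_part = address[2:] if address.startswith("0x") else address
        if len(hex_part) != 40:
            raise ValueError("expected a 20-byte address: " + address)
        body += hex_part
    return "0x" + "00" * VANITY_BYTES + body + "00" * SEAL_BYTES


def clique_genesis(network_id, clique_period, signers, gas_limit=8_000_000):
    """Map the options.json values onto a minimal genesis structure."""
    return {
        "config": {
            "chainId": network_id,
            # period = seconds between blocks; 30000 is the EIP-225 default epoch
            "clique": {"period": clique_period, "epoch": 30000},
        },
        "difficulty": "0x1",
        "gasLimit": hex(gas_limit),  # placeholder value, tune as needed
        "extraData": clique_extra_data(signers),
        "alloc": {},
    }


SIGNERS = ["0x" + "aa" * 20, "0x" + "bb" * 20, "0x" + "cc" * 20]

if __name__ == "__main__":
    print(json.dumps(clique_genesis(18, 15, SIGNERS), indent=2))
```

The initial sealers are encoded in extraData rather than in a separate field, which is one reason it is convenient to let the scheduler create the accounts and assemble this file.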

Once the options.json is saved, the package can be installed with:

dcos package install ethereum --options options.json

Using the Cluster

For complete CLI usage details, run the following:

dcos ethereum --help

Finding the Nodes

To find the IP addresses and ports that each task is running on, use the endpoints command:

dcos ethereum endpoints

The output should look something like this with default install options:

[
  "geth-boot-p2p-port",
  "geth-client-http-port",
  "geth-client-p2p-port",
  "geth-client-ws-port",
  "geth-ethstats-http-port",
  "geth-sealer-p2p-port"
]

These are the names of each of the endpoints exposed by the framework. The p2p endpoints are for internal peer-to-peer communication in the network. We are most interested in the ws and http endpoints. The following command will give us the internal IP and port for the client node’s http endpoint:

dcos ethereum endpoints geth-client-http-port

The output contains a JSON object with an address and DNS that are available inside the DC/OS cluster:

{
  "address": [
    "<client-node-ip1>:1035",
    "<client-node-ip2>:1032"
  ],
  "dns": [
    "client-0-node.ethereum.autoip.dcos.thisdcos.directory:1035",
    "client-1-node.ethereum.autoip.dcos.thisdcos.directory:1032"
  ]
}

Since it is JSON, we can parse it pretty easily and try connecting:

CLIENT_HTTP=$(dcos ethereum endpoints geth-client-http-port \
  | jq -r '.address[0]')
REQ='{"jsonrpc":"2.0","method":"net_peerCount","params":[],"id":1}'
curl -X POST -H 'Content-Type: application/json' \
  --data "${REQ}" "http://${CLIENT_HTTP}"

The previous curl command executes the net_peerCount RPC method, and the result should look something like this:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": "0x4"
}

This indicates that the client node has four connected peers.
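
If you would rather script this check than use curl and jq, the same steps can be sketched in Python using only the standard library. The function names and the sample IP address are our own placeholders; the request body and the hex-encoded result follow the JSON-RPC exchange shown above. The peer_count function needs network access to a client node, so the demo at the bottom only exercises the parsing helpers.

```python
import json
import urllib.request


def first_address(endpoints_output):
    """Pick the first host:port from `dcos ethereum endpoints <name>` output."""
    return json.loads(endpoints_output)["address"][0]


def rpc_request(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request body as bytes."""
    payload = {
        "jsonrpc": "2.0",
        "method": method,
        "params": params or [],
        "id": request_id,
    }
    return json.dumps(payload).encode("utf-8")


def hex_quantity_to_int(value):
    """Decode an Ethereum hex quantity such as "0x4" into an integer."""
    return int(value, 16)


def peer_count(host_port, timeout=10):
    """Call net_peerCount on a client node; requires access to the node."""
    request = urllib.request.Request(
        "http://" + host_port,
        data=rpc_request("net_peerCount"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return hex_quantity_to_int(json.load(response)["result"])


# Made-up sample in the shape of the endpoints output shown earlier.
SAMPLE_ENDPOINTS = '{"address": ["10.0.0.5:1035"], "dns": []}'

if __name__ == "__main__":
    print(first_address(SAMPLE_ENDPOINTS))
    print(hex_quantity_to_int("0x4"))
```

Numeric results in the Ethereum JSON-RPC API are hex strings, so decoding them is a step that is easy to forget when moving from the console to raw RPC calls.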

Connecting to the Console

Now that we have addresses for the client and sealer nodes, we can use those with tools such as geth attach or truffle console. Geth can be run from docker using this command: docker run -it ethereum/client-go:alltools-latest. Once inside the container, attach to the console of a client node with the following:

geth attach http://${CLIENT_HTTP}

In the console you can now run any of the available commands.

> eth.accounts
["<hex address>"]
> eth.getBalance("<hex address>")
9.04625697166532776746648320380374280103671755200316906558262375061821325312e+74
> eth.gasPrice
18000000000
> eth.mining
false

For more information on geth attach and the JSRE REPL console, visit github.com/ethereum/go-ethereum/wiki/JavaScript-Console.
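
The balance and gas price in the console output above are expressed in wei, the smallest unit of ether, which is why the numbers look so large. One gwei is 10^9 wei and one ether is 10^18 wei. As a small sketch (the helper names are our own), the conversion looks like this in Python:

```python
from decimal import Decimal

WEI_PER_GWEI = 10 ** 9
WEI_PER_ETHER = 10 ** 18


def wei_to_gwei(wei):
    """Convert an integer amount of wei to gwei."""
    return Decimal(wei) / WEI_PER_GWEI


def wei_to_ether(wei):
    """Convert an integer amount of wei to ether.

    Decimal division uses 28 significant digits by default, so very
    large balances are rounded rather than exact.
    """
    return Decimal(wei) / WEI_PER_ETHER


if __name__ == "__main__":
    # eth.gasPrice from the console session above
    print(wei_to_gwei(18000000000), "gwei")
```

So the gas price of 18000000000 reported by the console is 18 gwei.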

Service Discovery and Load-balancing

The endpoints found in the last section are not ideal for external use because they can change if one of the sealer or client node tasks gets re-launched. A better solution would be to maintain a single static endpoint that changes the backend task as needed.

There are several options to achieve this goal. Here we will look at Traefik, a pluggable HTTP reverse proxy written in Go. I’ll skip the installation and only mention the configuration options relevant to our Traefik instance.

The traefik.toml should contain at least these options for our example:

...
[entryPoints]
[entryPoints.gethClientHTTP]
address = ":8500"
[entryPoints.gethClientWS]
address = ":8501"
[entryPoints.gethStats]
address = ":8502"
[mesos]
endpoint = "zk://leader.mesos:2181/mesos"
watch = true
domain = "mesos.localhost"
exposedByDefault = false
ipSources = "host"

Once Traefik is up and running, it will poll the Mesos state looking for tasks with traefik.* configuration labels.

Framework Components with Traefik

To update the DC/OS Ethereum framework in-place with the necessary labels, create an updated options.json:

{
  "ethstats": {
    "labels": "traefik.enable:true,traefik.frontend.entryPoints:gethStats"
  },
  "client": {
    "labels": "traefik.enable:true,traefik.geth-client-http-port.frontend.entryPoints:gethClientHTTP,traefik.geth-client-ws-port.frontend.entryPoints:gethClientWS"
  }
}

For more information on the configurable labels for the Mesos Traefik provider, visit their documentation at docs.traefik.io/configuration/backends/mesos.

Once we have the options.json, we can update with dcos ethereum update start --options=options.json.
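
The label values are long comma-separated strings of key:value pairs, and a stray line break inside one of them makes the JSON invalid. One way to avoid hand-editing them is to generate the file. The following Python is a small sketch of that idea; the helper names are ours, and the label keys and entry point names are the ones used in this example.

```python
import json


def traefik_labels(labels):
    """Join a dict of labels into the "key:value,key:value" string form."""
    return ",".join(f"{key}:{value}" for key, value in labels.items())


def traefik_options():
    """Build the options.json fragment that adds Traefik labels."""
    return {
        "ethstats": {
            "labels": traefik_labels({
                "traefik.enable": "true",
                "traefik.frontend.entryPoints": "gethStats",
            }),
        },
        "client": {
            "labels": traefik_labels({
                "traefik.enable": "true",
                "traefik.geth-client-http-port.frontend.entryPoints": "gethClientHTTP",
                "traefik.geth-client-ws-port.frontend.entryPoints": "gethClientWS",
            }),
        },
    }


if __name__ == "__main__":
    # Redirect the output to options.json before running the update command.
    print(json.dumps(traefik_options(), indent=2))
```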

We need to find the IP for the DC/OS agent running our Traefik instance. This will most often be the public agent in our cluster, and you can find it using the instructions in Mesosphere’s documentation: docs.mesosphere.com/latest/administering-clusters/locate-public-agent.

With Traefik and our updated framework running, we can now access a single address for the load-balanced client node(s):

geth attach http://${DCOS_PUBLIC_AGENT}:8500
> net.peerCount
4

Metrics and Monitoring

The DC/OS Ethereum framework deploys an ethstats node by default, but it is also possible to gather metrics directly from each node in your blockchain.

Image captured from http://ethstats.net/

Using the debug_metrics method from the RPC API, various metrics and statistics can be gathered:

REQ='{"jsonrpc":"2.0","method":"debug_metrics","params":[true],"id":1}'
curl -X POST -H 'Content-Type: application/json' \
  --data "${REQ}" "http://${CLIENT_HTTP}" | jq .result

The output is quite large, but I’ll include a portion of it here:

{
  "chain": {
    "inserts": {
      "AvgRate01Min": 0.07260938302124689,
      "AvgRate05Min": 0.10760446651086283,
      "AvgRate15Min": 0.15616488892140085,
      "MeanRate": 0.07027063637990597,
      "Overall": 26,
      "Percentiles": {
        "20": 366511.4,
        "5": 323385.3,
        "50": 417500,
        "80": 470857.4,
        "95": 791375.5999999994
      }
    }
  },
  ...
}

In our own DC/OS cluster, we use an analytics stack including Grafana, InfluxDB, and Telegraf. We created a Telegraf input plugin that will selectively gather Ethereum metrics for output to other services such as InfluxDB. You can find our fork with that functionality at github.com/iss-lab/telegraf. Our plugin hasn’t made it into the main Telegraf repository yet, but we do have an outstanding pull request.

If you use our plugin, configuration is fairly straightforward:

[[inputs.geth]]
  servers = [
    "http://<ip>:<port>"
  ]
  metrics = [
    "chain",
    "db",
    "discv5",
    "eth",
    "les",
    "p2p",
    "system",
    "trie",
    "txpool"
  ]

The array of metrics can be narrowed down to only the things that you are interested in with syntax like metrics = ["chain.inserts"].
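
To illustrate what a dotted selector such as chain.inserts means against the nested debug_metrics output, here is a short Python sketch. It is not the plugin's implementation (Telegraf plugins are written in Go); the function names and the trimmed sample data are ours, and it only shows the idea of walking the nested result one key at a time.

```python
def lookup(metrics, selector):
    """Follow a dotted path like "chain.inserts"; return None if it is absent."""
    node = metrics
    for part in selector.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def select_metrics(metrics, selectors):
    """Keep only the subtrees named by the given dotted selectors."""
    selected = {}
    for selector in selectors:
        value = lookup(metrics, selector)
        if value is None:
            continue  # selector does not match anything; skip it
        parts = selector.split(".")
        target = selected
        for part in parts[:-1]:
            target = target.setdefault(part, {})
        target[parts[-1]] = value
    return selected


# Trimmed, partly made-up sample in the shape of the debug_metrics result.
SAMPLE = {
    "chain": {"inserts": {"Overall": 26}},
    "txpool": {"pending": 0},
}

if __name__ == "__main__":
    print(select_metrics(SAMPLE, ["chain.inserts"]))
```

Narrowing the selection this way keeps the volume of data written to InfluxDB manageable, since the full debug_metrics result is large.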

Summary

We are excited about blockchain technology and DC/OS, and about continuing to develop the DC/OS Ethereum framework, which allows us to automate the deployment of an internal Ethereum cluster.

The field of blockchain infrastructure is evolving rapidly, and we hope that we can contribute more in this space. In the future we will be sharing more about what you can do with a private blockchain, such as automating the complete lifecycle of decentralized apps written in Solidity and using Truffle.

Technology Innovation Lab at ISS (Institutional Shareholder Services), the world’s leading provider of corporate governance and responsible investment solutions.

Drew Kerrigan

Infrastructure automation and software engineering with a focus on blockchain and AI/ML.