Helm Chart for Fabric
Programming Notes
As I mentioned in a story last week, I’ve been unsuccessful in my attempts to create a Helm Chart for Hyperledger Fabric on Kubernetes. Various approaches are documented on GitHub (v1.2, v1.1) but neither works without some manual hackery. This story is an attempt to summarize my programming notes.
Helm?
Helm has become the go-to choice for deploying many, many Kubernetes applications. IIUC the next version of Helm will include significant changes, including a move away from cluster-side management (Tiller). Because of requirements when building Cloud Marketplace deployments for Kubernetes, I used Helm’s client-only, helm template approach almost from the start.
I’ve used Google’s Cloud Deployment Manager, bash, Jsonnet, and Helm to deploy Kubernetes solutions and I remain unconvinced of the distinctive value that Helm currently provides to application deployment developers. Because it is a de facto standard, though, there are benefits for application end-users in employing it.
Prior Art
My thanks go to various folks at IBM who graciously helped in my adventure. I began this project with a Helm Chart for Fabric that IBM developed for IBM Container Service (link). Subsequently, IBMers Gari and Yacov have been helpful in educating me on Fabric concepts and providing guidance.
ReadWriteMany
The Helm Charts I drafted for Fabric assume there’s a ReadWriteMany volume available to the cluster. Google Persistent Disk does not support ReadWriteMany volumes (directly).
Instead I used a Kubernetes sample that shows how to run NFS atop Persistent Disk. The NFS server is configured as a ReadWriteMany Persistent Volume and Persistent Volume Claims may be made against it.
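For illustration, a minimal sketch of what the NFS-backed volume looks like from the chart’s perspective; the resource names and the use of the NFS server Service’s DNS name here are my assumptions, not taken verbatim from the Kubernetes sample:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: fabric-shared
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteMany
  nfs:
    # DNS name of the Service fronting the NFS server (which itself sits atop Persistent Disk)
    server: nfs-server.default.svc.cluster.local
    path: "/"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fabric-shared
spec:
  accessModes:
  - ReadWriteMany
  # Empty storageClassName so the claim binds to the manually-created PV
  # rather than triggering dynamic provisioning
  storageClassName: ""
  resources:
    requests:
      storage: 10Gi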
For the Fabric Helm Chart, there is a shared claim used to store the output of crypto-config, the genesis block, channel transactions, and for semaphore-like files that are used by the deployment to track state.
This solution works very well. It is kinda hidden but it’s a reliable, familiar and cheap way to get ReadWriteMany with Kubernetes Engine.
Labeling
Helm has a worthwhile set of best practices, including for resource labeling. As I document this, I realize that I’ve either drifted from or not accurately implemented the current guidance (apologies).
I’ve tried to consistently follow this approach:
app.kubernetes.io/name: {{ include "hyperledger-fabric.chart" . }}
app.kubernetes.io/version: {{ .Chart.Version }}
app: {{ include "hyperledger-fabric.name" . }}
chart: {{ include "hyperledger-fabric.chart" . }}
release: {{ .Release.Name }}
heritage: {{ .Release.Service }}
component: peer
org: {{ $orgName }}
peer: {{ $peerID | quote }}

Component reflects the type of Fabric component: peer, orderer, ca etc. It was prudent to also account for Org names and peer IDs using labels. There’s probably a “too many labels” point but this list is far from it. Generally, you should *not* use Kubernetes names as a way to reference resources, mirroring Kubernetes’ own approach to e.g. selecting Pods by their labels, and for similar reasons: dynamism and multi-dimensionality.
ConfigMaps
Fabric’s configuration files crypto-config.yaml and configtx.yaml are natural fits for Kubernetes ConfigMaps and both are reflected as ConfigMaps in the repositories (link, link).
One challenge I faced was wanting to parse these files to extract values during deployment. For example, crypto-config.yaml defines the orderer’s name, the Orgs and their names, and the initial number of peers and users. These values are all necessary to construct an accurate network.
However, while ConfigMaps may be used to host a set of key-value pairs, they don’t permit any hierarchy of values; if values are themselves key-value pairs, those nested values are not directly accessible by way of ConfigMaps. So, while it’s trivial to encode a YAML file as a single ConfigMap value, it’s not possible to reflect arbitrary YAML files as ConfigMap keys.
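To make that concrete, here’s a sketch of configtx.yaml embedded in a ConfigMap; the whole file becomes a single opaque string value (the ConfigMap name is my choice):
apiVersion: v1
kind: ConfigMap
metadata:
  name: fabric-configtx
data:
  # The entire file is one value; nothing inside it (Organizations, Profiles, ...)
  # is addressable as a ConfigMap key.
  configtx.yaml: |
    Organizations:
    - &Org1
      Name: Org1MSP
      ID: Org1MSP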
Unlike ConfigMaps, Helm’s values.yaml may, of course, reflect arbitrary YAML files and any value in the file is directly accessible to Helm templates. For this reason, I manually reflected a subset of the configtx.yaml and crypto-config.yaml file contents in values.yaml so that I could access e.g. the number of peers per org directly. See link.
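For a sense of what that subset looks like, here’s a sketch; the keys mirror crypto-config.yaml’s structure and the counts are illustrative (the chart’s actual values.yaml is at the link above):
# values.yaml (excerpt): a hand-copied subset of crypto-config.yaml
cryptoconfig:
  OrdererOrgs:
  - Name: Orderer
    Domain: example.com
  PeerOrgs:
  - Name: Org1
    Domain: org1.example.com
    Template:
      Count: 2
    Users:
      Count: 1
  - Name: Org2
    Domain: org2.example.com
    Template:
      Count: 2
    Users:
      Count: 1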
This permits, for example, iteration over the peer Orgs defined in crypto-config.yaml in order to create a resource:
{{- range $i, $org := .Values.cryptoconfig.PeerOrgs }}
{{- $orgName := $org.Name | lower }}
{{- end }}

Issue: How to access crypto-config.yaml and configtx.yaml more directly from Helm?
Issue: How are discrepancies between the seemingly shared (are they?) keys of crypto-config.yaml and configtx.yaml reconciled? Why the partitioning?
Issue: Fabric supports YAML anchors & references but these are not supported in values.yaml and must be flattened (or excluded).
Bootstrapping
Link.
There’s some preliminary work that needs to be done before the provisioning of the network may begin. Fortunately these are a linear set of steps, but their successful completion must be recorded. Completion is used asynchronously to block other steps from starting: e.g. it’s not possible to do much of anything (start a peer) until cryptogen has been run.
Neither Kubernetes nor Helm includes a mechanism to orchestrate deployments that require dependencies and branching. This fact led me to an epiphany of disappointment with Helm: its client-only functionality in particular, and (perhaps not ironically) the name of the helm template command, make Helm very much a templating tool rather than a deployment tool.
In summary, if Helm is just templating, I’d prefer to use Jsonnet. While I’m a fan of Golang, I’m less of a fan of Golang templating (even when writing it alongside Go) and, for (JSON sic.) templating, prefer the simplicity of Jsonnet.
With Helm, I’m generating a bunch of Kubernetes manifests, lobbing them at the cluster and letting it (mostly do very little to) coordinate the creation of resources.
Moreover, because neither Helm nor Kubernetes supports dependencies or branching, the developer must resort to obfuscated ways to achieve this. In my case, I took a lead from IBM’s solution and its use of semaphore files. It’s a good solution to the problem but the intent is non-obvious without digging into the manifest files and wondering why files named cryptogen_complete are being touched.
Because the bootstrap steps are one-and-done, I used a Kubernetes Job resource to represent them. I use initContainers frequently across this and other manifests. initContainers, as the name suggests, are preparatory containers that run before the main (containers) event. Interestingly, initContainers are run sequentially in the order they are defined. This provides a convenient way to break up a set of linear steps into something that’s easy to observe from Kubernetes monitoring.
For the bootstrapping we:
1. cryptogen generate
2. configtxgen (genesis block)
3. configtxgen (channel transaction)
4. For each Org: configtxgen (anchor peer transactions)
And so this translates into (pseudo-templating):
initContainers:
# 1st step
- name: cryptogen
  image: cryptogen
  args:
  - generate
# 2nd step
- name: configtxgen-genesis
  image: configtxgen
  args:
  - ...
  - -outputBlock
# 3rd step
- name: configtxgen-channel
  image: configtxgen
  args:
  - ...
  - -outputCreateChannelTx
# 4th step
{{- range $org := .Values.configtx.Organizations }}
- name: anchor-{{ $org.ID | lower }}
  image: configtxgen
  args:
  - ...
  - -outputAnchorPeersUpdate
{{- end }}
containers:
...
Several things to note. The actual containers in this Job are mostly output steps; the action is in the initContainers. I built containers for cryptogen and configtxgen: these binaries are built for Debian|Ubuntu (against glibc) and I wanted to use a lighter-weight runtime environment, like Alpine. Putting these binaries into containers added some clarity to the flow and is more consistent with the other Fabric components that are available as container images.
The main containers in the Job perform configtxgen inspectBlock and inspectChannelCreateTx (review) commands. One container also touches /shared/bootstrapped to act as the semaphore signaling that bootstrapping completed successfully.
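A sketch of those main containers; the artifact paths under /shared are my assumptions about the chart’s layout:
containers:
# Review the generated artifacts; these commands only inspect, they don't mutate
- name: inspect-genesis
  image: configtxgen
  args:
  - -inspectBlock
  - /shared/genesis.block
- name: inspect-channel
  image: configtxgen
  args:
  - -inspectChannelCreateTx
  - /shared/channel.tx
# Semaphore: signals to the rest of the deployment that bootstrapping completed
- name: bootstrapped
  image: busybox
  command:
  - touch
  - /shared/bootstrapped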
Channel Create & Peer Join
Channel creation is another Kubernetes Job. As with other Fabric resources, it blocks on the successful completion of Bootstrapping using an initContainer that checks for the creation of the semaphore file. Initially the Job also blocked on the availability of anchor peers using peer node status but Yacov@IBM pointed out that this was redundant and it was also causing deployment issues. Once the channel is successfully created, another semaphore file is created that signals the Peer Join commands to proceed.
In this Chart, every Peer then joins the Channel. So the Deployment is similar to the Peer creation in that it iterates over the Orgs and then, dynamically, over each of the Peers defined in the Org, and runs the correctly configured peer channel join command.
Then, once per Org, the Channel must be updated with the Org’s Anchor Peer.
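In sketch form, the Fabric CLI commands those Jobs run look like this when expressed as container commands; the channelName value and the file paths under /shared are my assumptions:
# Channel create: runs once
- name: channel-create
  image: hyperledger/fabric-tools
  command:
  - peer
  - channel
  - create
  - --orderer={{ $.Release.Name }}-hyperledger-fabric-orderer:7050
  - --channelID={{ $.Values.channelName }}
  - --file=/shared/channel.tx
# Channel join: runs once per Peer
- name: channel-join
  image: hyperledger/fabric-tools
  command:
  - peer
  - channel
  - join
  # -b: the channel genesis block produced by peer channel create
  - -b
  - /shared/{{ $.Values.channelName }}.block
# Anchor peer update: runs once per Org ($orgName comes from the chart's Org loop)
- name: anchor-peer-update
  image: hyperledger/fabric-tools
  command:
  - peer
  - channel
  - update
  - --orderer={{ $.Release.Name }}-hyperledger-fabric-orderer:7050
  - --channelID={{ $.Values.channelName }}
  - --file=/shared/{{ $orgName }}-anchors.tx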
CA, CLI, Orderer
These resources are straightforward to deploy. They employ a simple pattern that is used across the Fabric node types: a single Pod (albeit managed by a Deployment) is fronted by a Service. The Service provides a DNS name and ports for the Fabric node, e.g. gRPC on 7050.
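As a sketch, the orderer’s Service looks something like this; the labels follow the labeling scheme above and the type is ClusterIP per the “Ports” section below:
apiVersion: v1
kind: Service
metadata:
  name: {{ .Release.Name }}-hyperledger-fabric-orderer
  labels:
    app: {{ include "hyperledger-fabric.name" . }}
    component: orderer
spec:
  type: ClusterIP
  # Select the orderer Pod by its labels, not by its name
  selector:
    app: {{ include "hyperledger-fabric.name" . }}
    component: orderer
  ports:
  - name: grpc
    port: 7050
    targetPort: 7050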
The CLI is a useful tool for debugging, installing and instantiating chaincode. I templatized commands so that I could get back quickly to the instantiate command while trying to resolve why it failed.
Getting onto the CLI:
kubectl exec \
--stdin \
--tty \
$(\
kubectl get pods \
--selector=component=cli,org=org1 \
--output=jsonpath="{ .items[0].metadata.name }" \
--namespace=${NAMESPACE} \
--context=${CONTEXT} \
) \
--container=cli \
--namespace=${NAMESPACE} \
--context=${CONTEXT} \
-- bash

Then:
peer chaincode install \
--name=${NAME} \
--version=${VERSION} \
--path=github.com/chaincode/example02/go/

2018-08-24 [chaincodeCmd] checkChaincodeCmdParams -> INFO 001 Using default escc
2018-08-24 [chaincodeCmd] checkChaincodeCmdParams -> INFO 002 Using default vscc
2018-08-24 [main] main -> INFO 003 Exiting.....
Which can be confirmed with:
peer chaincode list \
--channelID=$CHANNEL_NAME \
--installed

Get installed chaincodes on peer:
Name: ex02, Version: 2.0, Path: github.com/chaincode/example02/go/, Id: 33620f48ee049ffbb7f763f77a3bde54b768ebba9d502532c9bbb98e66fdba3e
2018-08-24 [main] main -> INFO 001 Exiting.....
The challenge was with the instantiation:
peer chaincode instantiate \
--orderer=${RELEASE_NAME}-hyperledger-fabric-orderer:7050 \
--cafile=/shared/.../tlsca.example.com-cert.pem \
--channelID=$CHANNEL_NAME \
--name=${NAME} \
--version=${VERSION} \
--ctor='{"Args":["init","a", "100", "b","200"]}' \
--policy="OR ('Org1MSP.peer','Org2MSP.peer')"Which would value consistently with:
Error: Error endorsing chaincode: rpc error: code = Unknown desc = timeout expired while starting chaincode ex02:2.0(networkid:dev,peerid:org1-peer0,tx:82221c06970c59e6a745c04515446d6776aa2f8899fc1d2a808954c81db07ee1)

Thanks to Yacov@IBM, the issue was determined to be the challenge with Fabric using Docker-in-Docker that I summarize briefly in “Peer(s)” below and more completely in my summary story last week [link].
Peer(s)
Peer creation blocks on the successful completion of the bootstrapping step. Several other Deployment steps similarly block on the bootstrapping. To try to be explicit about this fact, I use initContainers and they simply:
- name: await-bootstrapped
image: busybox
imagePullPolicy: IfNotPresent
command:
- ash
- -c
- |
while [ ! -f /shared/bootstrapped ]; do
echo Awaiting /shared/bootstrapped
sleep 15s
done

I wanted to provision peers based on the intent defined in crypto-config.yaml. As I mentioned above, I copied salient chunks from both crypto-config.yaml and configtx.yaml to achieve this purpose. Thus the peer Deployment comprises an iteration over the [Peer] Orgs and then an iteration for each of the peers:
{{- range $i, $org := .Values.cryptoconfig.PeerOrgs }}
{{- $orgName := $org.Name | lower }}
{{- range $j, $peerID := until ( $org.Template.Count | int) }}
{{- $orgFullName := printf "%s.example.com" $orgName }}
{{- $peerName := printf "peer%d" $peerID }}
{{- $peerFullName := printf "%s.%s" $peerName $orgFullName }}
...
{{- end }}
{{- end }}

Within the inner loop, a Deployment is created for each Peer and a companion Service that exposes the Peer (Pod) to other entities in the Fabric network.
It’s straightforward to take an Org name (e.g. the prototypical Org1) and a peer name (e.g. the prototypical Peer0) and map these into e.g. x-hyperledger-fabric-org1-peer0 and org1.example.com. I’ll defer this discussion to the “Naming” section (see below).
Peers also mount docker.sock. When chaincode instantiate commands are called, the peer takes the installed chaincode and uses Docker-in-Docker to create a Docker image, attach to it and then start it:
IMAGE=$(\
  docker images --format="{{ .Repository }}" \
  | grep dev-org1-peer0-example02-2.0 \
)

docker events --filter=image=${IMAGE}

2018-08-22 container create 132bbe(image=dev-org1-peer0-ex02-2.0)
2018-08-22 container attach 132bbe(image=dev-org1-peer0-ex02-2.0)
2018-08-22 container start 132bbe(image=dev-org1-peer0-ex02-2.0)
However, this results in a container running in the context of the Docker Engine on the host machine (in this case a Kubernetes Node). If the container tries to talk to e.g. x-hyperledger-fabric-org1-peer0, it fails. This name is a Kubernetes DNS name and Kubernetes names aren’t resolvable from outside the cluster. This is one of the outstanding problems that I described in my summary story last week.
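For reference, the docker.sock mount is a hostPath volume; a minimal sketch of how such a mount appears in a peer Deployment (the volume name and mount path are my choices):
# In the peer Pod spec
volumes:
- name: docker-sock
  hostPath:
    path: /var/run/docker.sock
    type: Socket
containers:
- name: peer
  image: hyperledger/fabric-peer
  volumeMounts:
  # CORE_VM_ENDPOINT then points at unix:///host/var/run/docker.sock
  - name: docker-sock
    mountPath: /host/var/run/docker.sock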
Debugging: How to find the Node on which a Pod is running?
PROJECT=[[YOUR-PROJECT]]
NAMESPACE=[[YOUR-NAMESPACE]] # Optional
CONTEXT=[[YOUR-CONTEXT]] # Optional
SELECTOR="component=peer,org=org1,peer=0"gcloud compute ssh $(\
kubectl get pods \
--selector=${SELECTOR} \
--output=jsonpath="{.items[0].spec.nodeName }" \
--namespace=${NAMESPACE} \
--context=${CONTEXT}) \
--project=${PROJECT}
NB This combines a kubectl command that gets a Pod’s Node’s name, which is then used in conjunction with gcloud to ssh into the instance.
Issue: I continue to have a very unclear picture of which environment variables are required, which are optional, and for which the defaults are acceptable for the otherwise straightforward peer node start command; a sketch of some of these variables follows below.
Feature Request: I’d like to see the documentation extended to include the list of environment variables associated with each command.
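For what it’s worth, here’s a sketch of commonly-set peer environment variables; the names are standard CORE_* overrides of Fabric’s core.yaml but the values shown are illustrative rather than taken from the chart:
env:
- name: CORE_PEER_ID
  value: org1-peer0
- name: CORE_PEER_ADDRESS
  value: x-hyperledger-fabric-org1-peer0:7051
- name: CORE_PEER_LOCALMSPID
  value: Org1MSP
- name: CORE_PEER_MSPCONFIGPATH
  value: /shared/crypto-config/peerOrganizations/org1.example.com/peers/peer0.org1.example.com/msp
- name: CORE_VM_ENDPOINT
  value: unix:///host/var/run/docker.sock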
Naming
Naming remains a challenge. Reconciling Helm|Kubernetes naming (best practices) with Fabric’s naming remains unresolved.
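To illustrate the tension, the same peer ends up with (at least) two names. A sketch of how these might be derived in the templates; the x- prefix in earlier examples stands in for the Release name:
# Fabric's own name for the peer, derived from crypto-config.yaml
{{- $peerFullName := printf "peer%d.%s.example.com" $peerID $orgName }}
# The Kubernetes Deployment|Service name, derived from the Helm Release
{{- $serviceName := printf "%s-hyperledger-fabric-%s-peer%d" $.Release.Name $orgName $peerID }}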
Ports
Initially (naively) I generated NodePorts for every Service. If a port is not specified for a --type=NodePort Service, an available port is assigned. I hadn’t thought the process through and tried to calculate NodePorts for the Services myself, spreading these across the port space.
This was a bad idea.
Not only did it create the possibility that I’d need to juggle the port space to ensure that the necessary Orgs, Peers etc. had sufficient ports for their needs but, more importantly, I forgot one tenet of Kubernetes: namely, that I can always look up the NodePorts that are assigned to a Service:
kubectl get services \
--selector=${SELECTOR} \
--output=jsonpath='{.spec.ports[?(@.name=="grpc")].nodePort}' \
--namespace=${NAMESPACE} \
--context=${CONTEXT}

NB In the above, JSONPath is used to filter the array of ports for the one named grpc; its nodePort value is the result.
But, on reflection, I realized that the majority of Services did not need to be exposed beyond the cluster in the prototype and, for this reason, I was able to drop the NodePorts entirely and use ClusterIPs.
When I need to expose Fabric nodes externally, I will leave the cluster to auto-assign NodePorts and then use queries like the above to determine a specific resource’s ports.
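A sketch of how that might look in a peer Service template; peer.serviceType is a hypothetical values key, not one the chart currently defines:
spec:
  # ClusterIP by default; set peer.serviceType=NodePort to expose the node outside the cluster
  type: {{ .Values.peer.serviceType | default "ClusterIP" }}
  ports:
  - name: grpc
    port: 7051
    # nodePort left unset so Kubernetes auto-assigns one from its NodePort range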
Logs
Logging is super important and being able to quickly grab logs from any Pod|container is easy with Kubernetes Engine. Once again, I was repeatedly reviewing logs during debugging and benefited from templating the command:
PROJECT=
CLUSTER=
REGION=
NAMESPACE=
CONTEXT=COMPONENT="peer"
ORG="org1"
PEER="2"POD=$(\
kubectl get pods \
--selector=component=${COMPONENT},org=${ORG},peer=${PEER} \
--namespace=${NAMESPACE} \
--context=${CONTEXT} \
--output=jsonpath="{.items[].metadata.name}"\
) && echo ${POD}

AFTER=$(date --rfc-3339=s --date="1 hour ago" | sed "s| |T|")
BEFORE=$(date --rfc-3339=s | sed "s| |T|")

FILTER="resource.type=\"k8s_container\" "\
"resource.labels.location=\"${REGION}\" "\
"resource.labels.cluster_name=\"${CLUSTER}\" "\
"resource.labels.pod_name=\"${POD}\" "\
"timestamp>=\"${AFTER}\" "\
"timestamp<=\"${BEFORE}\""gcloud logging read "${FILTER}" \
--project=${PROJECT} \
--format=json \
--order=asc \
| jq --raw-output '.[].textPayload | rtrimstr("\n")'
So, phew! But it’s helpful, I assure you ;-)
The top half merely defines constants. Once again, Kubernetes’ extensive use of labels really helps in filtering resources; here I was filtering through various Peer nodes. kubectl is used to identify the runtime name of the Pod of interest.
This is then incorporated into the lengthy filter string provided to gcloud logging. Essentially, we provide the cluster’s spec (name, region), the name of the Pod and, in this case, define timestamps covering the last hour. This could be replaced with the --freshness flag but that only supports descending time order (which is less useful).
Finally, we put everything together and pull the logs matching the filter as JSON, which is then piped into jq; the textPayload is extracted and the superfluous newlines included (by Fabric?) in the logs are trimmed:
[nodeCmd] serve -> INFO 001 Starting peer:
Version: 1.1.0
Go version: go1.9.2
OS/Arch: linux/amd64
Experimental features: false
Chaincode:
Base Image Version: 0.4.6
Base Docker Namespace: hyperledger
Base Docker Label: org.hyperledger.fabric
Docker Namespace: hyperledger
[ledgermgmt] initialize -> INFO 002 Initializing ledger mgmt
[kvledger] NewProvider -> INFO 003 Initializing ledger provider
[kvledger] NewProvider -> INFO 004 ledger provider Initialized
[ledgermgmt] initialize -> INFO 005 ledger mgmt initialized
[peer] func1 -> INFO 006 Auto-detected peer address: 10.0.2.8:7051
[peer] func1 -> INFO 007 Returning x-hyperledger-fabric-org1-peer0:7051
[peer] func1 -> INFO 008 Auto-detected peer address: 10.0.2.8:7051
[peer] func1 -> INFO 009 Returning x-hyperledger-fabric-org1-peer0:7051
[eventhub_producer] start -> INFO 00a Event processor started

Conclusion
I’ll continue to tweak this story with additional information as I review my programming notes. I hope this is helpful|insightful for others.
That’s all!
