Cloudprober: gRPC Probes & Kubernetes

I’ve written a simple gRPC-based extension to Cloudprober. A probe ‘proxy’ is created by Cloudprober and this communicates with a probe over gRPC.

One challenge with deploying Cloudprober to Kubernetes is that Cloudprober spawns probe processes and communicates with them using stdin/stdout. This tight coupling presents challenges when containerizing Cloudprober. The simplest solution is to containerize Cloudprober together with a probe, but this may be limiting.

After mistakenly detouring through an attempt to use pipes to bridge a containerized Cloudprober with a containerized probe, I realized a good solution would be to use gRPC and leverage the protobufs already defined by Cloudprober.

Setup

Please follow the Cloudprober Getting Started. Later in this post, you can deploy the solution to Kubernetes. I assume you’re able to deploy your own cluster.

Code

Let’s start with the probeserver.proto:

Cloudprober’s serverutils defines a Serve function that takes ProbeRequest and returns ProbeReplies. This is the mechanism by which Cloudprober interacts with probes that it spawns (locally). We’ll revise that signature to reflect a gRPC server receiving ProbeRequest messages and returning ProbeReply messages.

NB This proto imports the proto defined by Cloudprober’s serverutils. You will need to have the Cloudprober source (or at least the server.proto) locally and it should be accessible on the proto_path defined below. Also, retain the double directory of probeserver/probeserver/probeserver.proto; gRPC is a little ‘challenging’ in this regard.

To generate the Golang:

protoc \
--proto_path=${GOPATH}/src \
-I=probeserver/ \
probeserver/probeserver.proto \
--go_out=plugins=grpc:probeserver

For consistency, I recommend you create probeserver/client and probeserver/server directories too.

The client main.go is here:

NB Replace {YOUR_GITHUB_ACCOUNT} with your GitHub account ID.

I’ve included lots of log.Print statements here. Feel free to purge these. This code is drawn from the gRPC “Hello World” sample. If there are better ways to implement this, please let me know.

This code represents a general-purpose proxy to a Cloudprober probe and uses gRPC to communicate with the probe. It uses the code generated from the Protobuf to communicate with the server. PROBE_HOST and PROBE_PORT default to localhost and 50051. We’ll use these when we containerize the app and run it on Kubernetes.
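The defaulting of PROBE_HOST and PROBE_PORT can be sketched as follows. This is a minimal illustration rather than the post's actual client code: the probeAddr helper and its lookup parameter are hypothetical names, chosen so the logic can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// probeAddr builds the host:port that the proxy dials.
// lookup is normally os.Getenv; it is a parameter (an assumption of this
// sketch) so the defaults can be checked without changing the environment.
func probeAddr(lookup func(string) string) string {
	host := lookup("PROBE_HOST")
	if host == "" {
		host = "localhost" // default described in the post
	}
	port := lookup("PROBE_PORT")
	if port == "" {
		port = "50051" // default described in the post
	}
	return net.JoinHostPort(host, port)
}

func main() {
	// With neither variable set this prints localhost:50051.
	fmt.Println(probeAddr(os.Getenv))
}
```

Running it as PROBE_PORT=9999 go run main.go should print localhost:9999, which is the address you would then hand to the gRPC dial call.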

The code takes the ProbeRequest provided by Cloudprober and relays it to the remote gRPC server (also as a ProbeRequest). It takes the ProbeReply from the gRPC server, creates a new ProbeReply from it, and passes the new ProbeReply back to Cloudprober.
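The relay step might look something like the sketch below. The ProbeRequest and ProbeReply structs are simplified stand-ins for the generated protobuf types, remoteProbe stands in for the generated gRPC client, and fakeRemote replaces a real network call; all of these names are assumptions for illustration. Bounding the call by the request's time limit is also my assumption, not something the original code is known to do.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Simplified stand-ins (assumptions) for the generated protobuf messages.
type ProbeRequest struct {
	RequestID int32
	TimeLimit int32 // milliseconds
}

type ProbeReply struct {
	RequestID    int32
	ErrorMessage string
	Payload      string
}

// remoteProbe stands in for the generated gRPC client (hypothetical name).
type remoteProbe interface {
	Serve(ctx context.Context, req *ProbeRequest) (*ProbeReply, error)
}

// relay forwards req to the remote probe and copies the result into reply,
// which is the value handed back to Cloudprober.
func relay(remote remoteProbe, req *ProbeRequest, reply *ProbeReply) {
	// Assumption: bound the remote call by the probe's time limit.
	limit := time.Duration(req.TimeLimit) * time.Millisecond
	ctx, cancel := context.WithTimeout(context.Background(), limit)
	defer cancel()

	reply.RequestID = req.RequestID
	got, err := remote.Serve(ctx, req)
	if err != nil {
		// Surface transport errors to Cloudprober as a failed probe.
		reply.ErrorMessage = err.Error()
		return
	}
	reply.Payload = got.Payload
	reply.ErrorMessage = got.ErrorMessage
}

// fakeRemote replaces the network for this sketch.
type fakeRemote struct{ fail bool }

func (f fakeRemote) Serve(ctx context.Context, req *ProbeRequest) (*ProbeReply, error) {
	if f.fail {
		return nil, errors.New("probe unreachable")
	}
	return &ProbeReply{RequestID: req.RequestID, Payload: "duration_ms 541"}, nil
}

func main() {
	reply := &ProbeReply{}
	relay(fakeRemote{}, &ProbeRequest{RequestID: 31, TimeLimit: 2500}, reply)
	fmt.Printf("request_id=%d payload=%q\n", reply.RequestID, reply.Payload)
}
```

Copying the error into the reply, rather than dropping it, means a remote failure shows up in Cloudprober as a failed probe rather than a silent gap.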

The server main.go is here:

NB Replace {YOUR_GITHUB_ACCOUNT} with your GitHub account ID.

The server represents a specific implementation of a Probe called probe. It uses the code generated from the Protobuf to communicate with the client. PROBE_PORT defaults to 50051. We’ll use this when we deploy the code to Kubernetes.

As if we were developing a regular (server) Probe for Cloudprober, the server implements a probe. This probe sleeps for a random period of time and then returns this delay as the probe’s measure duration_ms.
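A probe of this kind can be sketched in a few lines. This is an illustration under assumptions, not the post's server code: the maxMs parameter and the tuple return are hypothetical, and the payload uses the "name value" form that appears in the server log output in this post.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// probe sleeps for a random number of milliseconds in [0, maxMs) and returns
// the payload line plus the delay it used. maxMs must be greater than zero,
// because rand.Intn panics otherwise.
func probe(maxMs int) (string, int) {
	ms := rand.Intn(maxMs)
	time.Sleep(time.Duration(ms) * time.Millisecond)
	// Payload is a metric name followed by its value.
	return fmt.Sprintf("duration_ms %d", ms), ms
}

func main() {
	payload, _ := probe(1000)
	fmt.Println(payload)
}
```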

Unlike when developing a regular Probe for Cloudprober, rather than call probe using serverutils.Serve, we’ll instead use the gRPC service Probe's Serve method. Serve, as you may recall, accepts a ProbeRequest, calls our probe (probe), and returns the result as a ProbeReply.
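The shape of that Serve method is roughly as follows. Again this is a sketch with stand-in types: ProbeRequest and ProbeReply approximate the generated messages, the server struct and serveOnce helper are hypothetical, and no gRPC server is actually registered or started here.

```go
package main

import (
	"context"
	"fmt"
)

// Simplified stand-ins (assumptions) for the generated protobuf messages.
type ProbeRequest struct {
	RequestID int32
	TimeLimit int32 // milliseconds
}

type ProbeReply struct {
	RequestID    int32
	ErrorMessage string
	Payload      string
}

// server wraps a probe function. In the real program a type like this would
// be registered with a gRPC server; that wiring is omitted.
type server struct {
	probe func(req *ProbeRequest, reply *ProbeReply)
}

// Serve has the shape of a unary gRPC handler: one request in, one reply out.
func (s *server) Serve(ctx context.Context, req *ProbeRequest) (*ProbeReply, error) {
	reply := &ProbeReply{RequestID: req.RequestID} // echo the ID so replies can be matched
	s.probe(req, reply)
	return reply, nil
}

// serveOnce pushes one request through a server whose probe returns a fixed payload.
func serveOnce(id int32) (*ProbeReply, error) {
	s := &server{probe: func(req *ProbeRequest, reply *ProbeReply) {
		reply.Payload = "duration_ms 541"
	}}
	return s.Serve(context.Background(), &ProbeRequest{RequestID: id, TimeLimit: 2500})
}

func main() {
	reply, err := serveOnce(31)
	fmt.Printf("request_id=%d payload=%q err=%v\n", reply.RequestID, reply.Payload, err)
}
```

Keeping the probe as a plain function field means the same probe body could, in principle, be served either locally or over gRPC.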

OK.

Local Testing

To run the client, we’ll need a cloudprober.local.cfg:

Place this file in the probeserver/client directory. Assuming you’ve installed and built Cloudprober per the instructions on its site, you can run the client:

cloudprober --config_file=./cloudprober.local.cfg --logtostderr

To run the probe (server), switch to the probeserver/server directory in another terminal session and run:

go run main.go

To test PROBE_PORT, prefix both commands with the same setting, e.g.

PROBE_PORT=9999 go run main.go

You’ll see many lines of output but — for the client — the key lines are of the form:

cloudprober probe=grpc_probe,dst= success=1 total=31 latency=...
cloudprober probe=grpc_probe,dst= duration_ms=541

These reflect the probe name grpc_probe that was provided in the cloudprober.local.cfg file, that the request succeeded (success) and that the remote probe’s random sleep (in this case) was 541 milliseconds. Your values will differ.

For the server, you’ll see matching output:

[Serve] request=request_id:31 time_limit:2500 ; reply=<nil>
[Serve] Probe replies: duration_ms 541
[Serve] request=request_id:31 time_limit:2500 ; reply=request_id:31 payload:"duration_ms 541"

While Cloudprober is running, it exposes a Prometheus endpoint on 9313. You can observe this by curl’ing or browsing localhost:9313/metrics. Among other measures (!), you should also see the “duration_ms” metric represented:

#TYPE success counter
success{ptype="external",probe="grpc_probe",dst=""} 31 1523142921115
#TYPE total counter
total{ptype="external",probe="grpc_probe",dst=""} 32 1523142921115
#TYPE latency counter
latency{ptype="external",probe="grpc_probe",dst=""} 2880611.175 1523142921115
#TYPE duration_ms counter
duration_ms{ptype="external",probe="grpc_probe",dst=""} 541 1523142921115
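If you would rather check the metric programmatically than eyeball the curl output, a small parser is enough. The sketch below is only an illustration: metricValue is a hypothetical helper, the sample text is copied from the output above, and it handles just the simple one-sample-per-line form shown here, not the full Prometheus exposition format.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// metricValue scans metrics text and returns the value of the first sample
// called name that carries the label probe="<probe>".
func metricValue(body, name, probe string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blanks and TYPE/HELP comments
		}
		if !strings.HasPrefix(line, name+"{") {
			continue
		}
		if !strings.Contains(line, `probe="`+probe+`"`) {
			continue
		}
		// After the closing brace come the value and an optional timestamp.
		rest := line[strings.LastIndex(line, "}")+1:]
		fields := strings.Fields(rest)
		if len(fields) == 0 {
			continue
		}
		return fields[0], true
	}
	return "", false
}

// sample is taken from the endpoint output shown in this post.
const sample = `#TYPE success counter
success{ptype="external",probe="grpc_probe",dst=""} 31 1523142921115
#TYPE duration_ms counter
duration_ms{ptype="external",probe="grpc_probe",dst=""} 541 1523142921115
`

func main() {
	v, ok := metricValue(sample, "duration_ms", "grpc_probe")
	fmt.Println(v, ok) // prints "541 true"
}
```

In practice the body would come from an HTTP GET against localhost:9313/metrics instead of the embedded sample.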

Containerize

To deploy to Kubernetes, we’ll need to containerize Cloudprober with the gRPC proxy and a probe. We’ll use the test probe above for the latter.

There are various ways to create optimal Golang containers. I like Nick’s approach: build a static binary and use FROM scratch. I’m liking and using dumb-init consistently too. If you follow this path, don’t forget to grab your machine’s ca-certificates.crt too.

Here’s the client’s Dockerfile:

NB the Dockerfile references a different cloudprober.docker.cfg to compensate for the use of the binary and changes in the path. Here’s that file:

I’m assuming you’ve created containers called gcr.io/${PROJECT}/cloudprober:grpc-client and gcr.io/${PROJECT}/cloudprober:grpc-server.

My apologies for swapping “_” and “-” on you but, for Kubernetes naming, let’s prefer “-” for container stuff.

Container Testing

Measure twice, cut once…

Let’s retest now that both client and server are containerized. To be sure, let’s use a different port:

PROBE_PORT=7777

For the client:

docker run \
--interactive \
--rm \
--net=host \
--env=PROBE_PORT=${PROBE_PORT} \
gcr.io/${PROJECT}/cloudprober:grpc-client

For the server:

docker run \
--interactive \
--tty \
--rm \
--publish=${PROBE_PORT}:${PROBE_PORT} \
--env=PROBE_PORT=${PROBE_PORT} \
gcr.io/${PROJECT}/cloudprober:grpc-server

All being well, you should see a similar result to before. Because the client uses --net=host, you should be able to curl the Prometheus endpoint for the client as before.

[Container] Registry

I’m going to assume your containers are called cloudprober:grpc-client and cloudprober:grpc-server. If you’re using Kubernetes Engine, I recommend you push these to Container Registry in the same project as your cluster. Regardless, push them to a repository that your cluster can access.

Kubernetes, Kubernetes, Kubernetes

The probe may be deployed either as a sidecar (alongside the Cloudprober) or independently with the probe running as a service.

Let’s do both:

Sidecar

Both containers in a single Pod. For simplicity, let’s leave the connection to the default PROBE_PORT too. For giggles, you can work out how to use a different PROBE_PORT in the sidecar.

We’ll need a deployment file and, avoiding my customary laziness, we’ll deploy to its own namespace, cloudprober-sidecar.

kubectl create namespace cloudprober-sidecar

Here’s a deployment file; you’ll need to replace ${PROJECT}:

and then:

kubectl apply --filename=deployment.sidecar.yaml

You should have a deployment, and a pod with 2 containers: grpc-client and grpc-server. Here’s the Kubernetes dashie:

Kubernetes Dashboard

And Cloud Console:

Cloud Console

And, you can check the logs:

SIDECAR_POD=$(\
kubectl get pods \
--namespace=cloudprober-sidecar \
--output=jsonpath="{.items[0].metadata.name}")
kubectl logs pods/${SIDECAR_POD} grpc-client \
--namespace=cloudprober-sidecar
kubectl logs pods/${SIDECAR_POD} grpc-server \
--namespace=cloudprober-sidecar

And you can check the Prometheus metrics endpoint:

NODE_HOST=$(\
kubectl get nodes \
--output=jsonpath="{.items[0].metadata.name}")
SIDECAR_PORT=$(\
kubectl get services/sidecar \
--namespace=cloudprober-sidecar \
--output=jsonpath="{.spec.ports[0].nodePort}")
gcloud compute ssh ${NODE_HOST} \
--project=${PROJECT} \
--ssh-flag="-L ${SIDECAR_PORT}:localhost:${SIDECAR_PORT}"

NB You may notice that I did, in fact, use a different PROBE_PORT in the sidecar ;-)

You can find the metrics endpoint on localhost:${SIDECAR_PORT}/metrics.

Service

kubectl create namespace cloudprober-service

Here’s a deployment file; you’ll need to replace ${PROJECT}:

Deploy using:

kubectl apply --filename=deployment.service.yaml

This is more complex. It deploys the grpc-client and grpc-server separately, each with an associated service. The grpc-client accesses grpc-server through the Kubernetes DNS name grpc-server.cloudprober-service.svc.cluster.local. Once again, we’re binding gRPC to port 8888 rather than its default.
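Kubernetes Service DNS names normally follow the pattern service.namespace.svc.cluster-domain, so the value handed to the client as PROBE_HOST can be derived from the service and namespace names. Here is a minimal sketch, assuming the default cluster domain of cluster.local; serviceAddr is a hypothetical helper, not part of the client code.

```go
package main

import (
	"fmt"
	"net"
)

// serviceAddr returns the in-cluster address of a Kubernetes Service.
// Assumption: the cluster uses the default DNS domain "cluster.local".
func serviceAddr(service, namespace, port string) string {
	host := fmt.Sprintf("%s.%s.svc.cluster.local", service, namespace)
	return net.JoinHostPort(host, port)
}

func main() {
	// prints "grpc-server.cloudprober-service.svc.cluster.local:8888"
	fmt.Println(serviceAddr("grpc-server", "cloudprober-service", "8888"))
}
```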

Here’s the Kubernetes dashie:

Kubernetes Dashboard
Cloud Console

Let’s use Cloud Logging this time:

Google Cloud Console: Logging

A handy trick is to draft a filter in the Cloud Console and then reuse it on the command line:

FILTER="resource.type=\"container\" "\
"resource.labels.cluster_name=\"${CLUSTER}\" "\
"resource.labels.namespace_id=\"cloudprober-service\" "\
"logName=\"projects/${PROJECT}/logs/grpc-client\" "\
"textPayload:\"labels=ptype=external,probe=grpc_client,dst=\""
gcloud logging read "${FILTER}" \
--project=$PROJECT \
--format=json \
| jq .[].textPayload

Using NODE_HOST and SIDECAR_PORT from above, you can then add SERVICE_PORT using:

SERVICE_PORT=$(\
kubectl get services/grpc-client \
--namespace=cloudprober-service \
--output=jsonpath="{.spec.ports[0].nodePort}")
gcloud compute ssh ${NODE_HOST} \
--project=${PROJECT} \
--ssh-flag="-L ${SIDECAR_PORT}:localhost:${SIDECAR_PORT}" \
--ssh-flag="-L ${SERVICE_PORT}:localhost:${SERVICE_PORT}"

and access this service’s metrics endpoint via localhost:${SERVICE_PORT}/metrics.

Conclusion

Cloudprober is neat. In truth, while I hope this is useful, it’s unclear to me whether any of it is useful to anyone else. If it is, I think the correct next step would be to fold the proxy code into Cloudprober itself, perhaps as another external_probe mode (say, PROXY) that includes something similar to the gRPC client code outlined here.

Feedback and suggestions always welcome.

That’s all!