Cloudprober: gRPC Probes & Kubernetes
I’ve written a simple gRPC-based extension to Cloudprober: Cloudprober creates a probe ‘proxy’ which communicates with the actual probe over gRPC.
One challenge with deploying Cloudprober to Kubernetes is that Cloudprober spawns probe processes and communicates with them using stdin/stdout. This tight coupling presents challenges when containerizing Cloudprober. The simplest solution is to containerize Cloudprober together with a probe, but this may be limiting.
After a misguided detour attempting to use pipes to bridge a containerized Cloudprober with a containerized probe, I realized a good solution would be to use gRPC and leverage the protobufs already defined by Cloudprober.
Setup
Please follow the Cloudprober Getting Started guide. Later in this post, you can deploy the solution to Kubernetes; I assume you’re able to deploy your own cluster.
Code
Let’s start with the probeserver.proto:
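A minimal sketch of that file follows. The import path and the message package names are assumptions based on Cloudprober’s external-probe proto; adjust them to match your checkout of the Cloudprober source:

```proto
syntax = "proto2";

package probeserver;

// Assumption: the location of Cloudprober's server.proto relative to
// the proto_path (${GOPATH}/src); adjust to your checkout.
import "github.com/google/cloudprober/probes/external/serverutils/server.proto";

// Probe exposes Cloudprober's Serve mechanism as a gRPC service: one
// RPC that accepts a ProbeRequest and returns a ProbeReply.
service Probe {
  rpc Serve (cloudprober.probes.external.ProbeRequest)
      returns (cloudprober.probes.external.ProbeReply) {}
}
```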
Cloudprober’s serverutils defines a Serve function that takes a ProbeRequest and returns ProbeReplies. This is the mechanism by which Cloudprober interacts with probes that it spawns (locally). We’ll revise that signature to reflect a gRPC server receiving ProbeRequest messages and returning ProbeReply messages.
NB This proto imports the proto defined by Cloudprober’s serverutils. You will need to have the Cloudprober source (or at least server.proto) locally, and it should be accessible on the proto_path defined below. Also, retain the double directory of probeserver/probeserver/probeserver.proto; gRPC is a little ‘challenging’ in this regard.
To generate the Golang:
protoc \
--proto_path=${GOPATH}/src \
-I=probeserver/ \
probeserver/probeserver.proto \
--go_out=plugins=grpc:probeserver
For consistency, I recommend you create probeserver/client and probeserver/server directories too.
The client main.go is here:
NB Replace {YOUR_GITHUB_ACCOUNT} with your GitHub account ID.
I’ve included lots of log.Print statements here. Feel free to purge these. This code is drawn from the gRPC “Hello World” sample. If there are better ways to implement this, please let me know.
This code represents a general-purpose proxy to a Cloudprober probe: it uses the gRPC code generated from the protobuf to communicate with the remote probe server. PROBE_HOST and PROBE_PORT default to localhost and 50051. We’ll use these when we containerize the app and run it on Kubernetes.
The code takes the ProbeRequest provided by Cloudprober and relays it to the remote gRPC server (also as a ProbeRequest). It takes the ProbeReply from the gRPC server, creates a new ProbeReply from it, and passes the new ProbeReply back to Cloudprober.
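Stripped of the gRPC plumbing, that relay step can be sketched like this; the struct fields are simplified stand-ins for the generated proto types, and the remote probe is modeled as a plain function:

```go
package main

import "fmt"

// Simplified stand-ins for the types generated from Cloudprober's
// external-probe proto.
type ProbeRequest struct {
	RequestId int32
	TimeLimit int32
}

type ProbeReply struct {
	RequestId int32
	Payload   string
}

// relay forwards Cloudprober's ProbeRequest to the remote probe and
// builds a fresh ProbeReply for Cloudprober, preserving the request ID
// so Cloudprober can match the reply to its request.
func relay(req *ProbeRequest, remote func(*ProbeRequest) *ProbeReply) *ProbeReply {
	remoteReply := remote(req)
	return &ProbeReply{
		RequestId: req.RequestId,
		Payload:   remoteReply.Payload,
	}
}

func main() {
	// Stand-in for the gRPC call to the remote probe server.
	fake := func(r *ProbeRequest) *ProbeReply {
		return &ProbeReply{RequestId: r.RequestId, Payload: "duration_ms 541"}
	}
	reply := relay(&ProbeRequest{RequestId: 31, TimeLimit: 2500}, fake)
	fmt.Printf("request_id:%d payload:%q\n", reply.RequestId, reply.Payload)
	// → request_id:31 payload:"duration_ms 541"
}
```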
The server main.go is here:
NB Replace {YOUR_GITHUB_ACCOUNT} with your GitHub account ID.
The server represents a specific implementation of a probe, called probe. It uses the code generated from the protobuf to communicate with the client (the Cloudprober-side proxy). PROBE_PORT defaults to 50051. We’ll use this when we deploy the code to Kubernetes.
As if we were developing a regular (server-mode) probe for Cloudprober, the server implements a probe function. This probe sleeps for a random period of time and then returns this delay as the probe’s measure, duration_ms.
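A minimal sketch of such a probe function (the bound on the random sleep is an assumption; the real server returns this string as the ProbeReply payload):

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// probe sleeps for a random period below maxMs milliseconds and
// reports that delay back as the probe's duration_ms measure.
func probe(maxMs int) string {
	delay := time.Duration(rand.Intn(maxMs)) * time.Millisecond
	time.Sleep(delay)
	return fmt.Sprintf("duration_ms %d", delay.Milliseconds())
}

func main() {
	fmt.Println(probe(1000))
}
```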
Unlike when developing a regular probe for Cloudprober, rather than calling probe via serverutils.Serve, we instead serve it through the gRPC service Probe’s Serve method. Serve, as you may recall, accepts a ProbeRequest, calls our probe (probe), and returns the result as a ProbeReply.
OK.
Local Testing
To run the client, we’ll need a cloudprober.local.cfg:
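A minimal config along these lines should do; the probe name and timeout match the output shown below, while the command is an assumption (mode SERVER tells Cloudprober to keep the probe process alive and exchange ProbeRequest/ProbeReply messages with it over stdin/stdout):

```
probe {
  name: "grpc_probe"
  type: EXTERNAL
  interval_msec: 5000
  timeout_msec: 2500
  external_probe {
    mode: SERVER
    command: "go run main.go"
  }
}
```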
Place this file in the probeserver/client directory. Then — assuming you’ve installed and built Cloudprober per the instructions on its site — you can run the client:
cloudprober --config_file=./cloudprober.local.cfg --logtostderr
To run the probe (server) — from another terminal session — switch to the probeserver/server directory and run:
go run main.go
To test PROBE_PORT, prefix both commands with the same setting, e.g.
PROBE_PORT=9999 go run main.go
You’ll see many lines of output but — for the client — the key lines are of the form:
cloudprober probe=grpc_probe,dst= success=1 total=31 latency=...
cloudprober probe=grpc_probe,dst= duration_ms=541
These reflect the probe name grpc_probe that was provided in the cloudprober.local.cfg file, that the request succeeded (success), and that the remote probe’s random sleep (in this case) was 541 milliseconds. Your values will differ.
For the server, you’ll see matching output:
[Serve] request=request_id:31 time_limit:2500 ; reply=<nil>
[Serve] Probe replies: duration_ms 541
[Serve] request=request_id:31 time_limit:2500 ; reply=request_id:31 payload:"duration_ms 541"
While Cloudprober is running, it exposes a Prometheus endpoint on port 9313. You can observe this by curling or browsing localhost:9313/metrics. Among other measures (!), you should also see the “duration_ms” metric represented:
# TYPE success counter
success{ptype="external",probe="grpc_probe",dst=""} 31 1523142921115
# TYPE total counter
total{ptype="external",probe="grpc_probe",dst=""} 32 1523142921115
# TYPE latency counter
latency{ptype="external",probe="grpc_probe",dst=""} 2880611.175 1523142921115
# TYPE duration_ms counter
duration_ms{ptype="external",probe="grpc_probe",dst=""} 541 1523142921115
Containerize
To deploy to Kubernetes, we’ll need to containerize Cloudprober with the gRPC proxy and a probe. We’ll use the test probe above for the latter.
There are various ways to create optimal Golang containers. I like Nick’s approach: build a static binary and FROM scratch. I’m liking and using dumb-init consistently too. If you follow this path, don’t forget to grab your machine’s ca-certificates.crt too.
Here’s the client’s Dockerfile:
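A sketch of what that Dockerfile might look like, assuming statically built cloudprober and client binaries (all file names and paths here are assumptions):

```dockerfile
# Static binaries in a scratch image, per the approach described above.
FROM scratch

# dumb-init as PID 1, the static cloudprober binary, the gRPC client
# probe, and the container-specific config.
ADD dumb-init /
ADD cloudprober /
ADD client /
ADD cloudprober.docker.cfg /

# TLS roots, copied from the build machine.
ADD ca-certificates.crt /etc/ssl/certs/

ENTRYPOINT ["/dumb-init", "/cloudprober", "--config_file=/cloudprober.docker.cfg", "--logtostderr"]
```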
NB the Dockerfile references a different cloudprober.docker.cfg to compensate for the use of the binary and changes in the path. Here’s that file:
I’m assuming you’ve created containers called gcr.io/${PROJECT}/cloudprober:grpc-client and gcr.io/${PROJECT}/cloudprober:grpc-server.
My apologies for swapping “_” and “-” on you but, for Kubernetes naming, let’s prefer “-” for container stuff.
Container Testing
Measure twice, cut once…
Let’s retest now that both client and server are containerized. To be sure, let’s use a different port:
PROBE_PORT=7777
For the client:
docker run \
--interactive \
--rm \
--net=host \
--env=PROBE_PORT=${PROBE_PORT} \
gcr.io/${PROJECT}/cloudprober:grpc-client
For the server:
docker run \
--interactive \
--tty \
--rm \
--publish=${PROBE_PORT}:${PROBE_PORT} \
--env=PROBE_PORT=${PROBE_PORT} \
gcr.io/${PROJECT}/cloudprober:grpc-server
All being well, you should see a similar result as before. Because the client runs with --net=host, you should be able to curl its Prometheus endpoint as before.
[Container] Registry
I’m going to assume your containers are called cloudprober:grpc-client and cloudprober:grpc-server. If you’re using Kubernetes Engine, I recommend you push these to Container Registry in the same project as your cluster. Regardless, push them to a repo that your cluster can access.
Kubernetes, Kubernetes, Kubernetes
The probe may be deployed either as a sidecar (alongside Cloudprober in the same Pod) or independently, with the probe running as a service.
Let’s do both:
Sidecar
Both containers run in a single Pod. For simplicity, let’s leave the connection on the default PROBE_PORT too. For giggles, you can work out how to use a different PROBE_PORT in the sidecar.
We’ll need a deployment.yaml and — I’ll avoid my customary laziness — we’ll deploy to its own namespace, cloudprober-sidecar.
kubectl create namespace cloudprober-sidecar
Here’s a deployment file; you’ll need to replace ${PROJECT}:
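A sketch of what deployment.sidecar.yaml might contain. The names, labels, and the non-default PROBE_PORT are assumptions; the NodePort Service exposes Cloudprober’s 9313 metrics port:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sidecar
  namespace: cloudprober-sidecar
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sidecar
  template:
    metadata:
      labels:
        app: sidecar
    spec:
      containers:
      # Cloudprober plus the gRPC proxy.
      - name: grpc-client
        image: gcr.io/${PROJECT}/cloudprober:grpc-client
        env:
        - name: PROBE_PORT
          value: "8888"
        ports:
        - containerPort: 9313
      # The probe itself, listening on the same (non-default) port.
      - name: grpc-server
        image: gcr.io/${PROJECT}/cloudprober:grpc-server
        env:
        - name: PROBE_PORT
          value: "8888"
---
apiVersion: v1
kind: Service
metadata:
  name: sidecar
  namespace: cloudprober-sidecar
spec:
  type: NodePort
  selector:
    app: sidecar
  ports:
  - port: 9313
    targetPort: 9313
```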
and then:
kubectl apply --filename=deployment.sidecar.yaml
You should have a deployment, and a pod with 2 containers, grpc-client and grpc-server. Here’s the Kubernetes dashie:
And Cloud Console:
And, you can check the logs:
SIDECAR_POD=$(\
kubectl get pods \
--namespace=cloudprober-sidecar \
--output=jsonpath="{.items[0].metadata.name}")

kubectl logs pods/${SIDECAR_POD} grpc-client \
--namespace=cloudprober-sidecar

kubectl logs pods/${SIDECAR_POD} grpc-server \
--namespace=cloudprober-sidecar
And you can check the Prometheus metrics endpoint:
NODE_HOST=$(\
kubectl get nodes \
--output=jsonpath="{.items[0].metadata.name}")

SIDECAR_PORT=$(\
kubectl get services/sidecar \
--namespace=cloudprober-sidecar \
--output=jsonpath="{.spec.ports[0].nodePort}")

gcloud compute ssh ${NODE_HOST} \
--project=${PROJECT} \
--ssh-flag="-L ${SIDECAR_PORT}:localhost:${SIDECAR_PORT}"
NB You may notice that I did — in fact — use a different PROBE_PORT in the sidecar ;-)
You can find the metrics endpoint on localhost:${SIDECAR_PORT}/metrics.
Service
kubectl create namespace cloudprober-service
Here’s a deployment file; you’ll need to replace ${PROJECT}:
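A sketch of deployment.service.yaml. All names and the client-side NodePort Service are assumptions; the key detail is grpc-client’s PROBE_HOST pointing at grpc-server’s in-cluster DNS name:

```yaml
# grpc-server: Deployment plus a ClusterIP Service so that grpc-client
# can reach it by DNS name.
apiVersion: v1
kind: Service
metadata:
  name: grpc-server
  namespace: cloudprober-service
spec:
  selector:
    app: grpc-server
  ports:
  - port: 8888
    targetPort: 8888
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grpc-server
  namespace: cloudprober-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grpc-server
  template:
    metadata:
      labels:
        app: grpc-server
    spec:
      containers:
      - name: grpc-server
        image: gcr.io/${PROJECT}/cloudprober:grpc-server
        env:
        - name: PROBE_PORT
          value: "8888"
---
# grpc-client: Cloudprober plus the proxy, pointed at grpc-server.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grpc-client
  namespace: cloudprober-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grpc-client
  template:
    metadata:
      labels:
        app: grpc-client
    spec:
      containers:
      - name: grpc-client
        image: gcr.io/${PROJECT}/cloudprober:grpc-client
        env:
        - name: PROBE_HOST
          value: "grpc-server.cloudprober-service.svc.cluster.local"
        - name: PROBE_PORT
          value: "8888"
        ports:
        - containerPort: 9313
---
# NodePort Service exposing the client's Prometheus metrics port.
apiVersion: v1
kind: Service
metadata:
  name: grpc-client
  namespace: cloudprober-service
spec:
  type: NodePort
  selector:
    app: grpc-client
  ports:
  - port: 9313
    targetPort: 9313
```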
Deploy using:
kubectl apply --filename=deployment.service.yaml
This is more complex. It deploys grpc-client and grpc-server separately, each with an associated service. The grpc-client accesses grpc-server through the Kubernetes DNS name grpc-server.cloudprober-service.svc.cluster.local. Once again, we’re binding gRPC to port 8888 rather than its default.
Here’s the Kubernetes dashie:
Let’s use Cloud Logging this time:
And here’s a trick: use the Cloud Console to draft a filter that can then be used with the command line:
FILTER="resource.type=\"container\" "\
"resource.labels.cluster_name=\"${CLUSTER}\" "\
"resource.labels.namespace_id=\"cloudprober-service\" "\
"logName=\"projects/${PROJECT}/logs/grpc-client\" "\
"textPayload:\"labels=ptype=external,probe=grpc_client,dst=\""

gcloud logging read "${FILTER}" \
--project=$PROJECT \
--format=json \
| jq .[].textPayload
Using NODE_HOST and SIDECAR_PORT from above, you can then add SERVICE_PORT using:
SERVICE_PORT=$(\
kubectl get services/grpc-client \
--namespace=cloudprober-service \
--output=jsonpath="{.spec.ports[0].nodePort}")

gcloud compute ssh ${NODE_HOST} \
--project=${PROJECT} \
--ssh-flag="-L ${SIDECAR_PORT}:localhost:${SIDECAR_PORT}" \
--ssh-flag="-L ${SERVICE_PORT}:localhost:${SERVICE_PORT}"
and access this service’s metrics endpoint via localhost:${SERVICE_PORT}.
Conclusion
Cloudprober is neat. In truth — while I hope it is useful — it’s unclear to me whether any of this is useful to anyone else. If it were, I think the correct next step would be to fold the proxy code into Cloudprober itself: add another external_probe mode, perhaps PROXY, and include something similar to the gRPC client code outlined here.
Feedback and suggestions always welcome.
That’s all!