Mega8s: A Complex Kubernetes Cluster

Helping repro a customer’s issue required creating a larger Kubernetes cluster than I’ve created before: ~400 Services, each fronted by a Network Load-Balancer. This required a raft of increased quotas:

Quotas

CPUs: 96
IPs: 96
PD: 9182 (GB)
Firewalls: 500
Forwarding Rules: 500
Target Pools: 500

NB The CPU and IP quotas I requested were far in excess of what was actually needed; an earlier attempt at this provisioned many more CPUs than the final run used.

I created a Regional Cluster of preemptible 1x4s (1 vCPU, 4GB machines) in my preferred us-west1.
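I didn’t keep the exact command, but it was along these lines (a sketch; the cluster name and the custom machine type standing in for “1x4” are assumptions):

```
gcloud container clusters create mega8s \
--region=us-west1 \
--machine-type=custom-1-4096 \
--preemptible \
--project=${PROJECT}
```

NB For a Regional Cluster, node counts apply per zone, so the total node count is a multiple of the number of zones in the region.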

Status

This is after the fact; once I’d deployed 400 services and these services had provisioned 400 Network LBs:

kubectl get deployments \
--namespace=$NAMESPACE \
--output=name \
| wc --lines
400
kubectl get services \
--namespace=$NAMESPACE \
--output=name \
| wc --lines
400
kubectl get pods \
--namespace=$NAMESPACE \
--output=name \
| wc --lines
400

Each of the services has --type=LoadBalancer and it takes some time for Kubernetes to program all of these:

gcloud compute forwarding-rules list \
--format="value(name)" \
--project=$PROJECT \
| wc --lines
401
gcloud compute instances list \
--format="value(name)" \
--project=$PROJECT \
| wc --lines
4
NB “401”?! See “Aside: Nick’s hack” below for the explanation.

There are 4 (!) nodes, each running 100 pods (one pod per service), which is quite impressive even though each pod is tiny (2.5MB)

NB It’s difficult to show 400 services… I assure you that the preceding services exist too.

How?

Bash-fu and my Bash isn’t excellent ;-)

See below for creating the container(s) for the gRPC service and for the Cloud Endpoints deployment.

Assuming you have a container representing the gRPC server:

gcr.io/${PROJECT}/grpc-server:latest

Let’s create some Deployments and Services. I recommend you start gradually:

PROJECT=[[YOUR-PROJECT]]
NAMESPACE=[[YOUR-NAMESPACE]] # fourhundred
kubectl create namespace ${NAMESPACE}
for NUM in $(seq -f "%03g" 0 9)
do
echo "Service: service-${NUM}"
kubectl run service-${NUM} \
--image=gcr.io/${PROJECT}/grpc-server:latest \
--port=10000 \
--namespace=${NAMESPACE}
kubectl expose deployment/service-${NUM} \
--protocol=TCP \
--port=10000 \
--target-port=10000 \
--type=LoadBalancer \
--namespace=${NAMESPACE}
kubectl label service/service-${NUM} grpc=true service=whoami \
--namespace=${NAMESPACE}
done
NB The Service uses port 10000, the Deployment’s Pods expose port 10000, and the container in the Pod listens on port 10000. This is equivalent to Docker’s --publish=10000:10000.

You may check progress using the commands I showed previously:

kubectl get deployments \
--namespace=${NAMESPACE} \
--output=name \
| wc --lines
kubectl get pods \
--namespace=${NAMESPACE} \
--output=name \
| wc --lines
gcloud compute forwarding-rules list \
--format="value(name)" \
--project=$PROJECT \
| wc --lines

NB The first two commands use kubectl while the third uses gcloud. The Network LBs (represented by forwarding-rules) take a little time to be created.

When you’re confident with what you have, you may bump the lower and upper bounds in the script and run it again.
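Before bumping the bounds for the full run, it may help to sanity-check what seq will generate. A standalone sketch (LOWER and UPPER are hypothetical variables, not in the original script):

```shell
# Hypothetical bounds; the full run uses 0..399 for 400 services
LOWER=0
UPPER=399
SERVICES=()
for NUM in $(seq -f "%03g" ${LOWER} ${UPPER})
do
  SERVICES+=("service-${NUM}")
done
# Show the count and the first/last names that the loop will use
echo "${#SERVICES[@]} services: ${SERVICES[0]} .. ${SERVICES[$((${#SERVICES[@]}-1))]}"
```

Substituting ${LOWER} and ${UPPER} into the creation loop then gives you the same names.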

Defer: Tidy-up

To delete the Deployments and Services:

NAMESPACE=[[YOUR-NAMESPACE]]
for NUM in $(seq -f "%03g" 0 9)
do
echo "Service: service-${NUM}"
kubectl delete deployment/service-${NUM} --namespace=${NAMESPACE}
kubectl delete service/service-${NUM} --namespace=${NAMESPACE}
done

NB Ensure you set the lower bound (currently 0) and the upper bound (currently 9) correctly to delete all your services.

You may rerun the commands shown previously to ensure you get Deployments, Services, Pods and LBs (forwarding-rules) down to zero. Alternatively, you can whack the namespace and this should delete everything:

kubectl delete namespace/${NAMESPACE}

Testing

We now have some number of Network LBs (forwarding-rules), each exposing our gRPC service on port 10000. This script enumerates the endpoints of all the services (some may not yet have IP addresses; these are ignored), then repeatedly picks one at random and makes a call against it. It’s not significant load-testing but it *is* load-testing :-)

NAMESPACE=[[YOUR-NAMESPACE]]
unset LB
LB=()
for IP in $(kubectl get services \
--selector=grpc==true,service==whoami \
--namespace=${NAMESPACE} \
--output=json \
| jq --raw-output '.items[] | .status.loadBalancer.ingress[0].ip | select(.!=null)')
do
LB+=(${IP})
done
while :
do
TEST_LB=${LB[$RANDOM % ${#LB[@]}]}
./grpc-client --host=${TEST_LB} --port=10000
done
NB You should rerun the script in its entirety if you significantly revise the number of LBs (up or down). If you have the same number of LBs, some may have subsequently gained an IP address. If you simply want to run again using the same set of LBs as before, you may run the while loop only.
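For clarity, the random pick in the loop above is just Bash’s $RANDOM modulo the array length. In isolation, with placeholder IPs:

```shell
# Placeholder IPs standing in for the LB ingress IPs gathered by the script
LB=("10.0.0.1" "10.0.0.2" "10.0.0.3")

# $RANDOM yields 0..32767; modulo the array length gives a valid index
TEST_LB=${LB[$RANDOM % ${#LB[@]}]}
echo "Selected: ${TEST_LB}"
```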

Optional: Cloud Endpoints

Google’s instructions are here:

https://cloud.google.com/endpoints/docs/grpc/get-started-grpc-kubernetes-engine

If you would like to deploy another instance of the gRPC service using Cloud Endpoints, you will need to

  • Enable Cloud Endpoints
  • Generate the proto descriptor
  • Deploy the Service Configuration
  • Deploy the gRPC WhoService *with* the Endpoints sidecar
  • Test it!
gcloud services enable endpoints.googleapis.com \
--project=$PROJECT

We must generate a proto descriptor (api_descriptor.pb) to accompany the api_config.yaml file when creating the Endpoints service. Here are the steps I recommend, run from the working directory that contains the whoami directory (which itself contains the whoami.proto and api_config.yaml files):

virtualenv venv
source venv/bin/activate
pip install grpcio-tools
python -m grpc_tools.protoc \
--include_imports \
--include_source_info \
--proto_path=./whoami \
--descriptor_set_out=./whoami/api_descriptor.pb \
whoami.proto

Then deploy the Service Configuration:

gcloud endpoints services deploy api_descriptor.pb api_config.yaml \
--project=$PROJECT

Which — when successful — should return:

Service Configuration [2018-05-02r0] uploaded for service [whoservice.endpoints.[[YOUR-PROJECT]].cloud.goog]

Now that Cloud Endpoints knows what to expect of our API, we need to deploy the gRPC WhoService that implements it:

NB Please replace [[YOUR-PROJECT]] on line #20 of deployment.yaml with the value of your GCP project.
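The original deployment.yaml isn’t reproduced here. This is a minimal sketch of its likely shape, assuming the images and names used elsewhere in this post; the ESP sidecar flags follow Google’s GKE gRPC tutorial, and your file’s line numbers will differ:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grpc-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grpc-server
  template:
    metadata:
      labels:
        app: grpc-server
    spec:
      containers:
      # Endpoints sidecar: listens on 9000, proxies to the gRPC server
      - name: esp
        image: gcr.io/endpoints-release/endpoints-runtime:1
        args: [
          "--http2_port=9000",
          "--service=whoservice.endpoints.[[YOUR-PROJECT]].cloud.goog",
          "--rollout_strategy=managed",
          "--backend=grpc://127.0.0.1:10000",
        ]
        ports:
        - containerPort: 9000
      # The gRPC WhoService itself, listening on 10000
      - name: grpc-server
        image: gcr.io/[[YOUR-PROJECT]]/grpc-server:latest
        ports:
        - containerPort: 10000
---
apiVersion: v1
kind: Service
metadata:
  name: grpc-server
spec:
  type: LoadBalancer
  selector:
    app: grpc-server
  ports:
  - port: 9000
    targetPort: 9000
```

The Service of --type=LoadBalancer targets the sidecar’s port 9000; that’s the Network LB whose IP we grab next.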

Then:

kubectl apply --filename=deployment.yaml

You can confirm this with:

kubectl get deployment/grpc-server
kubectl get service/grpc-server

We’ll reuse the mechanism from the earlier Bash script to identify the IP address of the Network LB created by this Deployment:

ENDPOINTS_IP=$(\
kubectl get service grpc-server \
--output=jsonpath="{.status.loadBalancer.ingress[0].ip}")
echo ${ENDPOINTS_IP}

Because it takes a little time for the Network LB to be provisioned, repeat the command until you get an IP. And, all being well:

while :
do
./grpc-client \
--host=${ENDPOINTS_IP} \
--port=9000
done
NB With Endpoints we communicate with the sidecar and it communicates with our gRPC server. To make this clear, we’re using 9000 for the proxy while our service remains on 10000.
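Rather than repeating the kubectl command by hand until the LB has an IP, you could poll. A sketch (wait_for_value is a helper I’m inventing here, not part of the post’s scripts):

```shell
# Poll a command until it prints a non-empty value, or give up.
# Usage: wait_for_value <attempts> <interval-seconds> <command...>
wait_for_value() {
  local attempts=$1 interval=$2
  shift 2
  local value=""
  while [ "${attempts}" -gt 0 ]
  do
    value=$("$@")
    if [ -n "${value}" ]; then
      break
    fi
    attempts=$((attempts - 1))
    sleep "${interval}"
  done
  echo "${value}"
}

# Demo with a command that answers immediately
wait_for_value 3 0 date +%Y

# e.g. (assumes the grpc-server Service from above):
# ENDPOINTS_IP=$(wait_for_value 24 5 kubectl get service grpc-server \
#   --output=jsonpath="{.status.loadBalancer.ingress[0].ip}")
```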

Endpoints provides us with a dashboard:

[Image: Cloud Endpoints dashboard]
[Image: Cloud Trace]

…and traces!! Woot

Stackdriver Kubernetes

Had this been announced two days earlier, I would have been able to show you this cluster with the awesome-looking new Kubernetes Monitoring :-(

gRPC Whoami

Eesh, I cribbed this from someone|somewhere and don’t recall where. Apologies for not crediting you by name, dear developer :-(

I sought a trivial gRPC service to emulate the customer’s experience. I am also deploying an instance of this service using Cloud Endpoints.

NB Please replace [[YOUR-GITHUB]] on line 8 of client.go and server.go with your github.com/your-name directory path. This assumes you’re using GitHub, which is likely if you’re using Go. If you’re not using GitHub, simply reflect the directory structure of the gRPC service. I propose my-working-dir/client, my-working-dir/server and my-working-dir/whoami. The latter should contain the whoami.proto and the api_config.yaml.
NB Please replace [[YOUR-PROJECT]] on line 4 of api_config.yaml with the value of the Google Cloud Platform project in which you will deploy the Endpoints service.

For convenience, I’ve included the machine-generated whoami.pb.go file above. This saves you having to generate this file from the whoami.proto file and makes this blog post slightly shorter. If you’re interested, please learn more about this process here:

https://grpc.io/docs/tutorials/basic/go.html

You *should* be able to go get the dependencies and then run the files:

go get ./...
go run server/server.go
go run client/client.go
NB You’ll need to run the client in one window after starting the server in another.

If that works, we can build and try again:

CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o grpc-client client/client.go
CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o grpc-server server/server.go

Or: Does a loop for 2 need a loop?

for FILE in client server
do
CGO_ENABLED=0 \
GOOS=linux \
go build -a -installsuffix cgo -o grpc-${FILE} ${FILE}/${FILE}.go
done

If that works, we can containerize and try again:

NB You don’t need to containerize the client, only the server.
NB Dockerfile.client uses a multi-stage build. Both Dockerfiles use dumb-init. If you don’t want to use dumb-init, remove lines 12 and 4 respectively. In the client file, replace line 15 with ENTRYPOINT ["/grpc-client"]. In the server file, delete line 8 and leave the CMD.
docker build \
--tag=gcr.io/${PROJECT}/grpc-client \
--file=Dockerfile.client \
.
docker build \
--tag=gcr.io/${PROJECT}/grpc-server \
--file=Dockerfile.server \
.
docker push gcr.io/${PROJECT}/grpc-client
docker push gcr.io/${PROJECT}/grpc-server

Test

The server:

docker run \
--interactive \
--tty \
--publish=10000:10000 \
gcr.io/${PROJECT}/grpc-server

If you containerized the client then:

docker run \
--interactive \
--tty \
--net=host \
gcr.io/${PROJECT}/grpc-client \
--host=localhost \
--port=10000

If you did not containerize the client then:

./grpc-client --host=localhost --port=10000

Aside: Nick’s hack

Stupidly, I’d forgotten I’d also created the Endpoints service and wondered why I had 401 (rather than 400) network LBs ;-(

Thanks to Nick (a really awesome Kubernetes engineer) who pointed out this useful filter of the LBs. It shows that, while I have 400 LBs called service-XXX in namespace fourhundred (NB the LB description preserves the Kubernetes namespace/service name, which is cool), I also have one LB in the default namespace, created by Endpoints:

gcloud compute forwarding-rules list \
--project=${PROJECT} \
--format="table(description)"
DESCRIPTION
...
{"kubernetes.io/service-name":"fourhundred/service-009"}
{"kubernetes.io/service-name":"default/grpc-server"}
{"kubernetes.io/service-name":"fourhundred/service-010"}
{"kubernetes.io/service-name":"fourhundred/service-011"}
{"kubernetes.io/service-name":"fourhundred/service-012"}
{"kubernetes.io/service-name":"fourhundred/service-013"}
{"kubernetes.io/service-name":"fourhundred/service-014"}
{"kubernetes.io/service-name":"fourhundred/service-015"}
...
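Taking Nick’s table one step further, the per-namespace split can be counted with grep. A quick sketch against three of those descriptions as sample data (pipe the real gcloud output through the same grep for the full count):

```shell
# Sample descriptions from the gcloud output above
DESCRIPTIONS='{"kubernetes.io/service-name":"fourhundred/service-009"}
{"kubernetes.io/service-name":"default/grpc-server"}
{"kubernetes.io/service-name":"fourhundred/service-010"}'

# Count LBs per namespace by matching the "namespace/" prefix
FOURHUNDRED=$(echo "${DESCRIPTIONS}" | grep --count '"fourhundred/')
DEFAULT=$(echo "${DESCRIPTIONS}" | grep --count '"default/')
echo "fourhundred: ${FOURHUNDRED}, default: ${DEFAULT}"
```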

Conclusion

I think what’s most telling from this experience is that Kubernetes with 400 services is little different from Kubernetes with 4 services. Also notice how we’re mostly oblivious to the Nodes powering these services and the GCE VMs that manifest them.

Tidy-up

The “scorched earth” tidy-up is to delete the GCP project. This will delete everything in the project. If you created the project exclusively for this work, well done. Proceed at your own risk:

gcloud projects delete ${PROJECT} --quiet

Alternatively, you can delete the Kubernetes cluster and the Endpoints service. Again, if you’re using either for other purposes, be aware that you’ll delete EVERYTHING. Proceed at your own risk:

gcloud container clusters delete $CLUSTER \
--project=${PROJECT}
gcloud endpoints services delete \
whoservice.endpoints.${PROJECT}.cloud.goog \
--project=${PROJECT}

That’s all!