Connecting to a Kafka cluster running in Kubernetes from outside

Valer Cara
2 min read · Jul 31, 2018


OK, so you’re trying to connect your machine to a remote Kafka cluster running in Kubernetes. What are the options? There’s port forwarding with kubectl, VPN-ing into the cluster, Telepresence, …

We’ll look at the hard way, using port forwarding & DNAT to the kafka pods, mostly for fun.

The hard way

…with a single broker

Running kubectl port-forward svc/kafka 9092 opens localhost:9092, but if you connect a Kafka consumer to it, the broker’s metadata response tells the consumer to connect to the broker’s advertised address, which here is its IP inside the cluster, so the connection fails:

$ kubectl port-forward svc/kafka 9092
$ kafkacat -C -b localhost -t foobar -o -1 -c 1
%3|1533029229.817|FAIL|rdkafka#consumer-0| 100.111.134.183:9092/bootstrap: Failed to connect to broker at 100.111.134.183:9092: Network is unreachable
%3|1533029229.817|ERROR|rdkafka#consumer-0| 100.111.134.183:9092/bootstrap: Failed to connect to broker at 100.111.134.183:9092: Network is unreachable
%3|1533029229.817|ERROR|rdkafka#consumer-0| 1/1 brokers are down
% ERROR: Failed to query metadata for topic foobar: Local: Timed out
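The address the client is being sent to can be pulled straight out of that error output. A minimal sketch, assuming the log lines keep the "Failed to connect to broker at <ip>:<port>" wording shown above; unreachable_brokers is a hypothetical helper name, not part of kafkacat:

```shell
#!/bin/sh
# Hypothetical helper: read kafkacat/librdkafka log lines on stdin and print
# each distinct "<ip>:<port>" the client failed to reach, one per line.
# Assumes the "Failed to connect to broker at <ip>:<port>" wording seen above;
# only numeric IPs are matched, since those are what a DNAT rule needs.
unreachable_brokers() {
    sed -n 's/.*Failed to connect to broker at \([0-9.]*:[0-9]*\).*/\1/p' | sort -u
}
```

Something like kafkacat -C -b localhost -t foobar -o -1 -c 1 2>&1 | unreachable_brokers would then list the broker addresses that need redirecting.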

100.111.134.183 is an internal cluster IP, which is inaccessible from your laptop. You can however DNAT that IP to localhost to force it through the tunnel like so:

$ iptables -t nat -I OUTPUT -d 100.111.134.183 -j DNAT --to-destination 127.0.0.1

Now our consumer will merrily connect and fetch data from kafka via the tunnel:

$ kafkacat -C -b localhost -t foobar -o -5 -c 3
foo
bar
baz
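The DNAT rule stays in place until it is removed, so it helps to keep the matching delete command handy. A small sketch, assuming the same rule as above; dnat_cmd is a hypothetical helper that only prints the iptables command rather than running it (iptables needs root, so the printed command would be run with sudo):

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the iptables command that adds or
# removes the DNAT rule for one broker IP.
# Usage: dnat_cmd add|del <broker-ip>
dnat_cmd() {
    case "$1" in
        add) op=-I ;;   # insert the rule at the top of the nat OUTPUT chain
        del) op=-D ;;   # delete the rule with the same specification
        *)   echo "usage: dnat_cmd add|del <broker-ip>" >&2; return 1 ;;
    esac
    echo "iptables -t nat $op OUTPUT -d $2 -j DNAT --to-destination 127.0.0.1"
}
```

For example, dnat_cmd del 100.111.134.183 | sudo sh would remove the rule added earlier once you're done with the tunnel.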

For OS X, check out this SO answer detailing how to do DNAT.

Tzapulica kindly wrote a gist for setting this up on OS X. Thanks tz! :)

…with multiple brokers

This requires port-forwarding every broker pod and adding a DNAT rule for each one.

I’ve written a proof-of-concept script here (Linux only).
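To give a feel for the idea, here is a rough sketch. It is not the linked script, just one way the per-broker setup could look. It assumes each broker listens on port 9092, gives each broker its own local port (they can't all share localhost:9092), and maps each pod IP to that port with a port-specific DNAT rule. plan_forwards and the default base port 19092 are hypothetical, and the function only prints commands instead of running them:

```shell
#!/bin/sh
# Hypothetical sketch: read "<pod-name> <pod-ip>" pairs on stdin and print the
# port-forward and DNAT commands for each broker. Nothing is executed here.
# Each broker gets its own local port (base, base+1, ...).
# Usage: plan_forwards [base-port] < pairs
plan_forwards() {
    base_port=${1:-19092}   # first local port to use (assumed to be free)
    i=0
    while read -r pod ip; do
        [ -n "$pod" ] || continue   # skip blank lines
        port=$((base_port + i))
        # Tunnel local $port to port 9092 of this broker pod.
        echo "kubectl port-forward pod/$pod $port:9092 &"
        # Redirect traffic for the pod's IP on 9092 into that tunnel.
        echo "iptables -t nat -I OUTPUT -p tcp -d $ip --dport 9092 -j DNAT --to-destination 127.0.0.1:$port"
        i=$((i + 1))
    done
}
```

The name/IP pairs could come from something like kubectl get pods -l app=kafka -o jsonpath='{range .items[*]}{.metadata.name} {.status.podIP}{"\n"}{end}', where the app=kafka label is an assumption about how the broker pods are labelled. After reviewing the printed commands, the consumer would bootstrap against the first local port, e.g. kafkacat -b localhost:19092.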

The easy way

…and now, after all that, there’s Telepresence, which is far easier to set up and use:

Telepresence substitutes a two-way network proxy for your normal pod running in the Kubernetes cluster. This pod proxies data from your Kubernetes environment (e.g., TCP connections, environment variables, volumes) to the local process. The local process has its networking transparently overridden so that DNS calls and TCP connections are routed through the proxy to the remote Kubernetes cluster.

$ telepresence --run-shell
