KafkaGoSaur: a WebAssembly powered Kafka client

A new Kafka client for Deno built using WebAssembly

Arjun Dhawan

--

KafkaGoSaur is a new Kafka client for Deno built with WebAssembly on top of kafka-go, the excellent Kafka client library written for Go. This article explains the basic usage of KafkaGoSaur and shines a light on some of its inner workings.

kafka-go

Go is a minimal yet powerful language. Its simplicity has driven its adoption in recent years by both startups and enterprises alike as it allows to build scalable and performant software fast. A useful standard library, modern tooling, and high-quality third-party libraries make it one of the most wanted languages to work with.

One of these third-party libraries is kafka-go. An efficient and simple to use Kafka client developed by Segment. It features both a low- and high-level API.

Deno

Lesser known is Deno, a modern runtime for JavaScript and TypeScript focussing on great developer experience. Created by Ryan Dahl, it fixes long-standing issues and regrets that were introduced when he build Node.js. Deno is web-compatible wherever possible, meaning it runs WebAssembly binaries out of the box!

WebAssembly

WebAssembly (WASM) is a binary instruction format that serves as a universal compilation target. In simple terms, it allows code from almost any language to be run in the browser or any compatible environment like Deno.

Mix up all 3 technologies by compiling kafka-go to a WebAssembly binary and you get KafkaGoSaur: a new Kafka client for Deno that is ready to go.

Usage

Producing

Producing a message is simple. Message values are binary encoded and are produced in batch using writeMessages:

import KafkaGoSaur from "https://deno.land/x/kafkagosaur/mod.ts";const kafkaGoSaur = new KafkaGoSaur();
const writer = await kafkaGoSaur.createWriter({
broker: "localhost:9092",
topic: "test-0",
});
const encoder = new TextEncoder();
const messages = [{ value: encoder.encode("Hello!") } ];
await writer.writeMessages(messages);

Consuming
Messages are consumed one by one using readMessage:

import KafkaGoSaur from "https://deno.land/x/kafkagosaur/mod.ts";const kafkaGoSaur = new KafkaGoSaur();
const reader = await kafkaGoSaur.createReader({
brokers: ["localhost:9092"],
topic: "test-0",
});

const message = await reader.readMessage();

WebAssembly

Can we use WebAssembly to port kafka-go to any language or runtime other than Deno? In principle, yes. But there is a limitation on WebAssembly stemming from the browser environment. Kafka communicates using TCP, which is not supported by browsers. Even though browsers support WebSockets, this web equivalent is not directly supported by Kafka brokers.

That’s why KafkaGoSaur exposes TCP functionality of its host — the Deno runtime — to kafka-go. Go exchanges the functions needed with the Deno runtime through the global object using syscall/js.

Example of functions exchanged in WebAssembly in KafkaGoSaur.

In essence, the exchange of functions is what happens when a new KafkaGoSaur instance is created. The constructor of KafkaGoSaur runs the WebAssembly binary that makes available the API of kafka-go in Deno.

KafkaGoSaur can use two different socket implementations for TCP: Deno’s Socket API (Deno.connect) or the net module (createConnection) from the Node.js compatibility layer. They are used to construct a DialFunc: a kafka-go function to create a net.Conn. By default, the node Node.js implementation is used but switching is easy. Just specify the one you want to use when creating the reader or writer:

const reader = await kafkaGoSaur.createReader({
brokers: ["localhost:9092"],
topic: "test-0",
dialBackend: DialBackend.Node
});

There is one limitation for DialBackend.Deno: producing messages is not supported yet. Somewhere in the implementation of Deno.connect seems to be a bug that causes broken pipe errors. The issue is being investigated.

Promises and goroutines

While Go achieves concurrency through goroutines which can spawn multiple threads, concurrency in JavaScript is modeled on a single-threaded event loop using Promises. That makes concurrency in JavaScript and Go inherently different.

Promises created in Deno need to be awaited in Go. That is done by sending its resolved value into a channel, which inherently blocks the current goroutine. Any function defined in Go can be invoked from JavaScript by wrapping it using js.FuncOf, turning it into a regular JavaScript value. Invoking the wrapped function in JavaScript affects the execution model in both languages:

  1. The event loop of the JavaScript runtime gets paused.
  2. A new goroutine is spawned, executing the Go function.
  3. The event loop resumes when this function returns.

But there is a catch. Any other function wrapped using js.FuncOf will be executed on the very same new goroutine. This poses a problem when js.FuncOf wraps a Go function that calls (and awaits) a blocking JavaScript API. An example of such API would be fetch or read on a TCP connection. These APIs are asynchronous, meaning their return value resolves not now but some moment later in the future. Asynchronous functions in JavaScript rely on the event loop to process their return value whenever it resolves.

Thus when the event loop gets explicitly paused due to invocation of the wrapped function, it ends up in deadlock as the function defined in Go relies on the (never occurring) resumption of the event loop to return.

That’s why it is the responsibility of the caller of js.FuncOf to start a new goroutine to wrap any blocking function. This allows the wrapped Go function to return immediately with a Promise. This immediate return resumes the event loop so it can process the Promise when it resolves. Take a look at the interop package to see how functions NewPromise and Await respectively wrap blocking functions and await JavaScript Promises in Go.

Performance

KafkaGoSaur can write in batches nearly as fast as kafka-go, but reading suffers roughly a 50% performance penalty:

Read and write performance comparison. Tested on a Confluent Cloud Basic cluster.

The stats function (backed by its Stats counterpart) reports the so-called wait and read times and sheds a light on why this happens. The wait time is the time that is spent waiting for a batch of messages to arrive. The read time is the time it takes to read all the messages from this binary response. Kafka-go spends on average 15 ms waiting for a batch and 160 ms reading its messages. For KafkaGoSaur this is 20 ms waiting and 360 ms reading. Thus most of the performance penalty is incurred when KafkaGoSaur parses messages from the already fetched batch response.

Next steps

The exact cause for the performance degradation in the case of reading needs to be still uncovered. But a potential solution is luckily offered by the dual low- and high-level API that kafka-go offers. If the WebAssembly compiled function to read messages from the batch response is ill-performing, the same functionality can simply be reimplemented directly in Deno by making use of the low-level (but still performant) batch fetching.

Even though KafkgaGoSaur is in the early stage of development, your contributions are highly valued and welcomed! Be an early adopter and feel free to ask for new features, report bugs, or submit your code 🙂.

--

--