Redpanda: Transforming Messages Made Easy
Introduction
Let’s talk about Redpanda, a streaming data platform for handling and transforming streaming data. It’s designed to be user-friendly and is compatible with the Apache Kafka API, so existing Kafka clients and tooling work with it. Whether you use it as a managed cloud service or run it on your own infrastructure, Redpanda simplifies processing data in real-time streams.
What is Redpanda?
Redpanda is a streaming data platform that implements Apache Kafka’s API. Traditionally, if you wanted to modify data in a Kafka stream, you’d need to set up an external service such as Flink or AWS Lambda. That service would subscribe to your data stream, transform the data, and then produce the result to a new stream. Redpanda now offers two simpler ways to perform these transformations: WASM transforms that run inside the broker, and Redpanda Connect.
Transforming Messages with WASM
One of the innovative features of Redpanda is its WebAssembly (WASM) engine. WASM is a portable binary format that runs inside a sandboxed runtime. Like a Docker container, it isolates the code it runs, but a WASM module is far lighter and runs on any platform that has a WASM runtime. By integrating a WASM engine, Redpanda allows you to perform data transformations directly on the brokers. This means you can safely run third-party code with controlled CPU and memory usage.
Redpanda provides SDKs for Go and Rust, enabling developers to transform messages as they enter the stream. A transformation can only use information from the message itself or predefined data, which keeps performance high because no external service calls are made.
Ideal Uses for WASM in Redpanda:
- Masking Data: Remove sensitive information from messages.
- Reformatting and Filtering Messages: Simplify data for downstream processing, or forward only the messages that meet specific conditions so that consumers handle less data and costs go down.
- Validating Messages: Ensure incoming data meets required standards before further processing, preventing errors and saving resources.
Redpanda Connect with Benthos
Redpanda recently acquired Benthos, a tool for streaming ETL (Extract, Transform, Load), and now offers it as Redpanda Connect. It works not only with Kafka but also with various other streaming services. Benthos is designed to be efficient and lightweight, requiring fewer resources than a traditional Kafka Connect cluster.
Benthos provides delivery guarantees and retry logic, reducing the risk of data loss. It supports Dead Letter Queues, so messages that still cannot be processed are set aside for inspection instead of being dropped. With hundreds of connectors, Benthos can write data to Kafka and to third-party services, making it highly versatile.
One of Benthos’ standout features is its simple configuration system, where you define sources, destinations, and transformations in a YAML file. It also includes Bloblang, a mapping language for unpacking and transforming messages based on your rules.
Although Benthos runs as a separate process rather than inside the broker like WASM, it operates close enough to the broker to offer fast processing. It also provides advanced features not available in WASM, covering many common use cases without the need for specialized expertise.
Conclusion
Redpanda and its tools like WASM and Benthos make transforming and processing streaming data simpler and more accessible. They offer powerful capabilities while reducing the need for extensive resources and specialized skills, making real-time data transformation easier and more cost-effective.
Please keep following my articles. Upcoming posts will cover how to write an Avro transformation with WASM, how to test WASM transformations with testcontainers-go, and how to write a transform with Redpanda Connect.