How to deserialize AVRO messages in Python Faust?

Xiaoxu Gao
Published in The Startup
2 min read · May 17, 2020


Faust is a stream processing library, porting the ideas from Kafka Streams to Python.

I use Faust in my day-to-day job to consume messages from Kafka topics, then perform certain transformations.

However, the Faust library has one gap: it lacks official support for an AVRO deserializer.

Currently, the built-in codecs include raw, json, pickle, and binary; AVRO is not among them.

So, how do we handle serialized AVRO messages in this case? My initial thought was: can we read raw bytes from the topic and then deserialize each message manually?

Manual Serialization

The Faust documentation does describe manual serialization. The idea is to receive the raw message bytes in the agent and then deserialize them there.

I use the fastavro library for AVRO serialization. fastavro performs much better than the official Apache Avro Python package: the Apache package is written in pure Python, while fastavro ships C extensions compiled with Cython.

The sample code is in the following Gist.


I’m a Developer with a focus on Python and Data Engineering. I write stuff to talk to myself and the world. You can find me on linkedin.com/in/xiaoxugao/.