This post provides sample Python code for consuming Kafka topics with Azure Databricks (Spark), Confluent Cloud (Kafka) running on Azure, Schema Registry, and the Avro format.

Reading the topic:

Kafka Topic

Stream data, formatted and stored in a Spark SQL table (view):

Topic Curated Data
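
One detail that often matters when reading Avro values produced through Confluent Schema Registry is the framing: each message value starts with a 5-byte header (a zero "magic" byte followed by a 4-byte big-endian schema ID) ahead of the Avro payload. The following is a minimal sketch of splitting that header off in plain Python. The function name `split_confluent_message` and the sample bytes are hypothetical, chosen for illustration, and this is not the code used in this post.

```python
import struct
from typing import Tuple

# Confluent's wire format starts with a single zero byte.
CONFLUENT_MAGIC_BYTE = 0
HEADER_SIZE = 5  # 1 magic byte + 4-byte schema ID


def split_confluent_message(raw: bytes) -> Tuple[int, bytes]:
    """Split a Confluent-framed Kafka value into (schema_id, avro_payload).

    Hypothetical helper for illustration only.
    """
    if len(raw) < HEADER_SIZE:
        raise ValueError("message is too short to carry the 5-byte header")
    # ">" = big-endian, "B" = unsigned byte, "I" = unsigned 32-bit integer
    magic, schema_id = struct.unpack(">BI", raw[:HEADER_SIZE])
    if magic != CONFLUENT_MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, raw[HEADER_SIZE:]


# Made-up example value: header for schema ID 100042, then arbitrary payload bytes.
framed = b"\x00" + (100042).to_bytes(4, "big") + b"\x02\x06abc"
schema_id, payload = split_confluent_message(framed)
```

The schema ID would then be used to look up the writer schema in Schema Registry, and the remaining payload passed to an Avro decoder together with that schema.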

Source code:


Special thanks and credit to Gianluca Natali, Henning Kropp, Yatharth Gupta, Bhanu Prakash, Awez Syed, Nick Hill, Robin Davidson, Liping Huang, Chris Munyasya, Sid Rabindran, and many more people from the Databricks, Confluent, and Microsoft teams who worked to make this integration possible.

Most people are aware that I love Unix/Linux and open source. I have used Linux and open source software for more than 16 years.

Some years ago, I had the privilege to meet and talk to Linus Benedict Torvalds (Linux and Git creator).

Caio Moreno and Linus Benedict Torvalds (Linux and Git creator) Instagram photo

Back in 2007, I created a company in Brazil/Spain to provide real-world solutions based on open source.

Later, I worked at Pentaho, an open source big data, business intelligence, and data mining company that was sold to Hitachi (probably because of Pentaho’s open source portfolio of products and solutions).

Going back to the past (2 June, 2005) I found this BBC…

This demo shows how to use Microsoft Azure Cognitive Services to convert audio files (.wav format) to text.
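
Before sending a .wav file to a speech-to-text service, it can help to confirm that the audio is in a format the service accepts. The following is a minimal sketch using only Python's standard `wave` module. The target values (16 kHz, 16-bit, mono PCM) are an assumption about a commonly used input format, and the function name `check_wav_format` is hypothetical; neither is taken from the demo's code, so check the service documentation for the formats it actually supports.

```python
import wave
from typing import List

# Assumed target format (not taken from the demo); adjust to what the service accepts.
EXPECTED_RATE_HZ = 16000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit PCM
EXPECTED_CHANNELS = 1            # mono


def check_wav_format(source) -> List[str]:
    """Return a list of format mismatches; an empty list means the file matches.

    `source` may be a file path or a binary file-like object.
    Hypothetical helper for illustration only.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {wav.getsampwidth()} bytes")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
    return problems
```

Calling `check_wav_format("sample.wav")` (a placeholder file name) would return an empty list when the file already matches the assumed format, and a list of mismatches otherwise.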

GitHub code here.

Azure AI Speech to Text Demo

Speech to Text Demo using Microsoft Azure Cognitive Services

Azure AI Demo

Source code here.


Caio Moreno

Senior Cloud Solution Architect and Data Scientist @microsoft | PhD Student @unicomplutense (Opinions are my own)
