This post describes the reasons for starting a brand new Warsaw Data Engineering meetup group (and shutting down the existing Warsaw Scala Enthusiasts group). Join us!


I’ve been thinking about this for some time, but somehow couldn’t find enough courage. As new resolutions for the new year 2020 piled up, it became more evident that I need to clean up my professional activities a little bit (else I’ll be doomed to fail early and often).

On Sunday, Feb 2nd, I announced a new meetup of the Warsaw Scala Enthusiasts group that was to be held on Tue, Feb 4th. You’re reading it right, that was only 2 days before the meetup! I was surprised by how quickly people started to sign up, and by how many! In the end, 16 people signed up (I consider it a personal record given all the things I’m describing in this post). …

For now…

Until I figure out how to make all “The Internals Of” online books available under a single root domain, e.g. Consider it a WIP and part of my resolutions for 2020.

Having said that, as of today, the sources of “The Internals Of Apache Spark” (formerly “Mastering Apache Spark”) are available in the apache-spark-internals repository under the japila-books organization.

The book is published (via GitHub Pages) to (which is the default name for github pages).

This is the very first time I have created an organization on GitHub and transferred ownership of a project. It seems to be working fine so far. Sorry for any trouble it may have caused. …

tl;dr Stay away from Sphinx and RST.

I’m coming from Markdown and Asciidoc markup languages and can’t think of any reasons why I should keep writing documentation using reStructuredText (RST) markup language. After just a couple of days I felt as if I was going backwards.

I’m quite surprised how quickly I came to this conclusion. It took me a mere couple of days, which I attribute to the markup alternatives I have tried so far.

A few of the reasons I’m leaving Sphinx and RST are as follows:

  1. Different formatting for headers, sections, and paragraphs (unlike different number of hashes or equals in Asciidoc or…

Thanks to Jakub Wszolek, Mariia Ruban and the entire conference crew for the DataMass Summit 2019 conference in Gdansk, Poland on October 4th, 2019. I was invited as a speaker and enjoyed it a lot.

Me at DataMass Summit 2019

Not only could I give the talk about “Modern Stream Processing Engines Compared — Kafka Streams VS Spark Structured Streaming” (my lovely ones), but I also had an opportunity to have a word with the attendees. It turned out that there is quite an interest in these two stream processing engines, and the obvious question came up fairly often: “which one to use and when?”

The conference was different from the others for a couple of reasons (and I really hope they all can help me become an even better speaker at the conferences to come, e.g. Spark + AI Summit Europe 2019 in Amsterdam in two weeks). …

I speak French. Not really. But I will soon.

I wish I could write “I learn French” actually, but that’s something I haven’t learned yet during my lessons.

I use the Duolingo mobile app and I enjoy it a lot. It gives a French novice like me all I wanted: basic words and sentences as well as leagues where you compete against others. Gamification at its best!

My profile at Duolingo

I’ve long been thinking about learning a new spoken language (I say “a spoken language” since I’m in the IT industry and have been learning quite a few programming languages already). I needed something new and challenging. I was considering one of the Asian languages. …

The Opening Slide for the 3-Day Apache Kafka Workshop

When: 13–15.02.2019
Where: Győr, Hungary

Just a few days before the winter holidays with my family I got a call to host an Apache Kafka Workshop in Győr, Hungary. I’d never been to this city before and was ready to take on another Kafka workshop (after this and this workshops and some projects under my belt), so I (almost) immediately accepted the offer.

The Team

A new client, a new city, and a new agenda, but the content was “old”, i.e. Apache Kafka and Kafka Streams (with some very basic bits on Kafka Connect, KSQL, Avro and Schema Registry). …

It had not been long since the first workshop with Apache Kafka and Kafka Streams when I was asked to run a workshop with Apache Kafka. I’ve had more and more workshops with Apache Kafka recently (mostly for developers), but the recent inquiry was completely different: I had to prepare a workshop as much for developers as for administrators (!) …

I could not have dreamt of a better shout-out after my recent work with Kafka Streams that I’ve just received from the one and only Gwen Shapira from Confluent (the company behind Apache Kafka and Kafka Streams).

I’m an independent consultant specialising in Apache Spark with some focus on the tools people use with it. Fairly often it is Apache Kafka that is the “shock absorber” for the large amount of events or simply the “storage” and so over time Kafka has found a special place in my heart. …

During the past 5 days (June 11–15) I gave an “Introduction to Scala” workshop to a group of 10 people from different countries in South America (aka Latam).

The workshop was held online from 6pm to 10pm Poland time due to the time zone difference (the difference was -5 hours if I’m not mistaken).

We used Skype for Business as a communication tool with the Conversation panel for most of our discussions. The participants could also use their microphones, but that happened only at the beginning of every session. It worked quite fine, but the main challenge was that I could not see people’s faces and gauge how we were doing. …

What a long Apache Spark day! The group of 3 Spark developers with me as the instructor started the 1-day Spark Structured Streaming 2.2 workshop right at 9am and finished at 8pm.


That gives 11 hours exclusively with Apache Spark’s brand new stream processing engine Spark Structured Streaming and Scala. I didn’t expect we could spend so long and cover so much. It was as exhaustive as it was exhausting.

Thanks Gorazd, Dinko, Dario and Gordan for bearing with me for so long!

From left: Dinko, Gordan, Jacek, Dario

The whole agenda is available at Spark Structured Streaming in Apache Spark 2.2 …


Jacek Laskowski

Spark and Kafka Consultant, Developer and Trainer (publishing at and
