Dennis de Weerdt – Medium

Dennis de Weerdt in DataPebbles

Portable Pipelines with Apache Beam

There are lots of use cases for data processing and analytics pipelines, and nearly as many frameworks to use. Apache Spark is probably…

Dec 26, 2020

Dennis de Weerdt in DataPebbles

Kafka to Spark Structured Streaming, with Exactly-Once Semantics

Apache Spark Structured Streaming is a part of the Spark Dataset API. This is an improvement from the DStream-based Spark Streaming, which…

Nov 2, 2020

Dennis de Weerdt in DataPebbles

Data Quality Dashboards: Is Your Data Doing Ok?

Everyone loves data dashboards, right? Fancy visualisations which provide key insights into otherwise opaque data? Of course you do.

Sep 28, 2020

Dennis de Weerdt in DataPebbles

Partitioning and Bucketing in Hive: Which and when?

Lately, I've been getting my feet wet with Apache Hive. Two of the more interesting features I've come across so far have been…

Sep 16, 2020

Dennis de Weerdt in DataPebbles

Moving Spark into Kubernetes

In my previous post, I discussed how to write a simple Spark application in Kotlin, and run it with Airflow. This time around, let's see…

Aug 26, 2020

Dennis de Weerdt in DataPebbles

Spark and Airflow with Kotlin

Recently, I was thinking about something new I could learn, and I ended up with two options. The first was to try working with Apache…

Aug 6, 2020
