Originally posted to my personal blog here.

Databricks Notebooks are commonplace at QueryClick; they combine notebooks and PySpark for EDA and simple Spark jobs. Notebooks are controversial, to say the least: they encourage bad coding standards and have a non-linear flow to their code.

However, they do have their uses.

The Problem

One of our clients runs Adobe Analytics as one of their analytics tools and they wanted to use the data collected by Adobe in our attribution solution. …



Originally posted to my personal blog here.

I took the Microsoft Partner Hack in November 2020 as an opportunity to try out a new method of ingesting Fluentd logs.

What is Fluentd?

Fluentd is a log collector which takes a declarative config file containing input (or “source”) and output information. Wikipedia defines it as:

a cross platform open-source data collection software project

The main idea is that it lets developers collect log information and send it to an endpoint of their choosing, without having to worry about the implementation of the log collection service itself.
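To make the "declarative config" idea concrete, here is a minimal sketch of what such a file can look like. It is illustrative only and is not QueryClick's configuration: the log path, position file and tag are hypothetical, and it assumes the standard `tail` input and `stdout` output plugins that ship with Fluentd.

```
# Input ("source"): follow application log files and parse each line as JSON.
# The path, pos_file and tag values are hypothetical placeholders.
<source>
  @type tail
  path /var/log/example-app/*.log
  pos_file /var/log/fluentd/example-app.log.pos
  tag example.app
  <parse>
    @type json
  </parse>
</source>

# Output: route every event whose tag starts with "example." to standard output.
<match example.**>
  @type stdout
</match>
```

Swapping the `<match>` block for a different output plugin would change where the logs end up without touching the application that writes them, which is the separation described above.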

How do we use it at QueryClick?

At QueryClick, we…

Phil Marius

Data Engineer @ QueryClick
