Originally posted to my personal blog here.
Databricks Notebooks are commonplace at QueryClick; they combine notebooks with PySpark for EDA and simple Spark jobs. Notebooks are controversial, to say the least: they encourage bad coding standards and have a non-linear flow to their code.
However, they do have their uses.
One of our clients runs Adobe Analytics as one of their analytics tools and they wanted to use the data collected by Adobe in our attribution solution. …
Originally posted to my personal blog here.
I used the Microsoft Partner Hack in November 2020 as an opportunity to try out a new method of ingesting Fluentd logs.
Fluentd is a log collector which takes a declarative config file containing input (or “source”) and output information. Wikipedia defines it as:
“a cross platform open-source data collection software project”
The main idea is that it lets developers collect log information and send it to an endpoint of their choosing without having to worry about the implementation of the log collection service itself.
At QueryClick, we…
Data Engineer @ QueryClick