Speaking with friends of mine I often hear: “Oh Kalman Filters… I usually study them, understand them and then I forget everything”. Well, considering that Kalman Filters (KF) are one of the most widespread algorithms in the world (if you look around your house, 80% of the tech you have probably has some sort of KF running inside), let’s try and make them clear once and for all.
By the end of this post you will have an intuitive and detailed understanding of how a KF works, what’s the idea behind it, why you need multiple variants and what are…
I recently started to use BigQuery and I must admit I fell in love with the DB…
This article is my attempt to explain the technology behind it, which is a requirement to efficiently utilise the DB in terms of cost and performance.
BigQuery is the public implementation of Dremel that was launched by Google to general availability.
Dremel is Google’s query engine and it is able to turn SQL queries into an execution tree which reads data from Google’s distributed filesystem. …
In every startup comes the time when you have to design a large-scale ingestion of analytics events.
The main idea is that you have a backend logging events and you want to have analytics insights about them.
This is pretty much one of the classic data engineering tasks.
The requirements you have in startups might sound like this:
This post is based on a talk I recently gave to my colleagues about Airflow.
In particular, the focus of the talk was: what Airflow is, what you can do with it, and how it differs from Luigi.
It’s really common in a company to have to move and transform data.
For example, you have plenty of logs stored somewhere on S3, and you want to periodically take that data, extract and aggregate meaningful information, and then store it in an analytics DB (e.g., Redshift).
Usually, these kinds of tasks are first performed manually, then, as things need to scale up…
THIS IS A TRUE STORY.
The events depicted in this blog post
took place in London in 2018.
At the request of the survivors,
the function names have been changed.
Out of respect for the dead code,
the rest has been told exactly
as it occurred.
Today I stumbled on a very good Go question.
Someone was working with a library, where a function had some hard-coded parameters.
The author of the library defined the function following a convention over configuration approach. If you are not familiar with this design pattern, basically it aims at decreasing the number of decisions…
The aim of this post is to show you how amazing empty structs are in Go, and I will do that by showing you some examples.
An empty struct is a struct type without fields.
The cool thing about an empty structure is that it occupies zero bytes of storage.
You can find an accurate description of the actual mechanism inside the Go compiler in this post by Dave Cheney.
The short version is:
The size of a struct is the sum of the size of the types of its fields, since there are no fields: no size!
This article has been inspired by this great talk by Dave Cheney, my personal frustration on a day at work and some other references on the net.
I do not know how many times I’ve written this code in Python.
The purpose is simple:
we have some tasks to do (e.g., some resources to access) and we decide that we want to do them in parallel.
As a rule of thumb, if the tasks we want to run are not CPU heavy (I/O operations are a good example), our best bet is threads.
I promised some…
I stole this line from a friend of a friend working in the army…that’s his company motto and that should become the motto of any software engineer out there.
An easy rephrasing would be: don’t be lazy.
Hundreds of times a day, we face a choice: be lazy or do the right thing.
Deep inside, we know what we should do…almost every time.
The right thing is also usually the hard thing, the one that makes us feel that mental strain that we don’t want to feel.
Cal Newport, author of “So Good They Can’t Ignore You” and “Deep Work”…
I love Go.
I love how it is a programming language written by engineers for engineers.
I love how it is focused on functionalities and forces you to think in terms of them when you engineer your code.
I started using Go a few months ago, when I moved from building data analysis pipelines in Python to building them in Go.
I believe that Go gives you better overall infrastructure management than Python, maybe at the cost of some analysis power. Please note that this is very application dependent.
This article is a random compilation of nice things I…
Since I started to improve my software engineering skills, one thing that has always puzzled me is: how to write a good function.
The usual comment you hear about writing good functions is:
A function should do one thing and one thing only (possibly it should do it right).
I have always found this comment kind of counterintuitive: How do you define one thing?
Suppose you are writing a function to import some variables from a file into a structure (basically an unmarshalling function of some sort). The function is doing one thing: parsing variables from a file…So do you need…
Tech enthusiast, life-long learner, with a PhD in Robotics. I write about my day to day experience in Software and Data Engineering.