Dagster: Data Engineering Best Practice during Slack Time

Published in OTTO Tech
3 min read · Nov 15, 2023

In the inc(AI) team, we regularly try out new tools and methods. Slack time has proven to be a good way to do this, whether we learn on our own or together with others.

Of course, the motivation for slack time comes from the challenges we face at work. Our strategy is to look to the open-source community first for solutions that suit us, because it is very likely that someone else has already solved a similar problem. In addition, the solutions have to meet our criteria: simple, scalable, and production-friendly.

So, a while ago, we used our slack time to examine a new tool called Dagster. Dagster is an orchestration tool that controls and manages the tasks in data and ML pipelines. It determines where and when the individual steps of a pipeline are executed, and it stores metadata along the way.

Fig. 1: Data lineage of a use case

The use of Dagster contributes to OTTO’s moal (mid-term goal) of “effective and efficient organization”. In our area in BI, we focus on “best practices for leveraging technical synergies”.

Orchestrating pipelines is a standard task in our day-to-day work. Comparable tools existed before Dagster, and they are used in BI, each with its own characteristics. However, Dagster does quite a few things differently. For example, it introduces the concept of “software assets” (software-defined assets in Dagster’s own terminology), which improves the structure of pipelines and increases reusability. Moreover, the data lineage can be visualized easily, which is a major advantage when troubleshooting.
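To make the asset idea a bit more tangible, here is a minimal sketch in plain Python. It is a toy illustration and not Dagster's API or implementation: the `ASSETS` registry, the `lineage` and `materialize_all` helpers, and the example data are all hypothetical. What it borrows from Dagster is the convention that each asset is a function and that its upstream assets are read from its parameter names, which is also how lineage can be derived from the definitions alone.

```python
import inspect
from graphlib import TopologicalSorter

# Hypothetical registry for this sketch: asset name -> function that computes it.
ASSETS = {}


def asset(fn):
    """Register fn as an asset; its parameter names name its upstream assets."""
    ASSETS[fn.__name__] = fn
    return fn


def lineage():
    """Map every asset to the set of assets it depends on."""
    return {
        name: set(inspect.signature(fn).parameters)
        for name, fn in ASSETS.items()
    }


def materialize_all():
    """Compute all assets in dependency order and record simple metadata."""
    deps = lineage()
    values, metadata = {}, {}
    # static_order() yields each asset only after all of its upstream assets.
    for name in TopologicalSorter(deps).static_order():
        inputs = {dep: values[dep] for dep in deps[name]}
        values[name] = ASSETS[name](**inputs)
        metadata[name] = {
            "upstream": sorted(deps[name]),
            "size": len(values[name]),
        }
    return values, metadata


@asset
def raw_orders():
    # Made-up example data standing in for a real source table.
    return [
        {"customer": "a", "amount": 10},
        {"customer": "b", "amount": 5},
        {"customer": "a", "amount": 7},
    ]


@asset
def orders_per_customer(raw_orders):
    # Depends on raw_orders simply by naming it as a parameter.
    totals = {}
    for order in raw_orders:
        totals[order["customer"]] = totals.get(order["customer"], 0) + order["amount"]
    return totals


values, metadata = materialize_all()
print(lineage())
print(metadata)
```

Because each asset is an ordinary function with declared inputs, it can be reused in other pipelines and checked in isolation, and the dependency map returned by `lineage()` is the kind of information a lineage view is drawn from.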

Fig. 2: Metadata of an asset

Dagster can be easily installed and run on a local computer. This is especially helpful while a pipeline is being developed, because it significantly shortens test cycles. A Dagster cloud instance then handles production operation. Dagster’s web user interface is very user-friendly, so even people without a deep technical background can operate the production pipeline.

There is no simple path without obstacles:

Of course, we also had some difficult moments with Dagster. One example lies in the nature of open-source software: very fast development can lead to incompatibility and instability. In such cases, we discuss in detail together which change or fix our project needs, and then we share our findings.

Once we had successfully tested Dagster in our internal projects and handed it over to individual teams, we were able to migrate an existing use case from Argo to Dagster together with team Warp.

At the moment, several teams are evaluating Dagster for their use cases, and interest is growing. So is the group of people who exchange their views on Dagster and support each other in their day-to-day work. This is a double win for inc(AI): we create a solution for ourselves and for the other teams as well.

Find this and more interesting articles in our Tech Blog.

Written by Team inc(AI)
