How we work at data.works — Meet Sven — Data Engineer

Otto Group data.works
Aug 5, 2022

Are you interested in data and cloud technology? Would you like to understand more about different job profiles? Are you looking for a new job? This blog post series is about our way of working and our job profiles at Otto Group data.works.

Today, meet Sven Peeters, Data Engineer at Otto Group data.works since February 2021.

Hi Sven! Can you introduce your job in 3 sentences?

In short, as a Data Engineer I am responsible for designing and implementing new features for different data-driven services. Our products range from computing customer profiles at large scale to hosting HTTP services that categorize the content of webpages using machine learning. The data we provide is mainly used to enrich programmatic advertising with behavioral and contextual information.

Sounds interesting! Give us some more details into your working day!

My daily work is manifold and hard to describe in full in this short blog post. In my team, I work closely with Machine Learning Engineers and Data Scientists. Hence, my tasks are often related to machine learning in addition to the usual data and software engineering topics, and can be summarized in the following areas:

  • Design and implementation of big data ETL pipelines that make predictions, using Python, Java, Kubeflow Pipelines, BigQuery SQL, Dataflow and Kubernetes (GKE)
  • Design and implementation of microservices using Java, Python, Cloud Functions, Cloud Run, App Engine and Kubernetes (GKE)
  • Management of infrastructure on Google Cloud Platform with Terraform
  • Maintenance of all products my team and I have built, including bug fixes, patching and support
  • Working closely with our external clients and Otto Group companies to ensure customer satisfaction

A recent example is the prototype of a user-group-based targeting solution for the cookieless world. I was mainly engaged in the design and implementation of the Kubeflow pipeline, which dispatches BigQuery SQL queries to compute user groups with similar interests in our data warehouse and exports them using Dataflow.
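To give a rough impression of what such a step looks like, here is a minimal sketch of dispatching a BigQuery query from Java with the google-cloud-bigquery client. The dataset, table and grouping logic are hypothetical placeholders, and in the real product this step is orchestrated as part of a Kubeflow pipeline rather than running as a standalone program.

```java
// Requires the google-cloud-bigquery client library.
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.QueryJobConfiguration;
import com.google.cloud.bigquery.TableResult;

public class DispatchUserGroupQuery {
  public static void main(String[] args) throws InterruptedException {
    // Uses the application default credentials of the surrounding GCP environment.
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

    // Hypothetical query: bucket users into coarse interest groups inside the warehouse.
    String sql =
        "SELECT interest_group, COUNT(DISTINCT user_id) AS group_size "
            + "FROM `my-project.audiences.user_interests` " // placeholder dataset/table
            + "GROUP BY interest_group";

    QueryJobConfiguration queryConfig =
        QueryJobConfiguration.newBuilder(sql).setUseLegacySql(false).build();

    // Dispatch the query and wait for it to finish; in the real pipeline this step would
    // write its result to a table that a Dataflow export job picks up afterwards.
    TableResult result = bigquery.query(queryConfig);
    result.iterateAll().forEach(row ->
        System.out.printf("%s: %d users%n",
            row.get("interest_group").getStringValue(),
            row.get("group_size").getLongValue()));
  }
}
```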

Besides coding and drafting cloud architectures, I spend some time every day sharing best practices and technology news, or I just play a match of table football (kicker) with my great colleagues from data.works.

What are you most proud of since starting in our team?

I am most proud of the rework of the full-text search backend of our contextual targeting product. Due to heavily increasing load on the product, our old PostgreSQL backend was struggling and had reached the limits of its scalability. Although I was still a junior Data Engineer at the time, my team enabled me to take a leading role in the design and development of the new backend. This included choosing Apache Solr as our new full-text search engine, designing the cloud architecture, and migrating or rewriting the affected components of the old backend. The new backend has been online since April and runs like a charm.

You already mentioned many different technologies that you’re using. What is your favorite one and why?

In our daily work we use many recent and impressive technologies like Kubeflow Pipelines, Argo or Apache Beam on Dataflow, so choosing one that stands out is not easy. If I had to pick a favorite, I would choose Dataflow in conjunction with Java. The serverless approach of Dataflow and the pipeline semantics of Apache Beam, which unify batch and stream processing, let me write Java code that processes huge amounts of data without spending hours thinking about how to scale the program and distribute the data effectively across machines. Jobs that would run for days on a single machine are seamlessly executed within minutes on more than 100 machines in parallel. It’s always fun to see how my code gets distributed and executed in parallel across a huge number of machines in the Google Cloud Platform.
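To illustrate why this model feels so lightweight, here is a minimal Beam pipeline sketch in Java that reads user groups from BigQuery and writes them to Cloud Storage; passing --runner=DataflowRunner (plus project, region and a temp location) is all it takes to execute the same code on Dataflow. The table and bucket names are made-up placeholders, not our actual product.

```java
// Requires the beam-sdks-java-core and beam-sdks-java-io-google-cloud-platform dependencies.
import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.TypeDescriptors;

public class ExportUserGroups {
  public static void main(String[] args) {
    // Picks up --runner, --project, --region, --tempLocation etc. from the command line;
    // with --runner=DataflowRunner the very same pipeline runs distributed on Dataflow.
    Pipeline pipeline = Pipeline.create(
        PipelineOptionsFactory.fromArgs(args).withValidation().create());

    pipeline
        .apply("ReadUserGroups",
            BigQueryIO.readTableRows()
                .fromQuery("SELECT user_id, interest_group "
                    + "FROM `my-project.audiences.user_groups`") // placeholder table
                .usingStandardSql())
        .apply("FormatAsCsv",
            MapElements.into(TypeDescriptors.strings())
                .via((TableRow row) -> row.get("user_id") + "," + row.get("interest_group")))
        .apply("WriteToGcs",
            TextIO.write().to("gs://my-export-bucket/user_groups").withSuffix(".csv")); // placeholder bucket

    pipeline.run();
  }
}
```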

Sounds like you’re really passionate about your job! And what are you doing when you close your work laptop?

In my free time I like to play table tennis with friends at my local club. Besides table tennis, I have a huge passion for motorsport: I watch almost every session of each Formula 1 weekend and love to spend some time on virtual racing circuits.

To sum it up, what is the best about working at Otto Group data.works?

Otto Group data.works offers me the opportunity to pursue my passion for big data processing and machine learning, and to keep learning, while developing smart and cool large-scale data products in a young and dynamic team of data enthusiasts.

If you think that Sven’s job sounds interesting, have a look at our open positions! We also recommend reading through our blog to learn more about our products and colleagues.
