Published in VALIDIO

Image source: Simon Fraser University

The rise and future of data engineering — what’s it all about?

I still remember when, not that many years ago, it was common to refer to a data scientist as a “unicorn”. The expectation was a super full-stack engineer and mathematician who could also understand every business problem. Over the past two years, however, as we’ve passed the peak of the AI/ML hype, we’ve witnessed the rapid rise of the data engineer. Dice’s 2020 Tech Job Report cites data engineering as the fastest-growing job in tech in 2020, growing by a staggering 50%.

Data Engineer is the fastest-growing job in tech (Source: Dice 2020 Tech Job Report)

The rise and evolution of the data engineer

Today, cloud data warehouses (Snowflake, Amazon Redshift and Google BigQuery) and lakehouses (Databricks) let teams store massive amounts of data in a way that’s useful, isn’t completely cost-prohibitive and doesn’t require an army of very technical people to maintain. In other words, after all these years, it is now finally possible to store and process Big Data.

Everything is trending towards a bright future for data engineering

The next evolution of the data engineer

But what is data engineering today?

Data engineering today = data pipelines?

Data engineers come in different shapes and colors

Data Science Hierarchy of Needs (Source: Monica Rogati)

A real-world example from a fast-growing scaleup with a modern data team

A high-level view of how the three roles have similarities and differences in their focus (Source: Oda)

Being a data engineer = shiny new tools?

Example Data Stacks (simplified) we often see at Validio

Does data engineering exist for the sake of data science?

Image source: Marijn Markus

Final thoughts

A disclaimer: I’m not a data engineer myself. This post and its observations are based on numerous discussions I’ve had with data teams, from fast-growing startups and scaleups to large publicly traded companies.


Whether your pipelines are batch or streaming, stop firefighting data failures with Validio, the next-generation data quality platform for modern data teams.

Oliver Molander

Co-founder at Validio and early-stage tech investor at J12 Ventures. Preaching about the realities & possibilities of Data & ML.