Analytics Engineering vs. Data Engineering
With the explosion of new tools and technologies that make collecting, transforming, and analyzing data much easier, we have entered a new era for the capabilities of a modern data team. With that, new roles have emerged and others have shifted focus. In this article, we’ll highlight the Modern Data Stack that has enabled these new roles for data professionals, dive into the newest role, the Analytics Engineer, and explain where that leaves the Data Engineer.
The Modern Data Stack
Compared to fields like software engineering, the data space is relatively nascent, so it should come as no surprise that it’s changing fast. The data tooling landscape is evolving at what feels like breakneck speed, and this has a trickle-down effect on the organization of data teams.
New technologies are automating, or at least streamlining, much of the data engineering work that used to bog us all down. These innovations catalyzed the creation of new roles and overhauled old ones.
The tools driving the organizational changes are the usual suspects that make up what we’ve come to call the modern data stack, and they fit into a few major categories:
- Data warehouse/data lake
- Event streaming
- Data transformation
- Data visualization and BI Tools
Most of these tools are straightforward to set up, and they often don’t require deep technical knowledge. They’re also reliable and, for the most part, just work. Gone are the days of requiring a full-time person just to manage and operate one tool.
These foundational technologies have also given rise to newer categories of tools that build on the Modern Data Stack, like:
- Reverse ETL
- Data Quality Monitoring and Observability
- Headless BI
What is Analytics Engineering?
Analytics Engineering refers to the role that builds clean and flexible data models for use downstream. These models are used primarily for four use cases which I’ll list below along with an example of the associated tool they might end up in:
- Business intelligence — Looker
- Reverse ETL — RudderStack
- Machine learning models — Continual
- Exploratory notebooks — Hex
Analytics engineers are also responsible for ensuring software engineering best practices are followed, like testing, continuous integration, and version control.
Examples of what Analytics Engineering should take on:
- Building and testing dbt models for downstream use
- Refactoring existing dbt models due to changing business requirements
- Working with Data Engineering and other upstream team members to ensure that data is delivered to them in a format that allows them to build the cleanest and most insightful data models
- Working with downstream stakeholders and end users, including business users and data scientists, to understand their requirements and ensure the models being created serve their needs
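To make the "building and testing" bullet a little more concrete, here is a minimal sketch in Python of the kind of checks that dbt's built-in `not_null` and `unique` tests express. It is not how dbt is written or configured (dbt models are SQL files and tests are declared in YAML); it only illustrates the idea, using an in-memory SQLite database as a stand-in for the warehouse. The table names (`raw_orders`, `fct_orders`), columns, and function names are all hypothetical.

```python
import sqlite3


def build_orders_model(conn):
    """Build a cleaned model table from raw data (a stand-in for a dbt model)."""
    conn.executescript("""
        DROP TABLE IF EXISTS fct_orders;
        CREATE TABLE fct_orders AS
        SELECT
            order_id,
            customer_id,
            ROUND(amount_cents / 100.0, 2) AS amount_usd
        FROM raw_orders
        WHERE status != 'test';
    """)


def failing_not_null(conn, table, column):
    """Return the number of rows where `column` is NULL; 0 means the check passes."""
    # Identifiers are interpolated directly, so only pass trusted names.
    query = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    return conn.execute(query).fetchone()[0]


def failing_unique(conn, table, column):
    """Return the number of values that occur more than once; 0 means the check passes."""
    query = f"""
        SELECT COUNT(*) FROM (
            SELECT {column} FROM {table}
            WHERE {column} IS NOT NULL
            GROUP BY {column}
            HAVING COUNT(*) > 1
        )
    """
    return conn.execute(query).fetchone()[0]


def make_demo_connection():
    """Create an in-memory database with a few hypothetical raw rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_orders "
        "(order_id INTEGER, customer_id INTEGER, amount_cents INTEGER, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (?, ?, ?, ?)",
        [(1, 10, 1999, "paid"), (2, 11, 500, "paid"), (3, 10, 0, "test")],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_connection()
    build_orders_model(conn)
    print("null order_ids:", failing_not_null(conn, "fct_orders", "order_id"))
    print("duplicate order_ids:", failing_unique(conn, "fct_orders", "order_id"))
```

Running checks like these in continuous integration, before a model change is merged, is one way the testing and version-control practices mentioned above show up in day-to-day work.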
If you want to learn more about analytics engineering, “What is Analytics Engineering” is a must-read from the team at dbt.
Why Analytics Engineering?
The key value proposition of the Analytics Engineering role is the creation of clean, scalable, and reusable data models. These models can be used to drive the highest-value business decisions, but building them requires a unique skillset. The analytics engineer must understand all aspects of the business and possess the necessary technical skills. They must understand the underlying raw data and the processes on that data that make the business run.
Before dbt changed everything for data transformation, Data Engineering was focused on ingesting data and transforming it in transit. This left the Data Analyst to further transform the data for their needs, leading to inefficient and non-scalable solutions like Looker PDTs or huge amounts of SQL underlying dashboards in tools like Tableau or Chartio. With datasets that are built and tested by an Analytics Engineer, the SQL that your BI tool must run becomes much less complicated, and consumers of the data are less likely to get the wrong answers.
What is Data Engineering?
The rise of Analytics Engineering changed the role of Data Engineering from what it used to be.
Before the modern data stack, data engineers had to spend most of their time just ensuring data ingestion pipelines were running, dealing with database administration, and chasing down errors with queries. Now that the majority of these problems have been taken care of with off-the-shelf tools, data engineering has entered a new phase. Data Engineers are now free from the shackles of just keeping a system running, so they can add more value in other areas.
Some examples of the new responsibilities now under the purview of data engineering are:
Building custom data pipelines from internal services: Internal data pipelines that power a company’s product(s) tend to be custom and usually high volume, so it falls to the data engineer to build the pipelines that deliver this data to the data warehouse.
Managing and optimizing data infrastructure: While there are lots of tools that make our lives in the data world easier, they still need to be managed, maintained, and optimized.
- ETL tools need to be set up and maintained.
- CI/CD pipelines need to be created to ensure that transformations and other pipelines are operating correctly and that changes are vetted before being deployed.
- Data pipeline failures need to be handled — even in the best of cases, there can always be issues that can cause outages for any of these services. Data engineers are best equipped to understand and fix these issues.
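As one illustration of handling pipeline failures, here is a minimal Python sketch of retrying a flaky step with exponential backoff. This is only one common pattern, not a description of any particular orchestration tool, and the names (`run_with_retries`, `make_flaky_step`) are hypothetical. Real pipelines would usually also alert someone and distinguish retryable errors from permanent ones.

```python
import time


def run_with_retries(step, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `step()`; on failure wait and retry, re-raising after the last attempt.

    The wait doubles after each failed attempt: base_delay, 2x, 4x, ...
    `sleep` is injectable so the function can be exercised without real waiting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except Exception as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            delay = base_delay * 2 ** (attempt - 1)
            print(f"attempt {attempt} failed ({exc}); retrying in {delay:.1f}s")
            sleep(delay)


def make_flaky_step(failures_before_success):
    """Build a hypothetical pipeline step that fails a fixed number of times."""
    state = {"calls": 0}

    def step():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise ConnectionError("source temporarily unavailable")
        return "loaded"

    return step


if __name__ == "__main__":
    # Use a short base delay so the demo finishes quickly.
    result = run_with_retries(make_flaky_step(2), base_delay=0.1)
    print("result:", result)
```

Retries only paper over transient problems; a step that fails on every attempt still raises, which is the point where a data engineer needs to investigate.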
Optimizing queries: Data engineers should have a deep understanding of the technologies they’re using, especially the data warehouse. That understanding gives them the insight needed to optimize queries so they make the most efficient use of those technologies.
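A small sketch of what "understanding the engine" can look like in practice: inspecting a query plan before and after a change. This example uses Python's built-in `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN` purely as a stand-in. Cloud data warehouses generally expose their own plan or profile tools and often rely on mechanisms such as partitioning or clustering rather than indexes, so treat the specifics here as assumptions. The table, index, and function names are hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's human-readable plan for `sql` as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the readable description.
    return " | ".join(row[-1] for row in rows)


def make_events_table():
    """Create an in-memory table with some hypothetical event rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (event_id INTEGER, user_id INTEGER, event_type TEXT)"
    )
    conn.executemany(
        "INSERT INTO events VALUES (?, ?, ?)",
        [(i, i % 100, "page_view") for i in range(10_000)],
    )
    return conn


LOOKUP_SQL = "SELECT * FROM events WHERE user_id = ?"

if __name__ == "__main__":
    conn = make_events_table()

    # Without an index, the engine has to scan the whole table.
    print("before:", query_plan(conn, LOOKUP_SQL, (42,)))

    # With an index on the filter column, it can search instead of scan.
    conn.execute("CREATE INDEX idx_events_user_id ON events (user_id)")
    print("after: ", query_plan(conn, LOOKUP_SQL, (42,)))
```

The exact wording of the plan varies by SQLite version, but the first plan should describe a scan of `events` and the second should mention `idx_events_user_id`.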
A bright future in harmony
By harnessing the power of both Analytics Engineering and Data Engineering roles at your organization, you get a whole that is more than the sum of its parts. When the team is working optimally, the two roles support each other: Data Engineering ensures the quality and timeliness of raw data coming into the data warehouse, and Analytics Engineering takes that data and builds the models that power business insights.
With the combined powers of Analytics Engineering and Data Engineering, you find yourself in data nirvana with a full understanding of the data coming and going from the data warehouse. Together, Data Engineers and Analytics Engineers can not only leverage that data for maximum business value, but also suggest what new data should be implemented or extracted to create even more value for the business. The combination of Analytics Engineering and Data Engineering enables companies to develop, iterate, analyze, and take action faster than ever before.
We all stand on the shoulders of giants, and the modern data stack has delivered so much power to all of us who work with data. Let’s continue innovating and building to show how much farther we can go 🚀