Increase Efficiency in Your Analytics Projects

Marek Zelc
GoodData Developers
8 min read · Jun 26, 2024

As the Head of BI at GoodData, I work with analytics daily and am constantly looking for ways to enhance the delivery of our analytics projects. Based on my experience, here are some key strategies to increase efficiency.

Struggles in any data project

Every data project you undertake comes with its own struggles and obstacles. Some of them are very common; others stem from the particular state of the organization or of the project itself.

Capacity

Let’s face it — there is not enough capacity to maintain and deliver everything required all the time. New requirements always come, priorities shift, and existing stuff breaks or changes. The data world is no exception.

Sometimes, you can increase the team’s capacity, but let’s be realistic here — it’s not that common. So your only option is to work on efficiency. Let’s check how to improve it.

Understanding the Data

To work on any data project, you must understand what data you need. The data person needs a deep understanding of the data to deliver meaningful analytics for its end users.

There are huge differences in how different data teams work and how the responsibilities are divided. I’ll mostly address smaller teams where data engineers and analysts often share responsibilities, making data comprehension even more important. Most of the time, the biggest problem is the amount of data, which gets even bigger with each new data set.

To deliver valuable analytics, the person delivering it usually has to have a very deep understanding of the data, not only at the integration level (primary/foreign keys, cardinalities, possible null values, and so on) but also at the meaning level — what particular values mean, which values are allowed, and more. In many use cases, the data team also has to explore the data and deal with data errors.

Specifications

In most projects, it’s difficult to get a full requirement specification. Differing perspectives are a common cause — “It’s clear to me, so it’s clear to you too, right?” This problem is not specific to analytics.

In data projects, complexity comes from several directions: from the data itself, from the requirements, or from the underlying data infrastructure.

Increasing Efficiency

It’s common sense that addressing these difficulties increases the efficiency of the data team. In our case, we have tried to modernize our data stack. Our big breakthrough was switching to the “analytics as code” approach last year. With it, we have been able to stabilize our whole solution while working effectively with data at scale. It’s easier to reuse data and transformations, as well as to spin up new use cases.

With that solution in place, we tried a new approach in a few use cases, and the results exceeded our expectations. At the same time, keep in mind that this is not a universal, cookie-cutter solution. Still, there is a good chance it will work for you.

Power Users

So, what is the key to our success? People on the requestor’s side who are willing to get their hands dirty and work with the analytics directly! I’ll refer to them as the “Power Users” of analytics: people who already work with some analytics or raw data, often manually. They are truly unsung heroes.

These are the users who already understand the data and know what is needed from the analytics project. Best of all, they are willing to step out of their comfort zone and learn a few new things.

When the data team gets a new requirement, we need to prioritize it and estimate the delivery, which, from the requestor’s point of view, is always too late. But… what if we split the work and let everyone work effectively?

For this, I have four basic assumptions:

  • The Power User knows what they want
  • The Power User knows a lot about their data
  • The Data Team knows how to work effectively with the data
  • The Data Team knows the tooling and can help the Power User learn the important parts of it faster.

Sounds about right?

Efficient Work

The key to our success was using the strengths of everyone involved. So, let the Power User describe what they need, let the data team identify the data required, and then discuss the data connections (the data model) together. There is no need to understand every attribute along the way; it’s enough to know they exist. Focus on the interconnections and the desired outcomes.

There is a place for the data team to do the boring stuff — preparing the data. Yes, it can be pretty time-consuming, and not all requirements can be addressed quickly, but in plenty of cases you just need to add a simple dataset, combine it with existing ones, and maybe add a touch here or there. Focus on delivering the basics — get the data into the model you agreed on; do not aim for perfection at this point. Perfection is often an illusion, sometimes even a toxic one.

The key part starts now — give the data to the Power User and let them learn the basics of the tools. If the tooling is easy enough to learn and use, the user has a unique opportunity to get analytics that fits their needs and is delivered more efficiently than in the regular flow.

Once the first version is out and being used, you will always find things that could be better, things that are uncomfortable to work with, and things that are missing. But with the cooperation already established and the working experience everyone has gained, each further iteration becomes easier.

And in real life?

Tale of Robert and Marek

Once upon a time, there was an Engineering Director named Robert. He and his team went through a shift in their work style, and Robert was looking for metrics to show how well they were doing. They went for DORA metrics (deployment frequency, lead time for changes, change failure rate, and time to restore service).

In the neighboring kingdom, Marek, the BI Team Manager and partly a Data Engineer, did all the regular stuff with his team (hoarding the data, trying to clean and fix it, and providing it to different people, all while trying not to go crazy!). The immediate roadmap was planned, with a lot of ongoing work.

Robert and Marek met and agreed that getting DORA metrics in two or so months would be too late, so they decided to do something about it.

And that’s where our efficiency improvement started.

What we have

Let’s move out of the tale and back to GoodData. Initially, we had a pretty well-maintained data stack, and the relevant part was built according to our latest blueprint, which included the “as-a-code” approach. That means Meltano extractors, a Snowflake data warehouse, dbt transformations, and GoodData Cloud as the analytics tool.
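To make the stack concrete, here is a minimal sketch of how such stages could be chained from a single Python script. The plugin names (tap-jira, target-snowflake), the project path, and the one-script orchestration are illustrative assumptions; the article does not describe our actual scheduler.

```python
# A hypothetical orchestration sketch for a Meltano + dbt + GoodData stack.
# Plugin names and paths are assumptions made for illustration only.
import subprocess


def run_stage(cmd: list[str]) -> None:
    """Run one pipeline stage and fail fast if it exits with an error."""
    print(f"Running: {' '.join(cmd)}")
    subprocess.run(cmd, check=True)


# 1. Extract & load: Meltano moves raw data (e.g., Jira) into Snowflake.
run_stage(["meltano", "run", "tap-jira", "target-snowflake"])

# 2. Transform: dbt builds the models that the analytics layer consumes.
run_stage(["dbt", "build", "--project-dir", "transform"])

# 3. The GoodData Cloud layer is also kept "as code"; see the sketches later
#    in the article for how the declarative model can be pulled and pushed.
```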

We already had experience with most of the relevant data sources (Jira, GSheets, CSV extracts, etc.) and could reasonably scale the data pipeline solution.

Another key factor was the long history of cooperation between the Engineering and Data teams.

When DORA metrics came into our focus, the data team was fully utilized on existing projects, and a regular project implementation was not an option for the next two months. But a brave man from the Engineering team agreed to get his hands dirty in exchange for getting the metrics faster.

The Cooperation

In the beginning, we reviewed the required results and identified the data needed for the first phases of the project. With a solid idea of what we wanted to achieve, we identified the important points in the datasets and outlined the interconnections between them. Our initial aim was to build a data model that would be easier to understand for people without a “data background.” This initial scoping took us less than three hours.

After that, the data team prepared the initial data model (with the mentioned as-a-code approach and our experience with the needed data, it took us less than a day, a delay to other projects that we could afford). We skipped most of the usual data exploration and focused only on the key data points.
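For readers curious what this as-a-code step can look like, here is a minimal sketch using the gooddata-sdk Python package. The host, token, and workspace ID are hypothetical, and method names may differ slightly between SDK versions.

```python
# A minimal sketch of managing the logical data model (LDM) "as code" with the
# gooddata-sdk package. The workspace ID and environment variable names are
# assumptions for this example.
import os

from gooddata_sdk import GoodDataSdk

sdk = GoodDataSdk.create(
    os.environ["GOODDATA_HOST"],   # e.g., https://<org>.cloud.gooddata.com
    os.environ["GOODDATA_TOKEN"],
)

WORKSPACE_ID = "dora-metrics"  # hypothetical workspace

# Pull the current logical data model so it can be reviewed and versioned
# alongside the dbt models...
ldm = sdk.catalog_workspace_content.get_declarative_ldm(WORKSPACE_ID)

# ...and push it back once it has been updated in the repository.
sdk.catalog_workspace_content.put_declarative_ldm(WORKSPACE_ID, ldm)
```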

This is where the main time savings came from. The tool let the Data and Engineering teams combine their strengths. Yes, the tool is our GoodData Cloud. It allows the Data team to connect the data to it effectively (“as-a-code” — not a single person on the data team touched GoodData by hand) and, more importantly, allows anyone to use the data within an intuitive UI.

Of course, we needed to introduce Robert to the GoodData world, as he was not an “editor” user at the time. But this roughly one-hour investment (plus a few small consultations later) paid off, saving us hours if not days. Robert was able to start working with the data quickly and prepared the first dashboards within a few hours. This is the second point where we saved effort — Robert already knew the data and did not need to spend extra time exploring it.

As in every similar data project, there were data issues — gaps, errors, and so on. With Robert on board, fixing them was much faster than usual. In most cases, we just identified what prevented the analytics from producing the required result, and then Robert himself could fix the data (filling in missing fields in Jira, repairing wrongly entered bug attributes, and so on).

The above resulted in usable dashboards being delivered within a week. We completed the second phase (more granular data about delivery pipeline jobs) similarly quickly afterward.
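As an illustration of the kind of metric these dashboards cover, the sketch below computes a weekly deployment frequency from a table of delivery pipeline jobs. The column names and sample data are made up for this example; our real model lives in GoodData, not in pandas.

```python
# Hypothetical illustration: weekly deployment frequency (a DORA metric)
# derived from delivery pipeline job records. All data here is fabricated.
import pandas as pd

jobs = pd.DataFrame(
    {
        "job_id": [101, 102, 103, 104],
        "finished_at": pd.to_datetime(
            ["2024-03-04", "2024-03-06", "2024-03-11", "2024-03-14"]
        ),
        "environment": ["production", "staging", "production", "production"],
        "status": ["success", "success", "success", "failed"],
    }
)

# Only successful production jobs count as deployments.
deployments = jobs[(jobs["environment"] == "production") & (jobs["status"] == "success")]

# Count deployments per calendar week.
weekly_frequency = deployments.set_index("finished_at").resample("W")["job_id"].count()
print(weekly_frequency)
```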

And one small thing — despite our BI environment being fully “as-a-code,” Robert did not write a single line! GoodData just allows us to get the code. Another benefit of our setup is having an extra environment for working on changes without disturbing production.
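Here is a hypothetical sketch of what “getting the code” can look like, again with the gooddata-sdk package and the same caveats as before: the workspace IDs are made up, and method names may vary between versions. It exports the analytics built in the UI and replays them into a separate workspace.

```python
# A hypothetical sketch: export the dashboards, metrics, and visualizations
# built in the UI as a declarative definition, then copy them into a separate
# environment so changes do not disturb production. Workspace IDs are made up.
import os

from gooddata_sdk import GoodDataSdk

sdk = GoodDataSdk.create(os.environ["GOODDATA_HOST"], os.environ["GOODDATA_TOKEN"])

# Export the analytics objects from the production workspace...
analytics = sdk.catalog_workspace_content.get_declarative_analytics_model("dora-metrics")

# ...and replay them into a staging workspace for safe experimentation.
sdk.catalog_workspace_content.put_declarative_analytics_model("dora-metrics-staging", analytics)
```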

Since then, we have repeated the process with several other use cases, reaching similar results.

(Image: part of the resulting dashboard)

Conclusion

The described approach allowed us to increase the number of data projects we deliver. Of course, not all projects were delivered in this manner, for various reasons. But those that we did deliver this way were done with fewer resources and much faster.

There are two key components of success here — people willing to go the extra mile (by learning new things) and tooling that allows them to work effectively. Without one or the other, this approach simply won’t work.

Want to learn more?

If you are interested in the Analytics as Code approach, I recommend reading What is Analytics as Code. We also have an article dedicated to the journey of our Data Pipeline as Code.

If you wish to try the Analytics as Code approach, you can use our free trial!
