Scaling up analytics in Pipedrive — toward the data mesh

Björn Dahl
Published in Pipedrive R&D Blog
6 min read · Sep 20, 2022
Image by Gerd Altmann from Pixabay

Scaling up analytics in a maturing company brings certain challenges. Answering analytical questions across all the specific business domains quickly pushes the limits of a single, centrally provided analytical service and calls for a change in mindset in order to remain scalable.

This article is about how we are shifting analytics at Pipedrive from a central model to a more distributed approach, and what this has to do with data mesh.

Data mesh is about data domains

What is data mesh?

There’s been talk about data mesh for a while now. Although the concept is far from ready for a full industry-wide rollout, the general problem statement is gaining relevance at many companies.

Data mesh shows how the current mindset and de facto standard for building data analytics systems rely heavily on central teams, central tools and centralized data engineering. It’s “technology-first” thinking. This approach creates data monoliths and competence silos that don’t scale.

What data mesh proposes instead is a shift to “domain-first” thinking. A “data domain” means the digital assets within a business area. The boundaries between domains are often reflected in the organization’s structure (e.g., its departments). Keeping the domain front and center means that analytical business problems, and the solutions designed to resolve them, are never separated from where they originate: the business itself. With the right kind of support from the organizational structure, business teams can remain close to the whole data lifecycle and exercise data ownership in the most effective way possible.

Find more information on data mesh here: https://martinfowler.com/articles/data-monolith-to-mesh.html.

Technology or domain first?

There are good reasons why we haven’t always operated “domain-first” throughout the past decade’s evolution of analytical and data warehousing solutions. It has to do with specific technical and tooling limitations which demand certain technical competences.

Data pipelines have required — and still do, to a large extent — specific technical work with fairly low-level tools. For the data flow to run smoothly from instrumentation to collection, cleansing, standardization and modeling, and to analytical outputs like reports and dashboards, we need data engineers with very technical skill profiles. A lot of this work we call software engineering.
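
To make the stages above more concrete, here is a minimal Python sketch of such a flow. Everything in it is hypothetical: the event fields, the function names and the sample records are illustrative assumptions, not Pipedrive's actual pipeline. Real pipelines would use dedicated ingestion and transformation tooling instead of plain functions.

```python
from collections import defaultdict

# Hypothetical raw events, as they might arrive from instrumentation.
RAW_EVENTS = [
    {"user": " Alice ", "event": "Deal_Won", "amount": "120.50"},
    {"user": "bob", "event": "deal_won", "amount": "80"},
    {"user": "", "event": "deal_won", "amount": "15"},         # missing user
    {"user": "alice", "event": "deal_lost", "amount": "n/a"},  # bad amount
]


def collect(source):
    """Collection: copy raw records from the source."""
    return [dict(record) for record in source]


def cleanse(records):
    """Cleansing: drop records with no user or a non-numeric amount."""
    clean = []
    for record in records:
        if not record["user"].strip():
            continue
        try:
            float(record["amount"])
        except ValueError:
            continue
        clean.append(record)
    return clean


def standardize(records):
    """Standardization: normalize casing, whitespace and types."""
    return [
        {
            "user": record["user"].strip().lower(),
            "event": record["event"].lower(),
            "amount": float(record["amount"]),
        }
        for record in records
    ]


def model(records):
    """Modeling: total won amount per user, ready for a report."""
    totals = defaultdict(float)
    for record in records:
        if record["event"] == "deal_won":
            totals[record["user"]] += record["amount"]
    return dict(totals)


def run_pipeline(source):
    """Run the stages in order, from raw events to a report-ready model."""
    return model(standardize(cleanse(collect(source))))


report = run_pipeline(RAW_EVENTS)
print(report)
```

Even in this toy version, each stage carries decisions (what counts as a bad record, how names are normalized) that tend to need engineering care, which is part of why this work has traditionally sat with central data engineers.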

Read more about this challenge in our data engineering blog post: https://medium.com/pipedrive-engineering/data-engineering-the-unsung-hero-e83692dc88b5

Image by Peace with Love from Pixabay

The future is here — at least a slice of it

The data tooling industry isn’t yet ready to hand business teams full responsibility for managing a domain’s data flow end to end, from data sources through storage and transformations to analytical outputs. Most of these tools are too complex, and the technical skills they require are simply too advanced.

However, we’re at an evolutionary point in the industry. Nowadays we can, and therefore should, hand business teams the responsibility for analytical work such as the reports and dashboards that make up the “last mile” of a data analytics service.

Although today’s reporting and dashboarding tools require data literacy and other related skills, this isn’t out of reach for a competent data analyst. “Data analyst” in our context means someone who is well aligned with and aware of the business challenge that needs to be resolved, and who is also capable of working with data models and queries and shaping the results in a visual reporting tool like Tableau.
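
As an illustration of that kind of last-mile work, the following sketch runs an analyst-style query over a small, already modeled table. The table name, columns and figures are invented for the example, and an in-memory SQLite database stands in for whatever warehouse is actually in use.

```python
import sqlite3

# In-memory database standing in for a modeled warehouse table (hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deals (region TEXT, status TEXT, value REAL)")
conn.executemany(
    "INSERT INTO deals VALUES (?, ?, ?)",
    [
        ("EMEA", "won", 1000.0),
        ("EMEA", "lost", 400.0),
        ("EMEA", "won", 500.0),
        ("AMER", "won", 700.0),
        ("AMER", "lost", 300.0),
    ],
)

# Won value and win rate per region: the kind of metric a dashboard would show.
QUERY = """
SELECT region,
       SUM(CASE WHEN status = 'won' THEN value ELSE 0 END) AS won_value,
       ROUND(1.0 * SUM(CASE WHEN status = 'won' THEN 1 ELSE 0 END) / COUNT(*), 2)
           AS win_rate
FROM deals
GROUP BY region
ORDER BY region
"""

rows = conn.execute(QUERY).fetchall()
for region, won_value, win_rate in rows:
    print(f"{region}: won value {won_value:.0f}, win rate {win_rate:.2f}")
```

The query itself is simple. What the analyst adds is knowing which definition of “won” and which grouping the business actually cares about.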

Having dedicated data analysts working directly within business units, close to actual business challenges, makes sure data domains get the attention they need and keeps domain ownership in the right place.

Is our current approach sufficient?

The central data service model still works in some areas

At Pipedrive, our data analysts used to be part of a central data team, along with all other roles related to data pipelines, tools and running regular data loading flows. That meant everything from sources to destinations like reports and dashboards.

Moving the organization toward a more mesh-like, less central setup can seem like a big deal. Therefore, there must be good arguments supporting it.

On one hand, the same old central data service model mostly still works. This is especially true for “lower levels” of data flow management like data ingestion, cleansing and standardization — even more so if these can be resolved through automations.

On the other hand, central teams might have to grow two- or three-fold to keep up with the demand for organizational data and analytics. This would need careful organizational planning.

Don’t ignore the signals

Are we providing the needed analytical outputs fast enough and at the volume requested by business teams? Do we understand the questions people are asking? Do we take proper ownership of the metrics and dashboards we build, and will we support and maintain them in the future? If the answer to any of these is a definite “no”, it’s time to review the service from a scalability point of view. If increasing the number of data analysts in the central service doesn’t improve output quality fast enough, it’s time to consider an organizational change.

Image by MetsikGarden from Pixabay

Rolling out changes

You will ship your org chart

Any change at a larger organization must be supported by how the organization functions internally. No matter how agile the mindset of those involved, the department and team boundaries, budgets, hiring processes and lines of communication will come into play and shape results. It’s also worth keeping in mind that in the long run, the budget owner always dictates everyday priorities. If the goal is to have a team of analysts in each business department, these changes have to make sense within the actual department structures and be agreed with the leaders of each internal domain.

Leadership support is key

The pain points acknowledged by senior management are a good place to start. Starting there makes sure any fluctuations caused by the change are seen in the right light and follow a shared vision of a brighter future. Clear communication and weighing the pros and cons of different solution scenarios help generate wider support to overcome potential blocks during the rollout process.

New is better, but it’s not magic

A decentralized setup brings its own complications. One challenge with distributed data analyst teams is having adequate support to hone skills, establish best practices and strengthen cooperation. For us, the solution was to establish a new analytics guild structure. The guild helps tie up loose ends from the organizational change and functions as the central forum for analytical work. We also face several other company-wide challenges, like data governance, standardizing tools and strategic planning of analytics. We plan to tackle all of these with the help of the guild structure.

We envisioned the analytics guild as a voluntary group. In order to support this, there had to be an effective arrangement that would allow every analyst the time for central cooperation and company-level challenges. For this purpose, a clear agreement was made within the executive team that up to 20% of every analyst’s time and attention could be devoted to activities initiated by the guild.

A smooth transition from “old” to “new”

At Pipedrive, we opted for a transition period of a few months from the existing team structure, where data analysts were located in and hired into a central data team, to the new setting, where each department is free to form its own domain-dedicated analytical teams according to its needs. The existing analysts team was phased out slowly, with enough time for team members to find suitable new positions in the organization and keep working in a happy and productive manner.

Conclusion

In the long run, I believe in the data mesh approach: prioritizing the data domain over technical and organizational issues. There are areas where we at Pipedrive can already apply it, like data analytics. Your mileage may vary, depending on the business and technical complexities of the data challenge.

Shifting from a central analytical team to decentralized teams is the way we at Pipedrive scale up analytics, enabling business teams to extract the insights they need and grow their business.

