How To Choose And Plan Your First Data Product?

Orya Roseman
Published in Inside Bizzabo
4 min read · Nov 27, 2022

Our second blog article focuses on our process for choosing and planning the development of our first data product. Once we had identified our first domain, we started analyzing the domain's business questions and the data products that would answer them. In parallel, we spread the #datamesh principles and their purpose within the organization.

In our first post, we stressed the importance of designing data products with a specific goal that delivers value. With that idea in mind, we looked for a first high-value data product: one that was straightforward, simple, and easy to understand.

Data Products Require Data Contracts

When we started using the #datamesh methodology, our first rule was to accept unknowns. For example, we might not have every process step defined, or the best data catalog tool chosen. That was a non-issue as long as we kept iterating.

With that in mind, we started sketching the first version of a #datacontract, following a template we created to generalize the process.

But First, What Is A Data Contract?

The purpose of data contracts is to facilitate and promote data sharing. Data contracts are the external and observable representations of the data products, and they should communicate the underlying business semantics of the data products to potential consumers.

Although "data contract" might be a new term, the information it holds is not unique to engineering teams. When you think about it, contracts were here long before the #datamesh: service patterns like microservice APIs (OpenAPI/Swagger) and messaging schemas (e.g., Protocol Buffers) are all instances of contracts.
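As a rough illustration of the idea, a minimal data contract can be expressed as a typed schema plus ownership and versioning metadata. The sketch below is hypothetical; the product name, fields, and structure are illustrative assumptions, not Bizzabo's actual template:

```python
from dataclasses import dataclass

# Hypothetical data contract sketch. The product, domain, owner, and
# field names below are illustrative, not a real Bizzabo contract.

@dataclass(frozen=True)
class Field:
    name: str
    dtype: str          # e.g. "STRING", "TIMESTAMP"
    description: str
    nullable: bool = True

@dataclass(frozen=True)
class DataContract:
    product: str        # data product name consumers subscribe to
    domain: str         # owning domain team
    owner: str          # accountable contact for questions and issues
    version: str        # versioned so consumers can evolve safely
    fields: tuple       # the external, observable shape of the product

    def field_names(self):
        return [f.name for f in self.fields]

events_contract = DataContract(
    product="event_registrations",
    domain="events",
    owner="data-products@example.com",
    version="1.0.0",
    fields=(
        Field("registration_id", "STRING", "Unique registration key", nullable=False),
        Field("event_id", "STRING", "Parent event identifier", nullable=False),
        Field("registered_at", "TIMESTAMP", "Registration time (UTC)"),
    ),
)

print(events_contract.field_names())
```

The point is that the contract communicates business semantics (descriptions, ownership, nullability) to potential consumers, not just column types.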

What Data Should Go Into The First Data Product

While deciding on the first data product, we knew we had two choices, both of which required creating a data product. One was to design a data product for a new request the domain had never served before; the other was to take a known piece of data and redesign it under the data mesh principles. We chose the second option.

The domain we chose manages its operational and analytical data outside our main systems; in fact, we treat it as an application connected to our architecture through an API. That was one of the reasons we chose this domain first: it already acts with real autonomy and can understand and operate under the decentralized guidelines that are part of the four data mesh principles.

Following our choice of data product, the next step was to study the domain's current data and analytics embedded inside our platform, understand the data insights the domain works with, and trace them back to the operational data that feeds them.

We asked the team to share the objects at the heart of their analytics. We then went back to the process we had created and asked the product data analyst to take the existing data structure and design a data product that would cover all of our customers' data needs.

The work seemed pretty straightforward: the product data analyst gathered several important entities into one prominent flat structure while keeping high quality standards from completeness and uniqueness perspectives. Interestingly, when we discussed this new data product with the domain team, we discovered a new requirement they were dealing with: a need for time-series graphs exposed to our customers in a real-time dashboard visualization. We recognized a real need in front of us that we could fold into our new process. At that point, the domain had all the information it needed to design data products that could be created in a decentralized methodology and become our first Google BigQuery data products, with two or more new data contracts available.
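The quality bar mentioned above (completeness and uniqueness) can be sketched as simple checks over a flat structure. This is a minimal, hypothetical sketch; the column names and rows are illustrative, and a production setup would run such checks inside the warehouse:

```python
# Hypothetical quality checks for a flat data product structure.
# Column names and sample rows are illustrative assumptions.

def completeness(rows, column):
    """Fraction of rows where `column` is present and non-null."""
    if not rows:
        return 1.0
    filled = sum(1 for r in rows if r.get(column) is not None)
    return filled / len(rows)

def is_unique(rows, column):
    """True if every non-null value of `column` appears exactly once."""
    values = [r.get(column) for r in rows if r.get(column) is not None]
    return len(values) == len(set(values))

rows = [
    {"registration_id": "r1", "event_id": "e1"},
    {"registration_id": "r2", "event_id": "e1"},
    {"registration_id": "r3", "event_id": None},  # incomplete row
]

print(completeness(rows, "event_id"))      # 2 of 3 rows filled
print(is_unique(rows, "registration_id"))  # no duplicate keys
```

Publishing such metrics alongside the contract lets consumers judge whether the product meets the quality standards it promises.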

Communicating, Communicating, Communicating

While selecting domains and developing data product schemas, we established an organizational communication plan. To align everyone (who's everyone? Keep on reading) on the new approach, we created an educational presentation tailored to the different product organization groups: product, engineering, and analysts.

A decentralized approach to data management requires organizational effort. The change is surgical and needs to rest on solid infrastructure; it touches many stakeholders, like replacing an instrument's core strings. On top of the communication decks, we created a shareable data mesh framework document: a detailed explanation of the rationale, new terms like data product and data contract, and the specified process steps, indicating each role and its responsibilities.

We then started to engage the organization through cascading knowledge sharing, understanding there had to be buy-in from product, engineering, and data analysis for the change. All groups got a high-level explanation of what they should expect.

In parallel with the internal communication, we established a federated team made up of the leading drivers of this change. The team is responsible for establishing new processes and tools and spreading the word across the organization.

We started to create buzz around #datamesh by presenting the methodology to each and every group, according to their role and responsibilities in the company and the process. We knew our main challenge would be the mindset shift that is so hard to achieve.

What is very important to remember is that change takes time. Be prepared to have some challenging discussions and make some tough decisions. Domain boundaries can sometimes be blurry, and there is likely to be discussion around which data are best provisioned from where. Similarly, domain boundaries will almost certainly not match team responsibilities or organizational structures, nor will they necessarily map neatly onto systems or existing APIs.

At this point, discussions will gravitate toward hybrid architectures or other compromises that preserve some of the existing data platforms. There is no easy answer, but everyone must fully engage in the data mesh initiative to arbitrate critical decisions.

Want to learn more about our journey towards #datamesh? Stay tuned for our next post.
