Brainly’s Data Mesh Journey

Pedro Mir
Brainly Technology Blog
Feb 7, 2023


Building a Data Mesh on Snowflake

At Brainly, we have begun an exciting journey to implement a Data Mesh. This architecture is based on the principles of breaking down monolithic data systems into smaller, independently managed, and governed data services. It promises to bring many benefits to our data management operations, including allowing teams to have control over the data in their specific domain, which enables faster and more effective decision-making regarding data quality and usage.

It is a cultural change for the company, not just a technological one, primarily because of the concept of ownership that Data Mesh promotes.

In this text, I will not discuss the characteristics of Data Mesh; instead, I prefer to focus on the specific challenges that we are encountering in its implementation. If you want to know more about the features of Data Mesh, I recommend consulting the original whitepaper by Zhamak Dehghani.

In any case, let me summarize in one sentence how we interpret Data Mesh, followed by its four principles.

A data mesh is a technical and cultural approach to building a decentralized architecture that organizes data by business domain and gives more ownership to the data producers.

  1. Domain-oriented decentralized data ownership.
  2. Data as a Product.
  3. Self-Service Platform.
  4. Federated computing costs and governance.
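To make the first two principles a little more concrete, here is a minimal Python sketch of how a data product might be described together with the domain that owns it. The class name `DataProduct`, its fields, and the example values are all hypothetical and chosen only for illustration; they do not reflect our actual setup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical descriptor for a data product owned by one domain."""

    domain: str       # business domain that produces the data
    name: str         # name of the data product within that domain
    owner_team: str   # team accountable for quality and usage
    description: str  # business-level explanation for consumers

    def qualified_name(self) -> str:
        # Namespacing by domain keeps ownership visible to every consumer.
        return f"{self.domain}.{self.name}"


# Illustrative values only.
product = DataProduct(
    domain="subscriptions",
    name="active_subscribers",
    owner_team="subscriptions-analytics",
    description="One row per user with an active paid plan.",
)
print(product.qualified_name())
```

The point of the sketch is that the owner and the business description travel with the data product itself, rather than living in a central team's documentation.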

Now let’s talk about the difficulties we have faced while implementing a Data Mesh.

One of the main challenges we encountered while implementing a Data Mesh on Snowflake was raising awareness about the importance of data knowledge in specific domains. A Data Mesh requires a deep understanding of the data, not only from a technical perspective but also from a business perspective. Without this understanding, data products will not have sufficient quality, and teams will not be able to fully utilize the autonomy provided by the Data Mesh architecture.

Another challenge has been the change in the development process. With a Data Mesh, teams are responsible for their own data services and must develop and maintain them independently. This requires a shift in the development process, from a centralized and monolithic approach to a decentralized and microservices approach.

Different domains may mature at different speeds, depending on the expertise of the people in each domain and their business needs. This means that some teams may require more guidance and support than others in order to fully adopt the Data Mesh architecture, so the approach has to be flexible enough to adapt to the specific needs of each domain.

A single platform across all the domains (One platform to rule them all)

A crucial aspect to consider when implementing a Data Mesh is the importance of using a single platform across all domains. In our case, we chose Snowflake. Having a single platform for all domains can help to avoid the creation of silos of information, which can make it more difficult to share data. It’s important that all teams within the organization use the same platform in order to create a more cohesive and integrated data architecture.

When each domain uses different tools and technologies, it can create confusion and inefficiencies. Teams may have different levels of expertise in the different tools they are using, and it can be difficult to share data and knowledge across domains. Additionally, it can create problems when trying to integrate data from different sources or perform cross-domain analysis.

Having a single platform for all domains also makes it easier to implement data governance and security measures. With a single platform, it’s easier to ensure that data is properly managed and controlled, and that data security is maintained across all domains.

Furthermore, using the same platform for all domains can help to create a consistent and standardized data architecture. This can make it easier to create data products that are consistent in terms of quality and functionality, and that can be easily integrated with other data products.
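As an illustration of what a standardized layout on a single platform could look like, the sketch below generates the same Snowflake statements for every domain: a database, an owner role, and a grant linking the two. The naming convention (`<DOMAIN>_DB`, `<DOMAIN>_OWNER`) and the domain names are assumptions made for this example, not our actual configuration.

```python
def domain_setup_statements(domain: str) -> list[str]:
    """Return Snowflake SQL that provisions one domain in a uniform way.

    The <DOMAIN>_DB / <DOMAIN>_OWNER naming convention is hypothetical.
    """
    # Reject anything that is not a plain identifier before building SQL.
    if not domain.isidentifier():
        raise ValueError(f"invalid domain name: {domain!r}")
    prefix = domain.upper()
    database = f"{prefix}_DB"
    owner_role = f"{prefix}_OWNER"
    return [
        f"CREATE DATABASE IF NOT EXISTS {database};",
        f"CREATE ROLE IF NOT EXISTS {owner_role};",
        f"GRANT ALL PRIVILEGES ON DATABASE {database} TO ROLE {owner_role};",
    ]


# Example domains, chosen only for illustration.
for domain in ["marketing", "subscriptions"]:
    for statement in domain_setup_statements(domain):
        print(statement)
```

Because every domain is provisioned by the same function, the resulting databases and roles follow one pattern, which is what makes cross-domain sharing and access control easier to reason about.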

Some other benefits of using Snowflake for a Data Mesh:

  • Snowflake’s scalability allows for an increased number of data services to be supported, without having to worry about resource allocation.
  • Snowflake’s flexibility allows teams to integrate easily with data analysis and BI tools, which can lead to the creation of high-quality data products.
  • Snowflake’s ease of use can make it easier for teams to develop and maintain their own data services independently, which is in line with the principles of a Data Mesh.
  • Snowflake’s ability to handle a large amount of data can allow teams to process and analyze large datasets, making it a good option for data-intensive domains.

Governance (Building a Data Mesh, not a Data Mess)

Another important aspect to consider when implementing a Data Mesh is data governance. Without proper data governance, it is easy to end up with a “Data Mess” rather than a Data Mesh. In our case, we use Atlan for data governance; whichever tool you choose, it is important to have some kind of support in this area so that data is properly managed and controlled. Don’t miss out on Brainly’s in-depth article on data governance.
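One way to picture the kind of rule a governance process enforces is a completeness check on data product metadata. The sketch below is plain Python and does not use Atlan's API; the required fields and the sample catalog entries are hypothetical.

```python
REQUIRED_FIELDS = ("owner", "description", "domain")  # assumed policy


def missing_metadata(entry: dict) -> list[str]:
    """Return the required fields that are absent or empty in an entry."""
    return [field for field in REQUIRED_FIELDS if not entry.get(field)]


# Made-up catalog entries; the second one is deliberately incomplete.
catalog = [
    {
        "name": "active_subscribers",
        "owner": "subscriptions-analytics",
        "description": "Users with an active paid plan.",
        "domain": "subscriptions",
    },
    {"name": "campaign_clicks", "owner": "", "domain": "marketing"},
]

for entry in catalog:
    gaps = missing_metadata(entry)
    status = "ok" if not gaps else "missing " + ", ".join(gaps)
    print(f"{entry['name']}: {status}")
```

A check like this can run for every domain with the same rules, which is the federated part: the policy is shared, while fixing the gaps remains the responsibility of the owning team.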

In summary, implementing a Data Mesh on top of Snowflake is a challenge, but it also represents an opportunity to improve the quality of data products and data knowledge within the organization. By providing autonomy and ownership to different teams, we can create a more efficient and effective way of working with data. However, it is important to be aware of the challenges that come with implementing a Data Mesh, such as raising awareness of the importance of data knowledge in specific domains and the need for proper data governance. Building a Data Mesh on Snowflake also provides several benefits: scalability, flexibility, ease of use, and the ability to handle large amounts of data. By using Snowflake as the base technology, we can create a more robust and efficient data architecture that can support the growth of the company.


I'm a Business Intelligence and Data Analytics enthusiast, with a focus on data architectures.