The Data Mesh Strategy Behind Intuit’s Global Financial Technology Platform

Tristan Baker
Intuit Engineering
Apr 22, 2024

Over the past several years, Intuit has built foundational capabilities for collecting, processing and transforming raw data into a connected mesh of high quality data. Those capabilities are enabling our technologists to build personalized experiences, with speed and at scale, to deliver on our mission to power prosperity around the world for approximately 100 million consumer and small business customers on our global financial technology platform with TurboTax, Credit Karma, QuickBooks and Mailchimp.

In previous posts, I’ve shared the motivation behind Intuit’s Data Mesh Strategy and provided an in-depth look at the Data Mesh Concepts behind the experiences, services and processes we’ve implemented in a small corner of our vast data estate, which has given us the confidence to scale more broadly across Intuit. For an enterprise company of our sheer size, scope and history (40+ years!), this means managing hundreds of thousands of tables with petabytes of data and harnessing decades of information across various systems, all while continuing to expand our portfolio of products and services.

Given reader interest in this hot industry topic over the past few years, I thought it might be helpful to provide a high level overview of our data mesh journey, both as an introduction for anyone setting out on their own journey and as a precursor to the deeper dives I’ve published on this platform.

The challenge — and opportunity

Back in 2021, we were facing an exponential increase in internal demand for access to high quality, real-time data across the enterprise from a broad spectrum of data workers (service engineers, UX developers, business analysts, data analysts, data scientists, machine learning engineers, etc.).

This led to a variety of challenges that will be familiar to anyone who’s spent any time in a data lake trying to make sense of structured and unstructured data to generate business value:

  • Discovering data. Teams didn’t always understand where they could find data related to the specific problem they were solving, as well as where to find data sourced from a particular product or service.
  • Understanding data. Teams didn’t necessarily know who could give them access to data, how existing data was structured or used in the business, or how data interacted with other concepts or data sets within Intuit.
  • Trusting data. It wasn’t always clear what system produced or used data and how quickly data could be delivered. Data quality wasn’t immediately clear, nor could teams always tell who would help if something broke.
  • Consuming data. It wasn’t always easy to find out who could approve data access for production systems or whether teams would be notified when data structures changed.
  • Publishing data. There was no uniform guidance for describing, hosting and maintaining data systems, so teams were unclear on how they should meet operational and compliance requirements, or whether they were duplicating data or data processes that already existed.

The solution — a distributed enterprise data mesh architecture

A survey of solutions for a company of our size and scope led us to Zhamak Dehghani’s description of a “data mesh.” The idea of shifting from a centralized data lake to a distributed enterprise data mesh architecture immediately caught our attention.

We recognized the potential to “adapt and apply the learnings of the past decade in building distributed architectures at scale, to the domain of data” with ownership and accountability given to individuals and teams. Having seen this approach work well at the “front of the house” with our microservices architecture and core transactional services, we were confident it could be successfully applied to “back of the house” data systems.

With data mesh fundamentals in mind, we set out to empower data “producers” to design, develop, fully describe and actively support their own data-driven systems, and enable data “consumers” to easily discover, understand, trust and consume the data. Altogether, our intention was to power a network effect of data production and consumption by:

  1. Systematically organizing people, code and data. We define internal processes and initial data sets that produce externally consumable data as a data product. To support discoverability, we link those products to the business problem they help solve.
  2. Clearly defining ownership of each data product. We ensure teams understand and are accountable for a defined set of responsibilities and best practices for producing high-quality data. Teams are typically made up of one or more data workers, a product manager, and a leader responsible for the timely and efficient execution and delivery of a business solution.
  3. Providing tools for designing, authoring, deploying and operating data products. We integrate these capabilities into a central, self-serve platform service with built-in adherence to security, authentication, authorization, access control policy management, change control procedures, schema registration and documentation standards. We also integrate quality, cost and performance measurement and reporting into the platform so that individual teams, and the organization at large, can understand how well a particular data product is performing.
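To make the three ideas above a little more concrete, here is a minimal sketch of what a data product descriptor and a self-serve catalog might look like. It is purely illustrative: the `DataProduct` and `Catalog` names, their fields, and the sample values are assumptions made for this example, not Intuit's internal API or schema.

```python
from dataclasses import dataclass


@dataclass
class DataProduct:
    """Hypothetical descriptor for one data product (illustrative only)."""
    name: str
    business_problem: str   # the problem this product helps solve (aids discovery)
    owner_team: str         # team accountable for quality and support
    schema: dict[str, str]  # column name -> type, registered when publishing
    description: str = ""


class Catalog:
    """Toy in-memory registry standing in for a self-serve platform service."""

    def __init__(self) -> None:
        self._products: dict[str, DataProduct] = {}

    def publish(self, product: DataProduct) -> None:
        # Refuse products that lack the metadata consumers need to trust them.
        missing = [f for f in ("business_problem", "owner_team", "description")
                   if not getattr(product, f)]
        if not product.schema:
            missing.append("schema")
        if missing:
            raise ValueError(f"{product.name}: missing {', '.join(missing)}")
        if product.name in self._products:
            raise ValueError(f"{product.name}: already published")
        self._products[product.name] = product

    def find_by_problem(self, keyword: str) -> list[DataProduct]:
        # Case-insensitive substring match on the linked business problem.
        kw = keyword.lower()
        return [p for p in self._products.values()
                if kw in p.business_problem.lower()]


# Usage with made-up sample values.
catalog = Catalog()
catalog.publish(DataProduct(
    name="invoice_events",
    business_problem="Small business cash flow forecasting",
    owner_team="payments-data",
    schema={"invoice_id": "string", "amount": "decimal"},
    description="One row per invoice state change.",
))
matches = catalog.find_by_problem("cash flow")
print([p.name for p in matches])
```

The point of the sketch is the shape, not the details: ownership and the linked business problem live on the product itself, and the platform enforces documentation and schema registration at publish time rather than leaving them to convention.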

To date, we’ve seen a 26 percent productivity improvement from our deployment in a small corner of Intuit’s large data estate, as measured by the time it takes for Intuit teams to discover, access and explore data for a new project when compared to an environment where data mesh has not yet been deployed.

Expanding the data mesh at scale across the enterprise

The results of our initial deployment have given us the confidence to scale our deployment to the entirety of Intuit to deliver massive benefits to our teams, our operations and, ultimately, our customers.

To learn more about how you could apply our learnings to your own data mesh journey, check out my Data Mesh Strategy and Data Mesh Concepts blogs. To see some of the data mesh tools and experiences in action, see this talk by my Intuit colleague, Suresh Raman.

My hope is that you’ll leverage this content in the same way my teams have — as a guide for implementing the experiences, services, and processes necessary to realize the benefits of a data mesh strategy.

I’m proud of how far we’ve come since my October 2023 Data Mesh Concepts post, and excited about what the future holds. There’s even a possibility that we’ll open source Intuit’s internal API definitions and service implementations to develop a standard approach for implementing a data mesh across the industry.

Special thanks to my partners at Intuit who have contributed to the insights and ideas that have shaped Intuit’s data mesh strategy: Shachar Bar David, Stephen Molloy, Yaniv Levi, Daniel Sharvit, Krithika Swaminathan, Achal Kumar, Arun Ragothaman, Narayanan Singaram, Juhi Dhingra, Rui Dai, JD Rosensweig, Adi Ohana, Allison Bellah, Samuel Knapp, Robert Mei, Timur Fayruzov, Larry Raab, Sreekanth Martha, Barry Nisly, Ron Sher, Karen Maciolek, Dunja Panic, Ashish Page, Omar Abdelmagid, Robin Oliva-Kraft, Akbar Rangara, Kaushal Sheth, Guru Prakash Narayana, Ryan Quigley, Raj Chandramohan, Denise McInerney, Aveek Misra, Prashanth Sashadri, Elmer KimNii, Jainik Vora and Saikiran Thunuguntla.

Tristan Baker
Intuit Engineering

Intuit Distinguished Engineer and Chief Architect of Intuit's Data Platform Organization. I ❤ Data.