Accountability first in data management solutions

Yael Rafalovich
Published in Wix Engineering, Jul 31, 2022

In the past, data management solutions were tied directly to the tech stack of your choosing. Each data warehouse or data lake had its own management solutions, best suited to it: for example, Active Directory for permissions on MS platforms, or Google Groups for Google. Those were commonly managed by the DBA or the IT department.

The new data world

Today it seems every big data company chooses to diversify its data stack, using multiple storage solutions and enjoying the different benefits each one brings to the table. At Wix.com we took it to the extreme with S3 (AWS), on-prem MS SQL, GDS, Snowflake… But this diversity comes at a cost.

First, we needed to find computing solutions to fit this whole stack. We started with Presto, which became Trino, and now the data infra supports Spark as well (running over EMR). To top it all off, the Wix data organization adopted the data mesh approach, making it easy for different data teams to use all this wealth.

Then we needed to find an orchestrator for batch processing to manage all our data pipelines. Eventually we chose Airflow, but that is a story for another blog post.

High-level data stack: nowadays, big data companies tend to have multiple solutions for each component

The data management era

The need for cross-domain, technical data governance solutions soon came onto the scene. But it seemed the data management world lagged behind.

Over the last few years, you could start to feel the winds of change. More and more solutions started to pop up, from access control tools to data quality and data discovery solutions. The field seems to be on steroids, with new solutions popping up like mushrooms after the rain.

It is hard to keep up with all the new solutions and approaches out there.

So where should you start?

In general, it differs from business to business. Ultimately, any data platform must meet data security and legal requirements. If data is your business (adtech, marketing solutions, social…), quality and reliability should perhaps be your north star.

But what happens when you find a legal, security, or quality breach? Who would handle it, and how? Especially in a data mesh ecosystem?

Accountability first

All data platform teams have limited capacity. You can’t implement everything at once.

At Wix we had alerts on our data pipelines before we had all the data management mumbo jumbo in mind. We discovered that we had no idea who should handle those issues. We also found that our documentation on how to solve common issues was constantly out of date. The result was slow incident response and poor decision making.

Later on, we started working on the semantic layer for our reporting tools and on data automation processes (automating standard, repetitive ETLs). Those brought new entities to life that needed to be accounted for. Data consumers had questions but didn’t know whom to ask. We started to lose the trust of our internal users.

Then came a cost reduction effort alongside the cost attribution project, and the need for a clear owner of each data asset appeared again. Accountability needs kept popping up on a daily basis.

That is our data management ecosystem in its early days. We still have a lot to accomplish: a data catalog to hold all our data assets, clear data lineage, a representation of quality scores, a data health observatory, and much more. New ways to look at data quality and management emerge all the time. But we understood that implementing those solutions would be pointless if we didn’t build an accountability model into them.

So we started with what we had, defining ownership of our existing platform and tools.
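To make the idea concrete, here is a minimal sketch of what attaching ownership to existing assets could look like. It is not Wix's implementation, which this post does not describe. The class names, the dotted asset naming scheme, the fallback to a parent namespace, and the contact strings are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Owner:
    team: str
    contact: str  # hypothetical: a chat channel or mailing list


class OwnershipRegistry:
    """Maps data asset names to the team accountable for them."""

    def __init__(self) -> None:
        self._owners: Dict[str, Owner] = {}

    def register(self, asset: str, owner: Owner) -> None:
        self._owners[asset] = owner

    def owner_of(self, asset: str) -> Optional[Owner]:
        # Try the exact name first, then walk up the dotted hierarchy:
        # "sales.orders.daily" -> "sales.orders" -> "sales".
        parts = asset.split(".")
        while parts:
            owner = self._owners.get(".".join(parts))
            if owner is not None:
                return owner
            parts.pop()
        return None

    def unowned(self, assets: List[str]) -> List[str]:
        # Assets with no accountable team, directly or via a parent.
        return [a for a in assets if self.owner_of(a) is None]


def route_alert(registry: OwnershipRegistry, asset: str, message: str) -> str:
    """Format an alert so it is addressed to the accountable team."""
    owner = registry.owner_of(asset)
    if owner is None:
        # Surface the gap instead of silently dropping the alert.
        return f"[UNOWNED] {asset}: {message}"
    return f"[{owner.team} via {owner.contact}] {asset}: {message}"


if __name__ == "__main__":
    registry = OwnershipRegistry()
    registry.register("sales", Owner("sales-data", "#sales-data-alerts"))
    print(route_alert(registry, "sales.orders.daily", "pipeline is late"))
    print(registry.unowned(["sales.orders", "marketing.campaigns"]))
```

In this sketch, a pipeline alert always resolves either to a named team or to an explicit "unowned" marker, so missing ownership shows up as a visible gap rather than an unanswered alert.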

So what is accountability? What are the different approaches we considered? What did we choose in the end?

All that and more in my next blog post:

Data accountability at Wix

Yael Rafalovich
Wix Engineering

Data rules! Without it, you are just another person with an opinion. Leading Data Management efforts at Wix.com