Federated Data Governance does not equal decentralized, distributed, or hybrid.
The federated solar system of Data Governance
We have been talking about federated Data Governance for a while, definitely since Zhamak Dehghani introduced Data Mesh.
Yet, even though federated Data Governance has been pointed out as an integral part of making Data Mesh work, of understanding business problems better, and of guiding decisions about data handling close to the data itself, I still feel it is misunderstood.
There are surely several reasons for this, but let me start with the following observation, which will not come as a surprise to anyone who knows me:
Federated can mean different things to different people.
1. Federated means different things in different fields
My personal reference point for federated is tightly linked to my political science background, which provides a different perspective than, e.g., a Machine Learning one.
What I mean when I say federated is the following:
Federation constitutes a vertical division of powers, establishing at least one additional level of non-sovereign or partly sovereign government. Sovereignty is the authority and power of a state to govern itself or another entity without external interference.
There are two main types of sovereignty:
- Internal Sovereignty: The power of a state to rule over its own territory and population.
- External Sovereignty: The recognition of a state’s independence and autonomy by other states.
The guiding principle adopted by a number of federated states and systems to ensure effectiveness and efficiency in regulating within the federation is the principle of Subsidiarity — the idea that decisions should be made at the lowest level possible while still maintaining alignment with overarching objectives.
Now, what does that have to do with Data Governance?
Data Governance is very much the arena for establishing the structures for managing data: policies, roles, and responsibilities, but also control and authority.
“Data Governance is the exercise of authority, control and shared decision-making (planning, monitoring and enforcement) over the management of data assets.” DAMA DMBoK
So in many ways Data Governance is quite comparable to political entities, and it therefore feels natural to take inspiration from political systems theory. Yet when we talk about operating systems, we are talking about execution and administration, and thereby Data Management.
2. Use of “synonyms” for federated
Now, this is something that I have encountered often, and I am guilty of doing the same: we use related words as “synonyms” for federated.
Here are the three typical candidates that can lead to confusion: decentralized, distributed, and hybrid.
Decentralized
I wrote about this in an earlier article: https://medium.com/@winfried.etzel/data-governance-in-a-distributed-landscape-embrace-the-domain-data-teams-fe0bea02f14a
The main difference is that federated teams have central dependencies, a covenant that manifests the split of responsibilities, but also autonomy within their field of expertise.
Decentralized teams have full responsibility AND accountability for their work. Yet this can be centralized again when needed, which can create a constant push-pull situation, with difficult communication that makes the model hard to sustain over time.
Distributed
The term distributed can have different meanings in different fields (see 1), yet the main sources of inspiration for “distributed” in data are Cloud computing and Blockchain technology, which point towards distributing processing loads across a network of nodes.
When talking about distributed in data, it is often defined as a logical step away from central bottlenecks. Distributed teams in that sense are self-organized network-like structures. This can be a great implementation pattern for Data Management in highly accountable settings, where autonomy can be prioritized over alignment.
Yet this can easily lead to duplicated ownership, difficult coordination, and lack of consensus.
Hybrid
A hybrid model intends to balance central accountability with decentralized flexibility, and can be quite similar to a federated model.
The clear accountability of a centralized structure provides formality and structure, including for communication within the model. At the same time, decision-making can be pushed towards working groups and entities that work towards consensus building rather than enforcement from the center.
This might be the term that is most often used synonymously with federated, because both sit “in-between” centralized and decentralized approaches, yet their origins are different. The main difference from a federated approach, as a working model, is the connectivity and formality between bodies.
3. Use of federated as a term itself
The term federated itself can carry different meanings for different people, depending on their context and perspective.
Do you consider federated as a term that indicates the existence of a higher body of authority that can overrule your decision?
Or do you see federation as a way to take ownership and gain a certain autonomy to make decisions at your level of authority?
Depending on your context, both statements can carry negative or positive connotations. Failing to account for this in your communication when evaluating or implementing a federated model can easily lead to resistance.
The stars align
From my perspective on federated Data Governance, I have previously proposed the idea of a solar system approach:
Data Governance is a natural gravitational field that binds domain-driven decision making to the technical platform capabilities on the one side (here is your connection to computational and Data Contracts) and policy design for elevated, global decisions including oversight on the other side.
To ensure that this model can work, you need to find a way to create that gravitational pull from a Central Data Team. There needs to be a natural interest from the Domain Data Teams to rotate towards the center, while being able to maintain their orbit. A domain agnostic platform can create that pull, but also a mutual understanding of shared interests in alignment.
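For readers who think in code, the split between the center and the orbits can be sketched roughly as follows. This is only an illustration under my own assumptions, not a reference implementation: the class names, the resolve function and the policy topics (pii_classification, refresh_schedule) are all hypothetical. The idea is that a topic covered by a global policy is decided centrally, and everything else stays with the Domain Data Team.

```python
from dataclasses import dataclass, field


@dataclass
class CentralDataTeam:
    # Global policies: elevated decisions that apply to every domain.
    global_policies: dict = field(default_factory=dict)


@dataclass
class DomainDataTeam:
    name: str
    # Local policies: decisions the domain makes within its own orbit.
    local_policies: dict = field(default_factory=dict)


def resolve(topic: str, domain: DomainDataTeam, central: CentralDataTeam) -> tuple:
    """Return (deciding_level, policy) for a governance topic."""
    # Topics covered by a global policy are decided centrally,
    # even if the domain has its own opinion on them.
    if topic in central.global_policies:
        return ("central", central.global_policies[topic])
    # Everything else is left to the domain (subsidiarity).
    # If the domain has not decided yet, the policy is None.
    return ("domain", domain.local_policies.get(topic))


# Hypothetical example data, for illustration only.
central = CentralDataTeam(global_policies={"pii_classification": "mandatory"})
sales = DomainDataTeam(
    name="sales",
    local_policies={"refresh_schedule": "daily", "pii_classification": "optional"},
)

print(resolve("pii_classification", sales, central))  # ('central', 'mandatory')
print(resolve("refresh_schedule", sales, central))    # ('domain', 'daily')
```

Note that the sketch only captures the division of decision rights. The gravitational pull itself, the shared interest that makes domains want to align, is an organizational matter that code like this cannot express.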
Conclusion
Federated data governance involves a balance of shared authority between central oversight and localized decision-making, much like in political systems. While related terms such as decentralized, distributed, and hybrid are often used interchangeably, they each imply distinct structures and roles in managing data. Understanding the nuances of and perspectives on federation is important for implementing effective data governance. Remember that each organization and each individual can attach a different meaning to a term like federated.
Maybe we can view our Data Governance organization in a federated model as a structured way of creating a gravitational pull towards alignment, whilst providing autonomy for Domain Data Teams within their orbit.