The best way to explain data governance to beginners

Willem Koenders
ZS Associates
7 min read · Mar 14, 2023

Data management and data governance can be hard to explain to a novice audience. They cover complex data capability areas such as metadata management, data quality, data architecture, data cataloguing, data privacy, data science and data integration. I’ve found myself struggling to quickly and definitively explain the underlying core concepts, both while advising clients and with new members on my own teams.

Over time, I found that an analogy has the highest chance of success. Data management can be compared to real estate management, as both require effectively organizing, maintaining, and utilizing valuable assets. The analogy helps not only to understand the underlying components, but also to imagine how they all operate together.

Let’s walk through each of these comparisons:

  • Data Asset: At the heart of the analogy lies the data asset, which corresponds with the building or property in real estate management. The data asset can also be perceived as a data product or a dataset. Both data and real estate management revolve around managing assets that generate value when governed and nurtured adequately, but that lead to risks and losses when mismanaged.
  • Data (Product) Ownership: A critical concept in data management is ownership — responsibilities may be delegated to others, but at the end of the day, one person or team should be the owner of the data. The same is true for a building, where this would be the property owner or landlord.
  • Data Steward: Data stewardship involves assigning responsibility for the management of data assets to specific individuals or teams, for example, to ensure that data is of sufficient quality. In real estate management, data stewardship can be compared to the role of property managers who are responsible for the upkeep and maintenance of a property.
  • Data Consumers / Users: Various individuals and business processes may consume the data, internal and external to the organization. This can be compared to the tenants that use the building for their respective purposes.
  • Data Monetization: Data monetization involves leveraging data assets to generate revenue, for example, by selling data to other organizations. In real estate management, this would be equivalent to finding ways to generate income from a property, such as renting space out to tenants or for an event, selling advertising space, or selling it altogether.
  • Data Contract: A data contract is a formal agreement between a data producer and a data consumer, confirming what data is to be exchanged and the corresponding formatting and quality requirements. This can be compared to a lease agreement, which describes what is expected of the landlord and in what state the property will be made available. The lease also outlines what the property can be used for (and specifically, what cannot be done to or with it), and the data contract can be used for similar purposes.
  • Value Quantification: In both cases, it is a worthwhile exercise to estimate the value associated with the asset. Just as the value of a property depends on its location, size and condition, the value of data depends on its relevance, accuracy and accessibility.
  • Data Security and Access Controls: Data security refers to the protection of data assets from unauthorized access, use or disclosure. In real estate management, data security can be compared to the use of locks, alarms and security systems to protect a property from theft or vandalism.
  • Data Architecture: This can be compared to the blueprint of a property, which defines the layout, design and construction of the building. Similarly, data architecture involves the design and structure of data storage and retrieval systems. Architecture standards can provide guidelines and best practices for how buildings are constructed, and data architecture standards do the same for data assets.
  • Data Domains: Just as a city is divided into neighborhoods, data can be divided into domains based on its subject matter. Any property belongs to a single neighborhood, and together, all neighborhoods include all properties. The same holds for data assets and domains. Each neighborhood has its own characteristics, such as demographics and property values, and similarly, each data domain has its own attributes and requirements. An organization like a Homeowners Association (equivalent to data domain owners or stewards) can be chartered to ensure that these requirements are implemented.
  • Data Policies & Standards and Regulatory Compliance: This can be compared to the different regulations that govern the use and development of properties, such as zoning laws, environmental regulations, and building and fire codes. Similarly, data policies and standards define the rules for managing data in an organization, which are derived from applicable regulations such as those related to data privacy and data protection.
  • Metadata Management: Metadata is data about the data. It can describe the data asset in terms of the data attributes it contains, who owns it, who has access, who accessed it and when, its location, how many records there are, and the size of the total asset. It can be compared to detailed information about a property and its features, for example, the total square and cubic footage, the owner, the number of rooms, its location and who has keys to the building.
  • Data Quality: Data quality refers to the fitness-for-purpose of data as measured along dimensions like accuracy, completeness and consistency. In real estate management, data quality can be compared to the condition and upkeep of a property, such as whether it has any defects or safety hazards.
  • Data Remediation: Data remediation refers to the process of identifying and correcting data quality issues. In real estate management, data remediation can be compared to the process of identifying and correcting property defects, such as a leaky roof or a faulty foundation, to maintain the value and safety of a property.
  • Data Usage: This can be compared to the measurement of the usage of properties, which helps in determining their potential value. This includes occupancy rates but perhaps even more detailed logs of who entered the building, when, and for how long. Similarly, data usage measurement involves tracking and measuring how and by whom data is used in an organization, and to what extent data assets are adopted.
  • Interoperability: This can be compared to the compatibility of a property with other properties and (upstream or downstream) systems, and its ability to share common infrastructure or resources. For example, a building is connected to the electrical grid, water network and sewage system, where each of these connections comes with precisely defined standards for voltage, water pressure, pipe sizes and sewage. In a similar sense, data interoperability refers to the ability of the asset to exchange data and work together seamlessly with various other systems and applications, subject to common standards.
  • Data Storage: Data storage can be compared to the physical size and foundational structure of a property. A property might have to be of a certain minimum size, for example to accommodate industrial machines or to house families of a certain size. Similarly, data storage refers to the physical or virtual storage capacity in databases, data warehouses or data lakes.
  • Data Lifecycle: This can be compared to the life cycle of a property, which involves various stages such as construction, maintenance, renovation, and demolition. Similarly, data lifecycle management involves managing data through various stages such as creation, storage, usage, archiving and disposal.
  • Data Integration: Different properties and neighborhoods are connected by roads and transportation systems. A particular building may provide easy access to public transport and a nearby highway. Data integration involves connecting data from different domains and sources, which can involve tasks such as data cleansing, data mapping and data transformation to ensure that data from different systems can be used together. Without integration, you can't access or use the data, the same way you would not be able to reach or make use of a building that no road leads to.
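
For readers who prefer to see an idea in code, a few of the concepts above (ownership, the data contract and data quality) can be sketched together in a handful of lines. This is only an illustration under my own assumptions, not a standard implementation: the names `DataContract`, `check_contract`, `required_fields` and `min_completeness`, as well as the sample records, are hypothetical. The contract plays the role of the lease, the owner is the landlord, and the check is the property inspection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataContract:
    """Hypothetical contract: the 'lease' between data producer and consumer."""
    owner: str               # accountable data owner (the "landlord")
    required_fields: dict    # field name -> expected Python type
    min_completeness: float  # minimum share of records that fully meet the contract


def check_contract(records, contract):
    """Return a list of violations; an empty list means the contract is met."""
    violations = []
    complete = 0
    for i, record in enumerate(records):
        record_ok = True
        for name, expected_type in contract.required_fields.items():
            value = record.get(name)
            if value is None:
                # Missing value: counts against completeness.
                record_ok = False
            elif not isinstance(value, expected_type):
                # Wrong format: reported per record.
                violations.append(
                    f"record {i}: {name} should be {expected_type.__name__}"
                )
                record_ok = False
        if record_ok:
            complete += 1
    completeness = complete / len(records) if records else 0.0
    if completeness < contract.min_completeness:
        violations.append(
            f"completeness {completeness:.2f} is below "
            f"{contract.min_completeness:.2f}"
        )
    return violations


# Illustrative data only.
contract = DataContract(
    owner="customer-data-team",
    required_fields={"customer_id": int, "email": str},
    min_completeness=0.9,
)
records = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": None},  # missing email
]
violations = check_contract(records, contract)
print(violations)
```

Here one of the two records lacks an email, so the check reports that completeness falls short of the agreed threshold. Fixing those records would be the data remediation step, comparable to repairing the leaky roof.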

The real estate analogy provides a helpful way to understand the various aspects of data management and how they work together to support the organization’s overall data strategy.

Let me know what you think, and whether you have used any other analogies.


Willem Koenders
ZS Associates

Global leader in data strategy with ~12 years of experience advising leading organizations on how to leverage data to build and sustain a competitive advantage