Collaboration over Conformance

Steve Jones
Collaborative Data Ecosystems
4 min read · Dec 1, 2021

The two worst hangovers from Data Warehousing are “only one copy” and “we have a magic schema, thou shalt comply”. In a world of collaborative data ecosystems the former is damaging, but the latter could potentially be fatal for a business. McKinsey have said they think these ecosystems will be 30% of global revenue by 2025, and Capgemini have found that companies that collaborate externally on data are already outcompeting those that don’t.

Capgemini Research findings on the impact of data collaboration

When we look at the principles of business control for data, data culture, and the importance of data latency, we see that “conformance” is not an approach that can possibly work in today’s collaborative world. Does this mean there is no need to agree on things? Of course not, but the key question is where and how you agree.

So I’m going to use how SAP does things as part of the way to explain it. SAP have built a schema for their ERP. It’s a big schema, with attributes for everything. They then replicate that into their analytical environment, which has its own schema, and that is pretty big too: hundreds of tables and tens of thousands of attributes, all of which SAP have carefully named, carefully defined, and given standard processing and calculations. A ‘traditional’ Enterprise data management approach might insist, and I’ve been there when someone suggested it, that all of these fields should have an equivalent in the corporate data model. This is insanity, a total waste of effort. If I’m taking all that data to support machine learning, then the data science teams will be doing their own conformance, and the majority of the business couldn’t care less about the 100th attribute on a table. The only people who might are the people who are operationally using it.

So where do we conform?

We conform where we collaborate.

Who conforms?

The business that controls the data.

The point here is that there are two levels of conformance: firstly the conformance that we do within an area operationally, and secondly the conformance for collaboration. What this model also says is that, beyond elements like Master Data cross-references, which we need to link different data sets, a business area sets its own naming conventions and standardization format.

Hierarchy of Data Conformance (MR = Management Reporting)

Areas of the business need to state the requirements for what they need for collaboration, and may even need to provide the funding for that conformance if they get the benefits. These are internal dollars, but it is still important to ensure that the business areas that need to invest the most in conformance and quality aren’t penalized.

“Oh no,” I hear old-school people scream, “what about our naming conventions?” Sure, have recommendations, have coaches, have KPIs that reward things being sensible. But what ability do you have to enforce those rules on external companies? How are you enforcing those rules on Social Media data today? Facing the reality of collaboration prepares the organization for the challenge of the future, where central teams and massive spreadsheets move from “not working for the business” to “an existential threat to the company”.

This doesn’t mean, however, that a team gets to define an internal view and call it collaboration. They need to think of their data in the context of their consumers. In other words, while the operational view is internal to finance, the collaborative view needs to be designed for people to consume. This isn’t a new concept in business: the current process models of an organization are designed to interact, so the sales team can send pieces to the shipping team. Interfaces should be designed for consumers. This means the data product owners and the data product designers who operate at the collaborative level need to think about what will make their data used. Whether internal or external, this mindset of separating operational views, which are for “me”, from collaborative views, which are for “you”, is crucial in ensuring collaborative working.
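As a rough illustration of the “me” versus “you” split, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `LedgerEntry` and `CustomerBalance` types, the SAP-style abbreviated field names, and the `to_collaborative` function are hypothetical, not anything taken from a real system. The operational record keeps finance’s own naming, and the collaborative view is a separate shape designed for whoever consumes it.

```python
from dataclasses import dataclass


@dataclass
class LedgerEntry:
    """Operational view (hypothetical): finance's own naming, for finance's own use."""
    bukrs: str             # company code
    kunnr: str             # customer number
    dmbtr: float           # amount in local currency
    zz_internal_flag: str  # only meaningful inside finance


@dataclass
class CustomerBalance:
    """Collaborative view (hypothetical): named and shaped for consumers."""
    customer_id: str
    company: str
    outstanding_amount: float


def to_collaborative(entries):
    """Publish a consumer-facing view, leaving internal-only fields behind."""
    totals = {}
    for entry in entries:
        key = (entry.kunnr, entry.bukrs)
        totals[key] = totals.get(key, 0.0) + entry.dmbtr
    return [
        CustomerBalance(customer, company, amount)
        for (customer, company), amount in sorted(totals.items())
    ]


entries = [
    LedgerEntry("1000", "C1", 100.0, "x"),
    LedgerEntry("1000", "C1", 50.0, "y"),
    LedgerEntry("2000", "C2", 10.0, "z"),
]
balances = to_collaborative(entries)
```

The operational type can change as finance sees fit. Only the collaborative type is a commitment to other people.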

In the same way that API designers have had to think about consumers, but really importantly haven’t had to think about some global ‘compliance’ to a mega-schema, data designers will have to do the same. The only thing we need to agree on is the master data cross-reference, because it’s that which really enables collaboration.
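To make the cross-reference idea concrete, here is a small Python sketch under stated assumptions. The two data sets, their field names, the `customer_xref` mapping, and the `link` helper are all invented for illustration. Each area keeps its own keys and naming conventions, and the only shared agreement is the mapping from local keys to a master customer ID.

```python
# Hypothetical data: each business area keeps its own keys and naming conventions.
finance_invoices = [
    {"cust_no": "F-001", "invoice_total": 1200.0},
    {"cust_no": "F-002", "invoice_total": 300.0},
]
sales_orders = [
    {"accountId": "S-9", "orderValue": 1500.0},
    {"accountId": "S-7", "orderValue": 250.0},
]

# The one thing both areas agree on: a master data cross-reference
# mapping (area, local key) to a shared master customer ID.
customer_xref = {
    ("finance", "F-001"): "CUST-1",
    ("finance", "F-002"): "CUST-2",
    ("sales", "S-9"): "CUST-1",
    ("sales", "S-7"): "CUST-3",
}


def link(area, records, key_field):
    """Group an area's records by master ID, leaving its field names untouched."""
    linked = {}
    for record in records:
        master_id = customer_xref.get((area, record[key_field]))
        if master_id is not None:
            linked.setdefault(master_id, []).append(record)
    return linked


finance_by_master = link("finance", finance_invoices, "cust_no")
sales_by_master = link("sales", sales_orders, "accountId")

# Customers that both areas know about, found without renaming a single field.
shared = sorted(set(finance_by_master) & set(sales_by_master))
```

Neither area had to adopt the other’s schema. The join works because both sides agreed on the cross-reference and nothing else.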


My job is to make exciting technology dull, because dull means it works. All opinions my own.