In Data Collaboration the Names don’t matter

Steve Jones
Collaborative Data Ecosystems
7 min read · Feb 16, 2022

In the world of Data Governance we waste huge amounts of time arguing over names for things, rather than concentrating on identifying the things themselves. We don’t have to agree on the name for a thing to collaborate.

Many moons ago I worked on some multi-country programs. On one occasion the representatives from one country came in and had but one requirement:

“We don’t want to use any of the colours that <Country X> used”

After getting over the initial shock that this sentence had been said out loud, we realised it presented us with a problem, as we’d done a lot of research on the right colours to use. The colours were right. This wasn’t an abstract “it looks nice”; it was more “a huge amount of research into the minimum colour set that conveys the most information, particularly highlighting what was critical”.

Now colours had two components: the first was an RGB value, the second was a name. These were X-Windows colours, which meant we had a nice file that mapped colours to names. So a plan was hatched. We would just rename the colours, and show that we used TOTALLY different names (except for Black and White) than the other country had. We even got feedback on the colours, by showing exactly the same RGB value on different backgrounds and getting them to pick one.

And thus it came to pass that a particular shade of blue was formally dubbed “Avocado Blue”, and at no stage did anyone ever point out that Avocados are green, not blue. Everyone was happy, even referring to the ‘fact’ that Avocado Blue was a much better colour for that context than the silly colour used by the other country.

Picture of an Avocado, it is not blue
Avocados aren’t blue, except when you say they are

So two different implementations used the same RGB value, but referred to it via a different name.
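As a minimal sketch of that idea, assume each country keeps its own name table over RGB values. The RGB triple and the name “Signal Blue” below are invented for illustration; only “Avocado Blue”, Black and White come from the story.

```python
# Hypothetical name tables: each country maps its own names to RGB values.
# The RGB triple for the blue and the name "Signal Blue" are assumptions.
COUNTRY_X_NAMES = {
    "Signal Blue": (30, 90, 200),
    "Black": (0, 0, 0),
    "White": (255, 255, 255),
}
COUNTRY_Y_NAMES = {
    "Avocado Blue": (30, 90, 200),
    "Black": (0, 0, 0),
    "White": (255, 255, 255),
}


def same_colour(name_x: str, name_y: str) -> bool:
    """Two names refer to the same colour when their RGB values match."""
    return COUNTRY_X_NAMES[name_x] == COUNTRY_Y_NAMES[name_y]


def translate_x_to_y(name_x: str) -> str:
    """Find Country Y's name for a Country X colour via the shared RGB value."""
    rgb = COUNTRY_X_NAMES[name_x]
    names_by_rgb = {value: name for name, value in COUNTRY_Y_NAMES.items()}
    return names_by_rgb[rgb]
```

The RGB value is the identifier both sides share; the names are just local labels hung on it.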

The reason I am relating this story is that throughout my career I’ve found occasions where people can agree that the ‘thing’ is a ‘thing’ but disagree on its name. I’ve been in very long and pointless arguments between Sales and Finance teams over whether a “Customer” is someone you could contact, have contacted, have sold to, or who has paid their invoice. Everyone agrees that this is a real thing, but they mean something different in different contexts.

And then even if everyone does ‘agree’ we find that internally within their area they just carry on using the term they want, only translating to the term ‘at the boundaries’. Indeed I’ve seen data warehouses where the “enterprise” model used “prospect” and the sales team created their own view on that where “customer” was a union of prospect and customer.

Another example. The debt collection part of a company was looking for “high value” customers, meaning customers with a large amount of outstanding debt who would also be able to pay if ‘reminded’. So they did the analytics, created the file, and even created a process to update the file in the data lake every day. Then along came marketing, who found the “high value customer” file in the data lake and ran a campaign to those customers, which included extending their credit limit and payment terms. That file is actually valuable for marketing, but it now has a new name, “Marketing.Exclude”, as well as “Collections.HighValue”, and both teams agree that the type of the file is “Customers”. In other words the business domain gives me the purpose, but we’ve agreed on the type.
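One way to picture this, as a rough sketch rather than a description of any real catalogue product: a single dataset entry carries one agreed identifier and one agreed type, while each business domain attaches its own name. The `Dataset` class and the identifier `ds-001` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    dataset_id: str  # agreed identifier (hypothetical value below)
    data_type: str   # agreed type, e.g. "Customers"
    # Domain -> the name that domain uses for this dataset.
    names_by_domain: Dict[str, str] = field(default_factory=dict)

    def qualified_names(self) -> List[str]:
        """Return every 'Domain.Name' label that points at this dataset."""
        return sorted(f"{d}.{n}" for d, n in self.names_by_domain.items())


# The same file, named for a different purpose by each domain.
high_debt_file = Dataset(
    dataset_id="ds-001",
    data_type="Customers",
    names_by_domain={"Collections": "HighValue", "Marketing": "Exclude"},
)
```

Here `high_debt_file.qualified_names()` lists both “Collections.HighValue” and “Marketing.Exclude”, while the identifier and the type stay the same for both teams.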

It is great to agree, but on identification, not everything

The reason I give these examples is that sometimes the need to get people to agree is held up as a barrier to progress. It is super nice to be able to get, at the high level, agreement on these key terms, particularly for master data, and it can indeed save you a whole heap of pain down the road if it’s possible. This doesn’t mean some “golden record”; it’s about agreeing on how you uniquely identify the ‘thing’, and that doesn’t even mean we need the same perspective on what the “thing” is.

Two different names for the same thing, in a single country

Let’s take a conglomerate: one division makes parts for satellites, which it sells to lots of companies, and another division actually makes the satellites. In the first division their “product” is the part, and that product has its own parts. The customers for the first division include the second division and a bunch of other companies. In the second division the product from the first is just one of the many parts that go into its product, and its product is sold to customers. Does this mean that both divisions need to agree on a single meaning of a term, or that from two different perspectives they can actually work together? Well, it clearly needs to be the latter. Which matters more: the name on the ‘thing’, or that the definitions of what it is enable the two divisions to collaborate? Again, clearly the latter.

In a world of ecosystems it’s identification that matters, not the name

As we look to collaborative data ecosystems this identification over naming becomes more important. Don’t sweat it that the car OEM refers to your product as a “part”; don’t worry that you think the driver should be considered a “consumer” and is defined as an OEM. What we’ve got to agree on is how we identify the “things”; what we call them isn’t important. Focus on the identification, not the title. It is perfectly ok to have a different view, within your context, of what something is called. At the transactional level, my Goods Out becomes your Goods In; the shipment between the two, my “Goods Out Shipment”, is what you receive.

The companies and governments involved in an ecosystem will normally have different perspectives and their own, often pretty fixed, views on the names of ‘things’. Take international trade as an example. To make it easy we might decide that we use English to define the units and quantities. This doesn’t mean that everyone within my company has to start speaking English. If you call it “a box”, I call it “une boîte”, and someone else says “blwch”, but we agree to use “box” on the shipping notes, that doesn’t mean you have to use “box” when working internally.
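The shipping-note example can be sketched as a translation that happens only at the boundary. This is an illustrative sketch: the party names and the structure of `LOCAL_TERMS` are assumptions, and only the terms “box”, “a box”, “une boîte” and “blwch” come from the text.

```python
# Each party keeps its own internal term; only the shipping note uses "box".
# Party names are hypothetical placeholders.
LOCAL_TERMS = {
    "english_company": {"box": "a box"},
    "french_company": {"box": "une boîte"},
    "welsh_company": {"box": "blwch"},
}


def to_boundary(party: str, local_term: str) -> str:
    """Translate a party's internal term to the agreed shipping-note term."""
    for boundary_term, term in LOCAL_TERMS[party].items():
        if term == local_term:
            return boundary_term
    raise KeyError(f"{party} has no mapping for {local_term!r}")


def to_local(party: str, boundary_term: str) -> str:
    """Translate the agreed shipping-note term back to a party's own term."""
    return LOCAL_TERMS[party][boundary_term]
```

A shipment written up internally as “une boîte” goes onto the note as “box” and is read at the other end as “blwch”; neither side has to change what it says internally.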

Or as well-known Data Governance expert Shakespeare put it:

A rose by any other name would smell as sweet — William Shakespeare

Data Governance is about the boundaries, not the totality

In all of this we need to understand that there is no totality of Data Governance. This is why Gartner is saying abandon the Customer 360, why Quality is contextual, and why Golden Records aren’t a great target. We need to govern at the boundaries, and in the context of those boundaries. Whether internally or externally the important part is we agree on where we collaborate, in the instance at which we collaborate. If after that instance we call it something else, it doesn’t matter.

So if we do agree then it certainly can be easier, but too often we let agreement on high-level terms be the goal, rather than letting different divisions, departments and companies that are working together apply their own name to the ‘thing’ that we are collaborating around. We don’t need to enforce our viewpoint beyond that boundary into another group, and we do not need to agree to that definition being used internally within ours. As long as at the point of interaction we can collaborate we could call everything “Avocado Blue” if we wanted to.

Embrace the Avocado Blue approach to naming

The problem can be that sometimes people become very wedded to ‘their’ name and struggle to agree to ‘someone else’s name’ being used. We might use the term “Natural Person” or “Corporate Entity” to be neutral on whether it’s a customer, supplier or contact. The challenge with those can be that the processes and approaches are different between customers and suppliers, even if many, or all, of the identification attributes are the same. Don’t be afraid of creating your own Avocado Blue, such as “Sales Side Party” or “Buy Side Corporation”. Picking a neutral term that everyone hates, and that everyone therefore insists on having their own mapping to, is often a good way to get everyone aligned on the challenge.

A while ago I worked with a company that organized its sales regions very differently around the globe, from sales regions that spanned fragments of a few countries through to vertical industry focused geographical segments in a country, where the geographical segments were specific to the vertical. In order to look at sales performance, sales margin and a bunch of other corporate metrics we needed a name, and everyone had a viewpoint on what was right. This was the utterly dreadful name that one genius proposed after a deliberate brainstorming session for bad names we’d all hate:

Dimensional Entity — Bounded Sales Hierarchy

Or “Debsh”. Why that? Because they’d just spoken to their partner who had accused them of being drunk, and they’d slurred the name. Everyone agreed that it was utterly terrible, no way it would work. When I spoke to them a couple of years later DEBSH reporting was a fully fledged system within Sales effectiveness.

Steve Jones
Collaborative Data Ecosystems

My job is to make exciting technology dull, because dull means it works. All opinions my own.