Data Dictionary: a how-to and best practices

Carl Anderson
Jun 21, 2018

A data dictionary is a list of key terms and metrics with their definitions: a business glossary. While it sounds simple, almost trivial, its ability to align the business and remove confusion can be profound. In fact, a data dictionary is possibly one of the most valuable artifacts that a data team can deliver to the business.

Most businesses have at least one concept, term, or metric that is used or interpreted differently among teams. When this happens, confusion reigns. Decision makers may disagree about what the data show and what actions to take. Reports among teams might show different numbers for the same metric from the same data source due to inconsistent business logic. Teams may even argue about the correct definition and defend their turf, perhaps because their definition makes their numbers look better. This is not good for business.

Once you have a data dictionary, all staff can reference it and be on the same page, onboarding new staff becomes easier, and the business intelligence (BI) team has crystal clear requirements for implementing those metrics.
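To make the idea concrete, here is a minimal Python sketch of a data dictionary kept in machine-readable form, so that reports and onboarding docs can pull definitions from one place. Everything in it is an assumption for illustration: the `Term` fields, the two example entries and their definitions, the owning team names, and the `lookup` helper are hypothetical and would differ in any real business.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """One entry in the data dictionary (fields are illustrative)."""
    name: str        # the business term or metric
    definition: str  # the agreed, plain-language definition
    owner: str       # team accountable for keeping the definition current


# Hypothetical entries; real definitions must come from the business.
_ENTRIES = (
    Term(
        name="active user",
        definition="A user with at least one session in the trailing 30 days.",
        owner="Product Analytics",
    ),
    Term(
        name="average ship time",
        definition="Mean days from order placement to carrier pickup.",
        owner="Operations",
    ),
)

# Index by normalized name so lookups ignore case and stray whitespace.
DATA_DICTIONARY = {term.name: term for term in _ENTRIES}


def lookup(name: str) -> Term:
    """Return the agreed entry for a term, or fail loudly if undefined."""
    key = name.strip().lower()
    if key not in DATA_DICTIONARY:
        raise KeyError(f"'{name}' is not defined in the data dictionary")
    return DATA_DICTIONARY[key]
```

Failing loudly on an undefined term is a deliberate choice in this sketch: a missing entry is a signal that the business has not yet agreed on a definition, which is better surfaced than silently papered over.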

To be clear, we are not considering raw database table documentation here, although that is important too, but a higher-level list of business terms and metrics. How does the business as a whole think of “user”, “revenue”, or “cost of acquisition”? Does everyone have the same understanding of “sales territory”, “average ship time”, or “session”? The goal should be that a junior, non-technical…


Carl Anderson

Director, Data Science, Indigo Ag. Author of "Creating a Data-Driven Organization" (2015, O'Reilly). Web: carlanderson.ai