Tableau Data Catalog: Let’s do the jigsaw puzzle!

Allowing our Tableau users to finally access the metadata

Caroline BURIDANT
iAdvize Engineering
Feb 11, 2022


This article is the fifth, and last, of our series dedicated to the creation of a Data Catalog based on the Tableau Metadata API. So far, we have explained the origin of this project (article 1) and described how we gathered metadata for:

  • upstream tables and published datasources (article 2),
  • upstream columns and fields (article 3),
  • sheets, dashboards and workbooks (article 4).

Now we will focus on the outcome of this work — that is, the resulting data catalogs available for our Tableau users.

Let’s assemble the pieces (Photo by oatawa on Shutterstock)

Designing the tools

To ensure the adoption of our Data Catalog, we started its design by identifying the target users. Based on our site role allocation and our internal structure at iAdvize, we determined that this metadata was intended for 3 user groups with very different goals:

  • the Data team members need to ensure the consistency of the content exposed via Tableau. They also have to understand the impact of a change made to the upstream sources of the Published Data Sources;
  • the Viewers need to access the description of the Tableau fields they find in a workbook available on Tableau Server. Most importantly, they need an interactive tool to help them find an analysis containing a KPI they are looking for;
  • the Creators need help to find the right datasource for their analysis, based on the label of a specific KPI they would like to study. Also, it would be useful for them to see the formulas of the calculated fields created by our Tableau community — to find inspiration and follow best practices for their own analysis.

Thus the result of the project has 3 deliverables — one for each target. As said before, to ensure the adoption of these tools, we chose to create them with Tableau Desktop and make them available on our Tableau Server.

A Data Catalog for the Data team

The tool the Data team was dreaming of had 3 features:

  • a dependency-checking tab, to monitor the consequences of an addition, deletion or edit of a column on the datasources;
  • a consistency-checking tab, to verify that the field names, descriptions and formulas are coherent among the datasources;
  • a resource management-checking tab, to detect if a datasource contains some fields without any description.

To create this tool, we needed to assemble our metadata tables into one table containing information about:

  • the upstream tables and columns,
  • the published datasources and fields.

Thus, this joined table should show one row per field and per upstream column (in the case of a calculated field, there can be more than one upstream column), and should contain the field’s metadata (name, description, formula, folder, published datasource) as well as its upstream columns and tables. An SQL query generates this table and is scheduled daily in our data warehouse.
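
To make the shape of this joined table concrete, here is a minimal sketch using an in-memory SQLite database. It is not our production query: every table name, column name and sample row below is a hypothetical stand-in for the metadata tables built in the previous articles. It only illustrates the "one row per field and per upstream column" grain, plus the kind of lookup the dependency-checking tab relies on.

```python
import sqlite3

# Hypothetical stand-ins for the metadata tables of the previous articles.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE datasource_fields (
    field_id TEXT, field_name TEXT, description TEXT,
    formula TEXT, folder TEXT, datasource_name TEXT
);
CREATE TABLE upstream_columns (
    column_id TEXT, column_name TEXT, table_name TEXT
);
-- Link table: a calculated field can have several upstream columns.
CREATE TABLE field_upstream_columns (field_id TEXT, column_id TEXT);

INSERT INTO datasource_fields VALUES
    ('f1', 'Revenue', 'Total revenue', NULL, 'Sales', 'Sales DS'),
    ('f2', 'Margin', NULL, '[Revenue] - [Cost]', 'Sales', 'Sales DS');
INSERT INTO upstream_columns VALUES
    ('c1', 'revenue', 'orders'),
    ('c2', 'cost', 'orders');
INSERT INTO field_upstream_columns VALUES
    ('f1', 'c1'), ('f2', 'c1'), ('f2', 'c2');
""")

# One output row per (field, upstream column) pair.
QUERY = """
SELECT f.datasource_name, f.folder, f.field_name, f.description, f.formula,
       c.table_name, c.column_name
FROM datasource_fields AS f
LEFT JOIN field_upstream_columns AS l ON l.field_id = f.field_id
LEFT JOIN upstream_columns AS c ON c.column_id = l.column_id
ORDER BY f.field_name, c.column_name
"""
rows = conn.execute(QUERY).fetchall()


def impacted_fields(joined_rows, table_name, column_name):
    """Dependency check: (datasource, field) pairs fed by an upstream column."""
    return sorted(
        {(r[0], r[2]) for r in joined_rows
         if r[5] == table_name and r[6] == column_name}
    )


for row in rows:
    print(row)
print(impacted_fields(rows, "orders", "revenue"))
```

In this toy data, the calculated field "Margin" produces two rows (one per upstream column), and a change to the hypothetical `orders.revenue` column would affect both "Margin" and "Revenue".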

Then we created a Tableau published datasource connected to this table, and we called it « Upstream Column by Datasource Field ».

Here are three views of this first workbook intended for the Data team:

The dependency-checking tab
The consistency-checking tab
The resource management-checking tab

To create the last one, we used Tableau Actions to let the user fill in missing descriptions based on the common descriptions of similar fields hosted in other datasources (given by the Consistency tab). Note that the descriptions themselves are completed by hand (downloading the Published Data Source, editing the description, then republishing the Published Data Source), as Tableau does not offer an easier way to do it.
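
The logic behind this tab can be sketched outside Tableau in a few lines of Python. This is only an illustration under assumptions: the sample rows are invented, "similar fields" is simplified to "fields with the same name", and the function name `suggest_descriptions` is hypothetical. In our tool, the equivalent work is done with Tableau Actions on the published datasource.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows: (datasource, field name, description).
fields = [
    ("Sales DS", "Conversion Rate", "Orders divided by visits"),
    ("Marketing DS", "Conversion Rate", "Orders divided by visits"),
    ("Support DS", "Conversion Rate", None),
    ("Support DS", "Handle Time", ""),
]


def suggest_descriptions(rows):
    """Map each undocumented (datasource, field) to a suggested description.

    The suggestion is the most common non-empty description found for a
    field with the same name in any datasource, or None if there is none.
    """
    known = defaultdict(Counter)
    for _, name, description in rows:
        if description and description.strip():
            known[name][description.strip()] += 1

    suggestions = {}
    for datasource, name, description in rows:
        if description and description.strip():
            continue  # already documented
        counts = known.get(name)
        suggestions[(datasource, name)] = (
            counts.most_common(1)[0][0] if counts else None
        )
    return suggestions


suggestions = suggest_descriptions(fields)
for key, value in suggestions.items():
    print(key, "->", value)
```

Here the undocumented "Conversion Rate" field gets a suggestion borrowed from the other datasources, while "Handle Time" has no documented twin and is simply flagged as missing.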

A Data Catalog for the Viewers

For the Viewer users in our Tableau community at iAdvize, we designed two tabs:

  • the « Fields Definition » dashboard allows them to read the description of a field used in a workbook they would like to use. Indeed, as the Viewers do not have access to the published datasource details, they can’t access this metadata otherwise. The tab should also allow them to read the field’s formula;
  • the « Looking for an Analysis » dashboard helps them find the workbook, or even the sheet they need for an analysis, based on a keyword.

To create this tool, we needed a datasource that compiles information about:

  • the workbooks, dashboards and sheets,
  • the fields used in them.

The required metadata table would have one row per field used in a sheet. It should contain information about the field itself, and the lineage of the sheet (the related dashboards and workbooks). An SQL query generates this table and is scheduled daily in our data warehouse.
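
As a rough illustration of how a table with this grain supports a keyword search, here is a small Python sketch. The record type `FieldUsage`, the function `find_analyses` and the sample rows are all hypothetical, and matching the keyword against the field name only is an assumption made for this example. The actual search is built in Tableau on top of the published datasource.

```python
from typing import NamedTuple


class FieldUsage(NamedTuple):
    """One row per field used in a sheet, with the sheet's lineage."""
    workbook: str
    dashboard: str
    sheet: str
    field_name: str


# Hypothetical sample rows.
usages = [
    FieldUsage("Sales Overview", "Monthly KPIs", "Revenue trend", "Revenue"),
    FieldUsage("Sales Overview", "Monthly KPIs", "Revenue trend", "Order Date"),
    FieldUsage("Sales Overview", "Monthly KPIs", "Conversion", "Conversion Rate"),
    FieldUsage("Support Report", "Team view", "Chat volume", "Conversations"),
]


def find_analyses(rows, keyword):
    """Return the distinct (workbook, dashboard, sheet) using a matching field."""
    needle = keyword.casefold()
    hits = {
        (r.workbook, r.dashboard, r.sheet)
        for r in rows
        if needle in r.field_name.casefold()
    }
    return sorted(hits)


print(find_analyses(usages, "revenue"))
```

Because the table has one row per field and per sheet, a sheet that uses several matching fields would appear several times in the raw rows, which is why the sketch deduplicates the results with a set.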

Then we created a Tableau published datasource connected to this table, and we called it « Fields used by Sheet ». The Data Catalog Viewer workbook connects to this datasource, and here is an overview of its two features:

The “Fields definition” tab
The “Looking for an Analysis” tab

In addition, a « User Guide » tab provides the user with instructions on how to use this tool correctly.

A Data Catalog for the Creators

Regarding the Creator users, the Data Catalog we designed was aimed at helping them while:

  • looking for a datasource, based on a specific keyword corresponding to a datasource field;
  • creating a calculated field, by finding inspiration in existing calculations and following the good practices they illustrate.

To create this tool we did not have to create another datasource, as the first one (« Upstream Column by Datasource Field ») perfectly fits the need. We created a workbook connected to this datasource, and you can find below a demonstration of this last tool:

The “Looking for a datasource” tab
The “Looking for a calculation” tab

Further improvements

Since this first version of our homemade Data Catalog was only made available to our Tableau users in December ’21, we cannot yet draw any insights or evaluate its adoption. However, we already have some ideas to improve the tool:

  • add the parameters metadata to track the usage of this Tableau functionality,
  • gather information about the published datasources containing a Custom SQL query, to make maintenance easier,
  • track the published datasource filters,
  • find a way to search a view or a published datasource using more than one keyword…

This was the last article of our series about the creation of a Tableau Data Catalog. If you have had a similar experience, or would like to try something similar, feel free to reach out to us!
