Danny Sidani
Published in Slalom Data & AI
Nov 6, 2020

Photo by The Creative Exchange on Unsplash

How Tableau’s New Data Relationships Feature Will Help You Unlock the Potential of Your Organization’s Data

Written by Nicole (Hamilton) Egan and Danny Sidani

Tableau has continued to release features that underscore its position as a leader in self-service analytics. Tableau Prep, Server, and Data Catalog, along with the product’s ease of use, work together to help analysts and business users access and understand their data. Data Relationships, nicknamed “Noodles”, was unveiled at Tableau Conference 2019 and released in version 2020.2, and it did not disappoint. It is much more than a new feature; it turns Tableau into a self-service powerhouse. In this blog, we discuss how Data Relationships, used correctly, can transform the way your organization uses data and shorten the time to insight.

What is the Data Relationships feature?

Prior to Data Relationships, Tableau combined the tables in each data source into a single flat table. This often led to unintended row duplication and performance issues, so report developers had to rely on data blending or complex level of detail calculations to report at multiple grains. Applying several different solutions to the same problem also led to inconsistent dashboard development practices and unreliable results. Let’s look at an example of how joins can produce unexpected results, and how the Data Relationships feature corrects them.

Sarah is a data analyst at a healthcare practice who wants to know how many hours providers spend completing appointments versus their scheduled panel hours. She boots up Tableau, connects to her database and joins the relevant tables together:

Joining all the tables in the model leads to unexpected data duplication

This seems simple at first, but she quickly notices an issue. The Schedule Provider Dim table has one record for each provider, while the Appointment Dim table has many, so joining them duplicates the provider rows. Each provider’s scheduled hours are thereby multiplied by the number of appointments they completed. Before Tableau 2020.2, Sarah would have needed to use level of detail calculations, use blending, or work with her database team to create a reporting view that mitigated the issue.
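To make the duplication concrete, here is a minimal Python sketch of the same problem outside Tableau. The table contents, column names, and helper functions are hypothetical and chosen only for illustration; the sketch mimics a row-level join, not Tableau's actual query engine.

```python
# Hypothetical sample data: one schedule row per provider,
# many appointment rows per provider.
schedule_provider_dim = [
    {"provider_id": 1, "provider": "Dr. Adams", "scheduled_hours": 40},
    {"provider_id": 2, "provider": "Dr. Baker", "scheduled_hours": 30},
]
appointment_dim = [
    {"appointment_id": 101, "provider_id": 1, "hours": 1.0},
    {"appointment_id": 102, "provider_id": 1, "hours": 0.5},
    {"appointment_id": 103, "provider_id": 1, "hours": 1.5},
    {"appointment_id": 104, "provider_id": 2, "hours": 2.0},
]


def inner_join(left, right, key):
    """Row-level join: one output row per matching (left, right) pair."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def sum_by(rows, group_key, value_key):
    """Sum value_key for each distinct value of group_key."""
    totals = {}
    for row in rows:
        totals[row[group_key]] = totals.get(row[group_key], 0) + row[value_key]
    return totals


joined = inner_join(schedule_provider_dim, appointment_dim, "provider_id")

# Each schedule row is repeated once per appointment, so summing
# scheduled_hours over the joined rows inflates the totals.
scheduled_after_join = sum_by(joined, "provider", "scheduled_hours")
print(scheduled_after_join)  # {'Dr. Adams': 120, 'Dr. Baker': 30}
```

Dr. Adams is scheduled for 40 hours, but the joined table reports 120 because her schedule row appears once for each of her three appointments.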

Let’s see what happens when Sarah uses Data Relationships instead of joins:

Using Data Relationships retains the native grain of each table, leading to the expected result

Now that Sarah is using Data Relationships, she gets the expected visualization right away. Because relationships retain the native grain of each table, data from the Appointment Dim table is summarized at the provider level and then related to the Schedule Provider Dim table, which avoids duplication. This example highlights the impact of Data Relationships: the ability to intuitively create reports by connecting directly to a data model and asking questions of the data as you go. This not only affects individual analysis but also changes the way organizations provide and govern self-service data sources in Tableau.
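The "summarize first, then relate" behavior can be sketched in a few lines of Python. This is a conceptual illustration only, with hypothetical sample data and helper names; it is not how Tableau generates its queries internally.

```python
# Hypothetical sample data: one schedule row per provider,
# many appointment rows per provider.
schedule_provider_dim = [
    {"provider_id": 1, "provider": "Dr. Adams", "scheduled_hours": 40},
    {"provider_id": 2, "provider": "Dr. Baker", "scheduled_hours": 30},
]
appointment_dim = [
    {"appointment_id": 101, "provider_id": 1, "hours": 1.0},
    {"appointment_id": 102, "provider_id": 1, "hours": 0.5},
    {"appointment_id": 103, "provider_id": 1, "hours": 1.5},
    {"appointment_id": 104, "provider_id": 2, "hours": 2.0},
]


def aggregate_to_grain(rows, key, value_key):
    """Summarize rows to one total per distinct value of key."""
    totals = {}
    for row in rows:
        totals[row[key]] = totals.get(row[key], 0) + row[value_key]
    return totals


# Step 1: summarize appointments at the provider grain.
completed_by_provider = aggregate_to_grain(appointment_dim, "provider_id", "hours")

# Step 2: relate the summary to the schedule table, which is already
# one row per provider, so no schedule row is ever repeated.
report = [
    {
        "provider": p["provider"],
        "scheduled_hours": p["scheduled_hours"],
        "completed_hours": completed_by_provider.get(p["provider_id"], 0),
    }
    for p in schedule_provider_dim
]

for row in report:
    print(row)
```

Because each table is reduced to the provider grain before the two are combined, scheduled hours stay at 40 and 30 instead of being multiplied by the appointment count.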

For example, Sarah can publish this data source for others to use. Because she did not need complex level of detail calculations, the data source remains intuitive for others. Bundling use-case-specific calculations into published data sources can cause confusion. Data Relationships accommodate a wide variety of use cases and avoid the need to train others on which calculation to use in which scenario.

She could have used a blend to solve her use case, but a blend is a one-time solution, since blend relationships are not captured in published data sources. The introduction of Data Relationships allows more complex data to be delivered to a broader group of users for self-service consumption.

Data Relationships = Not Your Traditional Joins

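To recap, Data Relationships differ from traditional joins in several ways: they keep each table at its native grain instead of flattening everything into a single table, they summarize each table before relating it so that measures are not duplicated, and, unlike blends, they are captured in published data sources.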

The ability to connect directly to a data model through Data Relationships simplifies the process of creating validated data sources for your organization. Before Data Relationships, the best performance was achieved by anticipating each analytical use case in advance and building a reporting structure that suited it. While users were free to join the tables themselves, they would often run into performance issues, get results that did not make sense, or need advanced Tableau skills to use blends or level of detail calculations to compute values at the correct level of aggregation.

To make self-service approachable, business intelligence teams often built reporting tables that mitigated the granularity issues that appear when connecting directly to data models and joining tables. However, any new question that did not fit cleanly within those self-service tables created the need for a new table. The upfront decisions required to build new reporting structures reduce flexibility in reporting and limit the agility of reporting teams. Data Relationships enable reporting teams to use data models directly and publish certified data sources without the need for single-use reporting tables.

Data Relationships minimize the need to build rigid one-off reporting tables

4 Tips to Fully Leverage Data Relationships

You have already invested in Tableau, so how do you get the most out of this new feature? Below are some tips on how to use Data Relationships to enable real change in your organization:

1. Model your organization’s data

Data Relationships maximize the investment you have made in your dimensional model because the schema no longer needs to be hidden behind flat reporting tables. By investing in this feature, Tableau is emphasizing dimensional models such as star and snowflake schemas, and as it builds the feature out, we expect it to support increasingly complex dimensional models. Removing reporting tables drives data consistency, improves reporting agility, and reduces data warehouse resource use.

2. Follow dimensional modeling best practices

The impact of Tableau’s Data Relationships is maximized when data is well-modeled.

  • Model metadata can be leveraged directly from Tableau Server tools: Naming and comments in the data warehouse can be used since Tableau will connect directly to the data model schema.
  • Define data properties: Define properties such as cardinality (how many records on each side of a relationship can match) and referential integrity (whether every record has a related record) in the data model to optimize the performance of Tableau relationships. For example, if an appointment record arrives in the data warehouse before its provider record (sometimes known as an early-arriving fact or late-arriving dimension in dimensional modeling), use an “unknown” provider row instead of leaving the provider ID as null.
  • Keep measures and dimensions in separate tables: In dimensional modeling, measures are known as “facts” and live in their own tables, since fact tables typically contain many rows while dimensions are more static. Data Relationships perform best when the model follows this separation.
  • Model your data according to business process, not dashboard requirements: Approaching data modeling through the lens of business use cases and processes will allow for more flexibility in how the data model can be leveraged across an entire organization and not just for a single dashboard use case.
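The “unknown” provider row mentioned above can be sketched as follows. The surrogate key value of -1, the table contents, and the helper name are assumptions made for illustration; your warehouse may reserve a different key or handle unmatched records differently.

```python
UNKNOWN_PROVIDER_ID = -1  # hypothetical surrogate key reserved for the "unknown" row

provider_dim = [
    {"provider_id": UNKNOWN_PROVIDER_ID, "provider": "Unknown"},
    {"provider_id": 1, "provider": "Dr. Adams"},
]

# Hypothetical incoming facts: one with a null provider and one whose
# provider record has not been loaded into the dimension yet.
incoming_appointments = [
    {"appointment_id": 101, "provider_id": 1, "hours": 1.0},
    {"appointment_id": 102, "provider_id": None, "hours": 0.5},
    {"appointment_id": 103, "provider_id": 7, "hours": 2.0},
]


def assign_provider_keys(facts, dimension):
    """Point facts with a missing or unmatched provider at the unknown row."""
    known_ids = {row["provider_id"] for row in dimension}
    loaded = []
    for fact in facts:
        provider_id = fact["provider_id"]
        if provider_id not in known_ids:
            provider_id = UNKNOWN_PROVIDER_ID
        # Keep the original value so the fact can be re-pointed later.
        loaded.append(
            {**fact, "source_provider_id": fact["provider_id"], "provider_id": provider_id}
        )
    return loaded


appointment_fact = assign_provider_keys(incoming_appointments, provider_dim)

# Every fact row now matches exactly one dimension row.
known_ids = {row["provider_id"] for row in provider_dim}
every_fact_has_a_provider = all(f["provider_id"] in known_ids for f in appointment_fact)
print([f["provider_id"] for f in appointment_fact])  # [1, -1, -1]
```

With no null or orphaned keys left in the fact table, every appointment has a related provider record, which is the referential integrity property described above.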

3. Master cross-enterprise attributes and measures into “conformed dimensions” for consistent reporting

Common dimensions such as customer name may overlap across multiple source systems. For example, an ERP system and Salesforce will both contain information about customers. To use customer information from both systems seamlessly, it is important to build conformed dimensions.

‘Customer’ is considered a conformed dimension when the sources are consolidated into a single table that can be used across all business processes. Conforming dimensions is challenging but necessary: it reduces reporting inconsistencies by avoiding a separate customer table for each source system. The ability to use a single source of truth directly in Tableau through Data Relationships simplifies governance and reduces questions about why reports do not match.

Since data relationships can leverage data models directly, creating conformed dimensions ensures consistency and reliability across many use-cases. Mastering your data enables report consistency and maximum return on data modeling projects.
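As a rough sketch of what conforming a customer dimension involves, the Python below consolidates customer records from two hypothetical source extracts into one table. The field names, sample records, and the choice to match on a normalized email address are all assumptions for illustration; real mastering rules are usually more involved.

```python
# Hypothetical extracts from two source systems.
erp_customers = [
    {"erp_id": "E-100", "name": "Acme Corp", "email": "billing@acme.example"},
    {"erp_id": "E-200", "name": "Globex", "email": "ap@globex.example"},
]
crm_customers = [
    {"crm_id": "S-9", "name": "ACME Corporation", "email": "Billing@Acme.example "},
    {"crm_id": "S-7", "name": "Initech", "email": "info@initech.example"},
]


def match_key(record):
    """Assumed matching rule: same customer if the normalized email is equal."""
    return record["email"].strip().lower()


def build_conformed_customer_dim(erp_rows, crm_rows):
    """Consolidate both sources into one row per customer."""
    merged = {}
    for row in erp_rows:
        merged[match_key(row)] = {"name": row["name"], "erp_id": row["erp_id"], "crm_id": None}
    for row in crm_rows:
        # If the ERP already supplied this customer, keep its name and add the CRM ID.
        entry = merged.setdefault(
            match_key(row), {"name": row["name"], "erp_id": None, "crm_id": None}
        )
        entry["crm_id"] = row["crm_id"]
    # Assign a surrogate key in a stable (alphabetical by email) order.
    return [
        {"customer_key": key, "email": email, **merged[email]}
        for key, email in enumerate(sorted(merged), start=1)
    ]


customer_dim = build_conformed_customer_dim(erp_customers, crm_customers)
for row in customer_dim:
    print(row)
```

The result has three customers rather than four: Acme appears once, carrying both its ERP and CRM identifiers, so any fact table from either system can relate to the same customer row.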

4. Use Tableau Catalog, metadata, and certified data sources

Data Relationships allow users to leverage the data model directly, which makes data lineage tools such as Tableau Catalog more powerful. Reporting tables often acted like black boxes that obscured where their data came from and hid useful metadata in the data model. Using Data Relationships allows Tableau Catalog to track fields from the data model directly to reports. This is invaluable for analytics teams who need to evaluate the impact of database changes and ensure they are driving data consistency.
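To illustrate the kind of impact analysis that lineage makes possible, here is a toy Python sketch. It does not use Tableau Catalog or any Tableau API; the lineage map, asset names, and function are hypothetical stand-ins for what a lineage tool tracks.

```python
from collections import deque

# Hypothetical lineage map: each asset points to the assets built on it.
lineage = {
    "warehouse.appointment_dim.hours": ["Provider Utilization (data source)"],
    "warehouse.schedule_provider_dim.scheduled_hours": ["Provider Utilization (data source)"],
    "Provider Utilization (data source)": ["Utilization Dashboard", "Weekly Panel Report"],
    "warehouse.customer_dim.name": ["Customer Overview (data source)"],
    "Customer Overview (data source)": ["Customer Dashboard"],
}


def downstream_assets(start, graph):
    """Breadth-first walk returning everything that depends on start."""
    seen = set()
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Which data sources and reports are affected if this column changes?
affected = downstream_assets("warehouse.appointment_dim.hours", lineage)
print(affected)
```

When data sources connect to the data model directly, the path from a warehouse column to every report that uses it stays visible, so a question like this can be answered before a database change is made.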

Data Relationships can transform the way organizations interact with data through Tableau. Reporting teams are no longer bound by rigid joins and reporting tables. The release of Data Relationships fundamentally changes how best to foster, support, and govern self-service data analytics with modeled data marts, and we expect Tableau to keep increasing the data model complexity it can support. Start investing in your data governance and dimensional model today to democratize your data consistently and enable data-driven decision making.

Special thanks to Matthew Falkenham, Francine Klein, Greg Bonnette, and Wale Ilori.

If you are using Tableau today and want to learn more about how this new feature can make a big difference for your analytics organization, reach out to us at Slalom Data & Analytics. We would love to talk!
