Possible Options Of Creating Data Sources in Tableau Desktop
When working in Tableau Desktop the first thing that we need to do is to connect to the Data and in the process we create Data Sources.
Given below are the possible options of creating Data Sources in Tableau Desktop
Option 1: 1 Data Source with 1 table
Option 2: 1 Data Source with Multiple Tables (2 or more)
Option 3: 2 Separate Data Sources within the same Workbook
Option 4: 1 Data Source with Multiple Connections
NOTE: In all these options the process of combining the data happens on the Data Source Page.
A Tableau data source may contain multiple data connections to different databases or files.
A Tableau Data Source is the link between the source data and Tableau.
A Tableau data source consists of the below mentioned components:
a) The actual data (either as a live connection or an extract) e.g. Embedded Flat Files
b) The connection string to the underlying database or the reference to a Tableau virtual connection
c) The names of tables or sheets containing data
d) Description of Fields
e) Any Metadata changes (the customizations) done on top of the data to work with it in Tableau. These customizations may include things like how the data is combined and metadata such as Calculations, Renamed fields, Data Type changes, Hierarchies, Groups & Folders and default formatting etc.Hence there is a difference between the Tableau data source (Tableau’s access to the data and all the customizations that may be involved) and the data source itself (such as a database or Excel spreadsheet).
A published data source contains all the details that anyone would need to start building visualizations/views in Tableau.
NOTE: For this article Tableau Version 2022.2 (Professional Edition) is used.
Given below is the detailed explanation for each of the options
Option 1: 1 Data Source with 1 table
In this option the data comes from a single table of one data source.
Given below are the steps to create the Data Source for this option
Step 1: Connect to the data from the Connect pane. Click the desired file type under To a File. In the Open dialog box select the required file (e.g. Sample — Superstore_Single_Table.xls) and click Open. This creates the Data Source.
Step 2: We are next taken to the Data Source Page where the table in the data i.e. Orders is automatically dragged into the Canvas area.
This type of Data Model is referred to as an Implicit Data Model and is created by Tableau without an analyst having to do anything to create the model.
The analyst can simply connect to data and immediately start creating visuals in the worksheets.
This type of data modeling works well when the data source has already been prepared for analysis and we are the person creating the charts and dashboards.
Step 3: Click on Sheet 1. We are next taken to a screen that will look something like this. This is the Tableau Workspace. The left side is the Side Bar that contains both the Data pane and Analytics pane and the blank area in the center is the View/ Viz/ Canvas where we can create the required visualizations.
Option 2: 1 Data Source with Multiple Tables (2 or more)
In this option the data comes from multiple tables of one data source.
Given below are the steps to create the Data Source for this option
Step 1: Connect to the data from the Connect pane. Click the desired file type under To a File. In the Open dialog box select the required file (e.g. Sample — Superstore_Multiple_Tables.xls) and click Open. This creates the Data Source.
Step 2: We are next taken to the Data Source Page. Here we need to drag and drop the required table i.e. Orders into the Canvas area.
Step 3: If required next we can add a second table e.g: Returns in the Logical Layer itself using Relationships (A.k.a Noodles).
Alternately we can we can add the second table in the Physical Layer by using Joins.
To navigate to the Physical Layer we need to either double-click the existing table in the canvas area or right-click the existing table and select Open.
Now we are in the Physical Layer.
Drag and drop a second table e.g: Returns table
Step 4: Click on Sheet 1. We are next taken to a screen that will look something like this. This is the Tableau Workspace. In this option we see that there are 2 separate tables in the Data pane i.e. Orders and Returns. We can create visualizations by using fields from both tables.
Option 3: 2 Separate Data Sources within the same Workbook
This option creates an entirely new data source.
This option can be used to create Blends.
Given below are the steps to create the Data Source for this option
Step 1: Connect to the data from the Connect pane. Click the desired file type under To a File. In the Open dialog box select the required file (e.g. Sample — Superstore_Multiple_Tables.xls) and click Open. This creates the Data Source.
Step 2: We are next taken to the Data Source Page. Here we need to drag and drop the required table i.e. Orders into the Canvas area.
Step 3: Next we can add a second data source
This can be done by by clicking the drop-down menu next to Cylinder icon displaying the Data Source Name and clicking on New Data Source.
This can also be done by using the Data menu i.e. Data > New Data Source.
Step 4: In the Open dialog box select the required file (e.g. Global Superstore.xls) and click Open. This creates the Second Data Source.
Step 5: We are next taken to the Data Source Page of the Second Data Source. Here we need to drag and drop the required table i.e. Orders into the Canvas area.
Step 6: Click on Sheet 1. We are next taken to a screen that will look something like this. This is the Tableau Workspace. In this option we see that there are 2 separate Data Sources in the Data pane (displayed as two cylinders). We can create visualizations by using fields from both the data sources (explained in the next steps).
Step 7: Let us drag and drop a dimension e.g. Sub-Category from one Data Source e.g: Orders (Sample — Superstore) into the view. This Data Source will be considered as the Primary Data Source and will be displayed by a blue check mark on the Data Source.
Next let us drag and drop a measure e.g. Sales from the Orders (Sample — Superstore) into the view.
Next let us drag and drop a measure e.g. Shipping Cost from the Orders (Global Superstore) into the view. This Data Source will be considered as the Secondary Data Source and will be displayed by an orange check mark on the data source and an orange bar down the side of the Data pane.
In this option we can switch between the two (or more) Data Sources while on the data source tab (by clicking the drop-down menu next to Cylinder icon displaying the Data Source Name and selecting the required Data Source).
Option 4: 1 Data Source with Multiple Connections
This option creates a second connection rather than an entirely different data source.
This option can be used to create Joins (in the Physical Layer) or a Relationship (in the Logical Layer).
This option can be used to combine data from the same data source (such as tables in a database or sheets in an Excel spreadsheet) or across different data sources (this is known as a Cross-database join).
Given below are the steps to create the Data Source for this option
Step 1: Connect to the data from the Connect pane. Click the desired file type under To a File. In the Open dialog box select the required file (e.g. Sample — Superstore_Multiple_Tables.xls) and click Open. This creates the Data Source.
Step 2: We are next taken to the Data Source Page. Here we need to drag and drop the required table i.e. Orders into the Canvas area.
Step 3: If the second table is from another data source entirely, in the left pane, under Connections, click the Add button to add a new connection to the Tableau data source.
Step 4: In the Add a Connection option select the desired connection type e.g: Microsoft SQL Server and enter the required credentials e.g: Server, Database (optional). Select the desired Authentication mode. Next click on Sign In.
Step 5: Select the required table and configure the join. Add one or more join clauses by selecting a field from one of the available tables used in the data source, choosing a join operator, and a field from the added table.
Step 6: Click on Sheet 1. We are next taken to a screen that will look something like this. This is the Tableau Workspace. In this option we see that there are 2 separate tables in the Data pane i.e. Orders (from the Microsoft Excel file) and returns (from the Microsoft SQL Server instance). We can create visualizations by using fields from both tables.
In this option we can switch between the two (or more) connections while on the data source tab (Left Pane).
In the below image we have selected the Microsoft Excel file connection. Hence the details shown in the Left Pane will be based on the Sheets present in the file.
In the below image we have selected the Microsoft SQL Server connection. Hence the details shown in the Left Pane will be based on the Database, Tables etc. of the SQL Server connection.
If we have combined tables using a cross-database join, Tableau colors the tables in the canvas and the columns in the data grid to show us from which connection the data comes from. In this example the Microsoft Excel file is colored Blue and the Microsoft SQL Server Database connection is colored Orange.
References:
For the Tableau Data Source and Implicit Data Model definitions -
“Part 1: Data Modeling on the Tableau Platform — Chapter 1 Introducing Data Modeling in Tableau” & “Chapter 2 Licensing Considerations and Types of Data Models” from the book “Data Modeling with Tableau” by Kirk Munroe