Data Layers in Data Studio
Understanding the difference between DWH data, data source, and chart data. By Alexander Junke
After building your first Data Studio dashboards, you will start developing a better feeling of how to visualize data. But then there is still another important aspect missing, in particular how data management works in Data Studio. There are 3 layers of data: the tables in the Data Warehouse, the data source in Data Studio and the data of the different charts. In this blog post, I want to show how these 3 data layers collaborate and for example how filters or custom metrics inflict the different data layers.
Beginners and Intermediates in Data Studio should read this blog post.
Overview
First, I want to take a closer look at how to connect and use a new data source in Data Studio. The first data layer is the table inside BigQuery. If you want to create a new data source in Data Studio you have to choose a connector (in our case BigQuery) and then go down the hierarchy project, dataset, and select the table.
Now, the second data layer is created and you see all available fields of the data source. To use the third data layer, you can create a new chart that is based on the Data Studio data source.
In the next sections, I want to compare the different data layers with each other.
1. BigQuery table and Data Studio data source
The data source is a “mirror” of the BigQuery table. When you update the data of the BigQuery table and you reload the data of the report (or data source) you have the new data in the connected charts too.
When you add a new column to the table you have to refresh the fields in the data source.
A good question to understand this connection better is: What happens if you connect 2 different data sources to the same BigQuery table and is it possible?
At first, yes it is possible without any problems, but you have to know the side effects of this approach. You can create 2 data sources (A and B) in Data Studio and both data sources mirror the data of the same BigQuery table. Lets’s say you create report A with data source A and report B with data source B.
When you create the same chart in both Data Studio reports, then it shows the exact same data.
Now let me explain the case you update the table in BigQuery.
- Refreshing the data of report A, then the chart data gets updated in report A.
- Data source A and all connected charts are up-to-date now
- Report A and B show different data, because data of report B is not up-to-date
- Updating the data of report B, then both reports show again the correct data
This is not the case when you have both data sources connected to the same report. Then you have to refresh the data of the report one time and both data sources have the new data inside. To refresh the data you can open or refresh the report again too.
Another behavior comes into effect when you add a new column to the BigQuery table.
- To add the new field/column to the data source, you have to refresh the fields of the Data Studio data source
- Refresh the fields of data source A, then the new field is available in data source A, but not in data source B
- To add the new field to this data source, you have to refresh the fields in data source B too
Another important aspect is the creation of new fields inside the data source. This does not change the BigQuery table, just the data source.
- Create a custom field in data source A then the data source B does not have this field automatically
The last topic is to copy a data source.
- Copy an already created data source, which has the big advantage that all custom fields are available in the original and copied data source
- The copied data source is connected to the same table in BigQuery as the original data source
- When you change the connection of the new data source to another BigQuery table, the original data source is not affected by this.
2. Data Studio data source and Data Studio charts
In this section, I want to concentrate on the two data layers inside Data Studio. Every chart in Data Studio is based on one data source. After the chart is created, you can change the data source of this chart at any time.
One important topic for these two layers is filtering.
- Adding a filter to one of the charts, then you just filter the visualized data of this chart.
- The data source is not affected by this filter
- Currently there is no way to filter the data source, you have just many different possibilities to select the visualized data (Check out this blog post of our Data Studio series)
The creation of custom fields is interesting at this level again. You have the opportunity to create a custom field just inside a chart or inside a data source.
Add a custom field inside a chart:
- Not available in any other chart
- When you delete the chart, the custom field is deleted too
- When you copy the chart, the custom fields exist in the original and copied chart
Add a custom field inside a data source:
- Available in all charts connected with this data source
- When you delete the data source, the custom field is deleted too
- When you copy the data source, the custom fields exist in the original and copied data source
Conclusion
This blog post explained the different layers of data in Data Studio and their coexistence with other data studio elements, like filters or calculated fields. It is really important to understand the difference between a BigQuery Table, a data source and a chart in Data Studio. When you work more and more with Data Studio, you will get a better feeling of the data management inside Data Studio.
Further Links
This post is part of our Google Data Studio series in the Data School of datadice. We will be sharing the best insights that we’ve learned through years of creating data solutions.
If you want to learn more about how to use Google Data Studio and take it to the next level in combination with BigQuery, check our Udemy course here.