Testing Tableau Server Data Acceleration
Version 2021.1 Update: Starting in Tableau version 2022.1, with the introduction of View Acceleration, the data acceleration feature is deprecated. Requests for data acceleration endpoints and attributes will not return an error, but will also not cause any change on the server or return meaningful information about data acceleration of a workbook.
Starting from version 2020.2, Tableau Server administrators can enable data acceleration for specific workbooks.
An accelerated workbook loads faster because Tableau Server pre-computes the workbook’s data in a background process (fetching the data needed after connecting to the underlying data source).
Workbooks are not enabled for acceleration by default. The easiest way to configure data acceleration is to use the accelerate_workbooks.py Python script. It is also possible, but more difficult, to configure data acceleration using the Tableau Server REST API.
Data Acceleration supports:
- Workbooks with published and live data sources (both with embedded credentials)
- Workbooks with embedded extracts
Workbooks with published and live data sources need to be added to a schedule (a schedule of type DataAcceleration) because Tableau Server needs to run the background precomputation periodically.
Workbooks with embedded extracts do not need to be scheduled: the acceleration is updated when they are published or when their extract is refreshed.
Data Acceleration doesn’t support:
- Workbooks with encrypted extracts
- Workbooks that use user-based functions or the now() and today() functions
- Federated data sources
- Data blending (partially supported: queries against the secondary data sources are not accelerated)
For more information about data acceleration, what is supported and not supported, and the resource implications of configuring it, see Data Acceleration in the Tableau Server documentation.
In this post we will try to speed up a workbook with a live connection to a slow data source, to verify the real performance benefits that data acceleration brings.
Prerequisites
Before using the Data Acceleration Python Client script you need the following:
- Tableau Server 2020.2+
- An account with Tableau Server Administrator or Site Administrator role.
- Optional (but recommended): increase the size of the Tableau Server external cache to 2 GB or larger.
Check your current Tableau Server external cache size setting:
tsm configuration get -k redis.max_memory_in_mb
Set the Tableau Server external cache size to 2 GB:
tsm configuration set -k redis.max_memory_in_mb -v 2048
tsm pending-changes apply
Download the source code from here (you need Python 3.5+) and use the setup.py script to verify and install the dependencies:
- python-dateutil
- PTable
- tableauserverclient
If you have already installed the Python Tableau Server Client (tableauserverclient), you can install the remaining dependencies manually using pip:
pip install python-dateutil
pip install PTable
How to change the authentication method
Unfortunately, the current version of the Tableau Server Data Acceleration Client (v0.1) supports only username and password authentication.
For security reasons, I recommend using a Personal Access Token (PAT) instead, as documented here. This avoids sharing credentials and putting a username and password in the Python code.
How can we solve this issue?
The Tableau REST API sign-in method has a very similar request body for a username and password and for a Personal Access Token, so we can easily change the authentication method in the accelerate_workbooks.py script by replacing a single line of code. Search for the sign_in_to_server method and change the Python code as follows:
#credentials_element = ET.SubElement(xml_request, 'credentials', name=username, password=password)
credentials_element = ET.SubElement(xml_request, 'credentials', personalAccessTokenName=username, personalAccessTokenSecret=password)
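To make the one-line change concrete, here is a minimal standalone sketch of the two sign-in request bodies, built with xml.etree.ElementTree. The helper name build_signin_request, its parameters, and the placeholder values are hypothetical and are not part of accelerate_workbooks.py; the element and attribute names are the ones used by the Tableau REST API sign-in request.

```python
import xml.etree.ElementTree as ET


def build_signin_request(name, secret, site_content_url, use_pat=True):
    """Return the XML body (bytes) of a Tableau REST API sign-in request.

    Hypothetical helper for illustration; not taken from accelerate_workbooks.py.
    """
    xml_request = ET.Element('tsRequest')
    if use_pat:
        # Personal Access Token: token name and token secret.
        credentials = ET.SubElement(
            xml_request, 'credentials',
            personalAccessTokenName=name,
            personalAccessTokenSecret=secret)
    else:
        # Classic sign-in: username and password.
        credentials = ET.SubElement(
            xml_request, 'credentials', name=name, password=secret)
    ET.SubElement(credentials, 'site', contentUrl=site_content_url)
    return ET.tostring(xml_request)


# Placeholder values only; never hard-code real secrets.
pat_body = build_signin_request('ACCELERATE', 'TOKEN_SECRET', 'DemoSite')
pwd_body = build_signin_request('admin', 'PASSWORD', 'DemoSite', use_pat=False)
print(pat_body.decode())
print(pwd_body.decode())
```

The two bodies differ only in the attribute names on the credentials element, which is why swapping that single line in the script is enough.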
Now, if we create a Personal Access Token with Name = ACCELERATE and try to sign in using this token name and secret instead of a username and password
python accelerate_workbooks.py --server "https://myserver" --site "DemoSite" --username "ACCELERATE" --password "vWsr0S75QVSSQ8LDHdxiTA==:YCP70XthyAPjnpuPzm3AmtNDLkQW5yPr"
path to ssl certificate (hit enter to ignore):
Signed in to https://myserver successfully
… it works (hit enter to ignore the ssl certificate)!
To log out, we can use this command line:
python accelerate_workbooks.py --logout
Accelerate a workbook with a live data source connection
In this demo we try to accelerate a Tableau workbook with a live connection to a PostgreSQL database in the cloud (Heroku Application Cloud).
Using the admin view Stats for Load Times, we see the dashboard has a load time of about 10 seconds:
From now on we will check the Tableau Server Data Acceleration status using this command line
python accelerate_workbooks.py --status
Workbook Acceleration is enabled for the following workbooks: None
Scheduled Tasks for Workbook Acceleration: None
Currently no workbook is accelerated and no task for acceleration is scheduled.
We enable the first workbook for acceleration with this command
python accelerate_workbooks.py --enable "Data Acceleration/Cloud Heroku LIVE" --accelerate-now
Workbooks Enabled
+-------------------------------------+
| Project/Workbook                    |
+-------------------------------------+
| Data Acceleration/Cloud Heroku LIVE |
+-------------------------------------+
where
- Data Acceleration/Cloud Heroku LIVE is the path of the workbook to speed up (Project/Workbook name)
- accelerate-now is an option that submits a backgrounder pre-computation job during enablement (on demand)
Checking the status again, we see that acceleration is enabled for this workbook and that its background acceleration job has the status inProgress
python accelerate_workbooks.py --status
Workbook Acceleration is enabled for the following workbooks
+----------+-------------------------------------+------------+-----
| Site     | Project/Workbook                    | Status     |
+----------+-------------------------------------+------------+-----
| DemoSite | Data Acceleration/Cloud Heroku LIVE | inProgress |
+----------+-------------------------------------+------------+-----
We can monitor the background job execution using both Tableau Server > Jobs
and the status command line
python accelerate_workbooks.py --status
Workbook Acceleration is enabled for the following workbooks
+----------+-------------------------------------+-------------+---------------------------+--------------------------+
| Site     | Project/Workbook                    | Status      | Last Updated              | Task Running Time (Secs) |
+----------+-------------------------------------+-------------+---------------------------+--------------------------+
| DemoSite | Data Acceleration/Cloud Heroku LIVE | accelerated | 2021-02-09 19:06:05+01:00 | 25.0                     |
+----------+-------------------------------------+-------------+---------------------------+--------------------------+
When the task is complete and the workbook has been accelerated, we test the acceleration benefits by opening the workbook multiple times and using the Stats for Load Times view. Now the average load time has significantly decreased to 0.2 seconds (compared to the previous 9.8 secs).
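As a quick worked check of the improvement, the two load times reported above can be turned into a speedup factor and a percentage reduction. The variable names are only illustrative.

```python
# Average load times read from the Stats for Load Times view, in seconds.
load_time_before = 9.8
load_time_after = 0.2

# How many times faster the accelerated workbook loads.
speedup = load_time_before / load_time_after
# Share of the original load time that was removed.
reduction_pct = (load_time_before - load_time_after) / load_time_before * 100

print("Speedup: about {:.0f}x".format(speedup))
print("Load time reduction: about {:.0f}%".format(reduction_pct))
```

For this workbook that is a speedup of about 49x, or roughly 98% less waiting per load.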
Why schedule acceleration?
Using again the status command, we can view the list of scheduled tasks for workbook acceleration:
Scheduled Tasks for Workbook Acceleration
+-------------------------------------+----------+-------------+
| Project/Workbook                    | Schedule | Next Run At |
+-------------------------------------+----------+-------------+
| Data Acceleration/Cloud Heroku LIVE | *        |             |
+-------------------------------------+----------+-------------+
*The Workbook Acceleration views for these workbooks will be updated when they are published, or when their extract is refreshed.
Keep in mind that Tableau Data Acceleration will monitor the relevant Tableau events that could potentially change the workbook’s data, such as:
- Workbook publishing
- Extract refreshing (if the workbook has any)
- Web authoring
and pre-computation will be triggered after these events.
In this case we have a workbook with a live data connection, so none of these events will occur and, 720 minutes after the first acceleration, Tableau Server will clear the data cache (720 minutes is the default data cache lifetime). To avoid this, we need to create a Data Acceleration schedule and associate the workbook with it to keep its data cache up to date.
So let’s continue by creating an acceleration schedule called “Schedule 2hh” that runs every day of the week, every 2 hours from 06:00 to 20:00, using the command
python accelerate_workbooks.py --create-schedule "Schedule 2hh" --hourly-interval 2 --start-hour 6 --end-hour 20
Hourly schedule "Schedule 2hh" created with an interval of 2 hours.
Note that if we try to create an acceleration schedule with an interval longer than the Tableau Server data cache lifetime (720 minutes by default), we get a warning message, because data acceleration will not be useful:
python accelerate_workbooks.py --create-schedule "Schedule USELESS" --daily-interval
Daily schedule "Schedule USELESS" created to run at 00:00.
Warning: The recurrence interval of the given schedule is larger than VizQL server data refresh interval of 720 minutes.
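The reasoning behind this warning can be sketched in a few lines of Python. This is not the check that the script itself performs: the function names are hypothetical, and I am assuming that an hourly schedule runs at both its start hour and its end hour. The sketch compares the longest gap between two consecutive runs with the default 720-minute cache lifetime.

```python
CACHE_LIFETIME_MINUTES = 720  # default data cache lifetime mentioned above


def longest_gap_minutes(run_hours):
    """Longest gap, in minutes, between consecutive daily runs (wrapping overnight)."""
    hours = sorted(run_hours)
    gaps = [(later - earlier) * 60 for earlier, later in zip(hours, hours[1:])]
    # Gap from the last run of one day to the first run of the next day.
    gaps.append((24 - hours[-1] + hours[0]) * 60)
    return max(gaps)


def keeps_cache_fresh(run_hours, cache_lifetime=CACHE_LIFETIME_MINUTES):
    """True if no gap between runs exceeds the cache lifetime."""
    return longest_gap_minutes(run_hours) <= cache_lifetime


# "Schedule 2hh": every 2 hours from 06:00 to 20:00 (end hour assumed inclusive).
schedule_2hh = list(range(6, 21, 2))
# "Schedule USELESS": once a day at 00:00.
schedule_daily = [0]

for name, hours in [("Schedule 2hh", schedule_2hh),
                    ("Schedule USELESS", schedule_daily)]:
    print(name, longest_gap_minutes(hours), keeps_cache_fresh(hours))
```

Under these assumptions the two-hour schedule never leaves more than 600 minutes between runs (the overnight gap), while the daily schedule leaves 1440 minutes, twice the cache lifetime.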
We can view the data acceleration schedules using the command show-schedule
python accelerate_workbooks.py --show-schedule
Data Acceleration Schedule
+--------------+---------------------------+
| Name         | Next Run At               |
+--------------+---------------------------+
| Schedule 2hh | 2021-02-11 14:00:00+01:00 |
+--------------+---------------------------+
or directly on Tableau Server > Schedules (where there is an Edit Settings button, but it doesn’t work!)
As a last step we’ll use the command add-to-schedule to add our workbook to the acceleration schedule and check again the status to verify that everything runs correctly:
python accelerate_workbooks.py --add-to-schedule "Schedule 2hh" "Data Acceleration/Cloud Heroku LIVE"
Workbooks added to schedule
+-------------------------------------+--------------+
| Project/Workbook                    | Schedules    |
+-------------------------------------+--------------+
| Data Acceleration/Cloud Heroku LIVE | Schedule 2hh |
+-------------------------------------+--------------+
python accelerate_workbooks.py --status
Scheduled Tasks for Workbook Acceleration
+-------------------------------------+--------------+-------------
| Project/Workbook                    | Schedule     | Next Run At
+-------------------------------------+--------------+-------------
| Data Acceleration/Cloud Heroku LIVE | Schedule 2hh | 2021-03-31
+-------------------------------------+--------------+-------------
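If you want to repeat these steps for other workbooks, the commands used in this post can be assembled from Python. The sketch below only builds and prints the command lines; it uses just the flags shown above, and it assumes that accelerate_workbooks.py is in the current directory and that you have already signed in. The build_command helper is hypothetical.

```python
import shlex
import sys

SCRIPT = "accelerate_workbooks.py"  # assumed to be in the current directory


def build_command(*args):
    """Return the argv list for one accelerate_workbooks.py invocation."""
    return [sys.executable, SCRIPT] + list(args)


workbook = "Data Acceleration/Cloud Heroku LIVE"
schedule = "Schedule 2hh"

steps = [
    build_command("--enable", workbook, "--accelerate-now"),
    build_command("--create-schedule", schedule,
                  "--hourly-interval", "2", "--start-hour", "6", "--end-hour", "20"),
    build_command("--add-to-schedule", schedule, workbook),
    build_command("--status"),
]

for step in steps:
    # Print the command instead of running it. To execute it, pass `step`
    # to subprocess.run(step, check=True).
    print("python " + " ".join(shlex.quote(part) for part in step[1:]))
```

Passing the arguments as a list rather than one string avoids quoting problems with names that contain spaces, such as the workbook path.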
Using the Tableau Server > Jobs page we’ll be able to monitor the acceleration job execution every two hours:
Conclusion
The initial loading time of a web page is one of the factors that most influence the user experience, so watching the “spinning wheel” for many seconds when we try to view a dashboard on Tableau Server makes a very poor impression.
Tableau Server Data Acceleration makes workbooks load faster because it pre-computes the workbook’s data in a background process (fetching the data needed after connecting to the underlying data source), as long as the workbook uses no user-based filters or time functions such as now() and today().
But Data Acceleration is not a solution for dashboards that are not optimized, for example ones composed of dozens of worksheets. In this case, I suggest following the guidelines in this topic to improve the speed of your visualizations.
If you need further info just reach out to me on LinkedIn or Twitter.