Tableau Version Control is here to make your life easier

If you are a Tableau Server admin, or you work with one, chances are that desperate cries for help after some accidental action are quite common. Currently, retrieving just one workbook, or a select few items, from a Tableau backup is time-consuming and counterproductive. Imagine having to haul an entire box out of storage and into your room, a move that takes five hours, just to get two items. Pretty rough, huh? That is what Tableau admins currently go through to restore workbooks from a backup, and there is a little more hassle to it than that. Let me explain:

Currently, Tableau uses an all-or-nothing backup system for your workbooks. That means workbooks cannot be backed up individually, only in bulk, and a bulk backup can only be restored in bulk. In practice, you have to restore the entire backup onto a test server and then retrieve the workbooks you need. Depending on the size of your backup, this process can take a staggering 12–15 hours to complete. This gruesome process is required surprisingly often, because people accidentally delete workbooks or need an older version for various reasons. Tableau does have its own form of version control, but it is not the best option: if a workbook is accidentally deleted, all of its previous versions are deleted along with the newest one.

Luckily for you Tableau admins, we have a solution that makes your life easier and keeps everyone else happy by getting that precious workbook back in a fraction of the time it normally takes. We make this possible by running version control on all the workbooks on the production server. The tool scans all the workbooks and only backs up the ones that have been modified since the previous backup, so it doesn’t waste time on anything that hasn’t changed. Best of all, it keeps all the old versions as well! More importantly, you can restore individual workbooks without going through the old all-or-nothing process. This saves a tremendous amount of time for all parties involved and adds the convenience of being able to access old workbooks at any time. Lastly, this is an open-source tool, and you can sprinkle this amazingness into your life right now if you’d like! If you have any questions so far, reach out to us at contact@starschema.com.

HOW IT WORKS:

[Figure: Without our Version Control]
[Figure: With our Version Control]

Let’s take a deeper dive into the technical side of things so you can install it on your server. Here is all you need to get started:

  • It’s an Open Source tool!
  • It requires Python 3.7 or later
  • You need the git command-line client installed and configured
  • The script uses the Tableau Server Client (TSC) library for Python to interact with the Tableau Server REST API

Ready to get started? Let me walk you through the steps:

  • Clone the git repository from here
  • Once you have cloned the files, open README.md, which walks you through the configuration
  • After the configuration is done, all you have to do is run the Python script tableauBackup.py with the required arguments
  • You can schedule the script to run periodically with the Windows Task Scheduler or crontab(1)

The Tableau Server Client library makes this process possible in fewer than 100 lines of code. Using TSC, we can log in to Tableau Server, iterate through all the sites, iterate through each workbook and data source within each site, download the .xml of each workbook and data source, and back them up to git.


Want me to pour some more technical juice to satisfy your hunger for how things work? Good. That means you are among the 1% of people who will read past this point, and I salute you.

Here is how the script works:

  • First, we log in to Tableau Server using:
import tableauserverclient as TSC

tableau_auth = TSC.TableauAuth(cfg['tableauServer']['user'], tableauPassword)
server = TSC.Server(cfg['tableauServer']['url'], use_server_version=True)
server.auth.sign_in(tableau_auth)

tableau_auth: holds the user name and password. cfg supplies the user name from the config.ini file that you set up while following README.md.

server: points at the URL of the Tableau Server to log in to.

use_server_version: tells the library to use the latest version of the REST API supported by the Tableau Server instance you are connecting to.
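
To make the cfg lookups above concrete, here is a minimal sketch of how a config.ini could be read into something that supports cfg['tableauServer']['user']. It assumes the file is a standard INI file parsed with Python's configparser; the sample values and the load_config helper are hypothetical, and only the section and key names come from the snippets above. The real file layout is described in README.md.

```python
import configparser

# Hypothetical config.ini contents; only the section and key names
# ('tableauServer', 'user', 'url') come from the snippets above.
SAMPLE_CONFIG = """
[tableauServer]
user = backup_admin
url = https://tableau.example.com
"""


def load_config(text):
    """Parse INI text and return an object that supports cfg[section][key]."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    return cfg


cfg = load_config(SAMPLE_CONFIG)
print(cfg['tableauServer']['user'])
print(cfg['tableauServer']['url'])
```

To read from a file on disk instead, cfg.read('config.ini') takes the place of read_string.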

  • The password is stored securely using keyring:
if not keyring.get_password(service_id, cfg['tableauServer']['user']):
    keyring.set_password(service_id, cfg['tableauServer']['user'], getpass.getpass('Enter Tableau Server Password For {}: '.format(cfg['tableauServer']['user'])))

keyring: a module that saves you from storing passwords directly in the source code. It provides a wrapper around your system’s password store, for example the macOS Keychain or the Windows Credential Locker, so you never have to worry about forgetting to scrub secrets or leaking passwords in your source code.

if: the statement looks in the keyring for a password associated with the tableauServer user. If none is found, you are asked to enter a password for that user on the command line. You only need to type the password once, because it is saved in the keyring; every later run of the script retrieves it automatically.
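
The "ask once, then reuse" behaviour can be sketched as a small function. To keep the example runnable without touching a real password store, it takes any object that offers get_password and set_password (the keyring module itself fits that shape) and uses a hypothetical in-memory stand-in. InMemoryStore, get_or_prompt_password, fake_prompt and the 'tableau-backup' service id are all illustrative names, not part of the script.

```python
import getpass


class InMemoryStore:
    """Hypothetical stand-in exposing keyring's get_password/set_password interface."""

    def __init__(self):
        self._secrets = {}

    def get_password(self, service, user):
        return self._secrets.get((service, user))

    def set_password(self, service, user, password):
        self._secrets[(service, user)] = password


def get_or_prompt_password(store, service_id, user, prompt=getpass.getpass):
    """Return the stored password, prompting once and saving it when none exists."""
    password = store.get_password(service_id, user)
    if not password:
        password = prompt('Enter Tableau Server Password For {}: '.format(user))
        store.set_password(service_id, user, password)
    return password


store = InMemoryStore()
prompts = []


def fake_prompt(message):
    """Record the prompt text and return a canned password instead of reading input."""
    prompts.append(message)
    return 's3cret'


first = get_or_prompt_password(store, 'tableau-backup', 'backup_admin', prompt=fake_prompt)
second = get_or_prompt_password(store, 'tableau-backup', 'backup_admin', prompt=fake_prompt)
print(first == second, len(prompts))
```

The second call never prompts, because the first call already saved the password in the store.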

  • Next, we iterate through each site and set up the sign-in credentials:
remove_punctuation_map = dict((ord(char), None) for char in '\\/*?:"<>|')
for site in TSC.Pager(server.sites):
    tableau_auth1 = TSC.TableauAuth(cfg['tableauServer']['user'], tableauPassword, site_id=site.content_url)
    server.auth.sign_in(tableau_auth1)
    sPath = os.path.join(oPath, site.name.translate(remove_punctuation_map))
    if not os.path.isdir(sPath):
        os.makedirs(sPath)

After the initial sign-in to Tableau Server shown above, this for loop iterates through each site on the server.

TSC.Pager: retrieves all the resources of a given type. Because the number of resources on Tableau Server can be very large, the server returns only the first 100 by default; the pager keeps requesting pages until everything has been returned.

tableau_auth1: sets up the authentication for signing in to that specific site on the Tableau Server.

sPath: the directory for the site; remove_punctuation_map strips characters that are illegal in file names.

if: the statement checks whether a folder/directory for that site already exists and, if not, creates one with the site name.
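
Here is a small, self-contained sketch of the "clean the name, then make sure the folder exists" step, in case you want to try it outside the script. ensure_directory and the sample site name are hypothetical. Using os.makedirs with exist_ok=True is a swapped-in shortcut for the explicit isdir check above, not necessarily what the script does.

```python
import os
import tempfile

# Same character set as the snippet above: characters Windows forbids in file names.
REMOVE_PUNCTUATION_MAP = {ord(char): None for char in '\\/*?:"<>|'}


def ensure_directory(base_path, raw_name):
    """Strip illegal characters from raw_name and create the folder if it is missing."""
    safe_name = raw_name.translate(REMOVE_PUNCTUATION_MAP)
    path = os.path.join(base_path, safe_name)
    os.makedirs(path, exist_ok=True)  # no error if the folder already exists
    return path


base = tempfile.mkdtemp()
site_path = ensure_directory(base, 'Sales: EMEA/APAC?')
print(os.path.basename(site_path))
```

Calling ensure_directory again with the same name simply returns the same path, so the step is safe to repeat on every run.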

  • Then the script iterates through all the workbooks within that site:
req_option = TSC.RequestOptions()
req_option.filter.add(TSC.Filter(TSC.RequestOptions.Field.UpdatedAt, TSC.RequestOptions.Operator.GreaterThanOrEqual, downloadChangesSince))
for workbook in TSC.Pager(server.workbooks, req_option):
    # Despite its name, workbookName holds the sanitized project name
    workbookName = workbook.project_name.translate(remove_punctuation_map)
    xPath = os.path.join(sPath, workbookName)
    if not os.path.isdir(xPath):
        os.makedirs(xPath)
    file_path = server.workbooks.download(workbook.id, filepath=xPath, no_extract=True)
    extractWorkbook(file_path, site.name, workbookName, workbook.project_id, workbook.id, os.path.basename(file_path), xPath)

First, we set up filtering with req_option, which lets us filter on the time each workbook was last updated.

TSC.RequestOptions.Field.UpdatedAt: selects the workbook’s last-updated time as the field to filter on. TSC.RequestOptions.Operator.GreaterThanOrEqual compares that UpdatedAt time with downloadChangesSince, so workbooks whose UpdatedAt time is earlier than downloadChangesSince are skipped.

We define downloadChangesSince like so:

currentTime = datetime.now()
downloadChangesSince = (datetime.fromtimestamp(currentTime.timestamp() - args.incremental * 3600).replace(microsecond=0)).isoformat() + 'Z'
if args.full_load:
    downloadChangesSince = '2000-01-01T00:00:00Z'

The Tableau Server REST API expects times in ISO 8601 format. args.incremental (a number of hours) and args.full_load are specified on the command line when you run the script.
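
One thing to keep in mind: the trailing Z in an ISO 8601 timestamp means UTC, while datetime.now() returns local time. The sketch below shows one way to compute the same cutoff from an explicitly UTC clock. It is an alternative under that assumption, not the script's own code, and changes_since and FULL_LOAD_CUTOFF are hypothetical names.

```python
from datetime import datetime, timedelta, timezone

# Same far-in-the-past cutoff that the snippet above uses for a full load.
FULL_LOAD_CUTOFF = '2000-01-01T00:00:00Z'


def changes_since(now_utc, incremental_hours, full_load=False):
    """Return the ISO 8601 UTC cutoff to use in the UpdatedAt filter."""
    if full_load:
        return FULL_LOAD_CUTOFF
    cutoff = (now_utc - timedelta(hours=incremental_hours)).replace(microsecond=0)
    return cutoff.strftime('%Y-%m-%dT%H:%M:%SZ')


now = datetime(2024, 3, 1, 12, 30, 45, 123456, tzinfo=timezone.utc)
print(changes_since(now, 24))                  # 2024-02-29T12:30:45Z
print(changes_since(now, 24, full_load=True))  # 2000-01-01T00:00:00Z
```

In a real run you would pass datetime.now(timezone.utc) as the first argument.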

Then we iterate through all the workbooks on the Tableau Server site that match the req_option filter, using TSC.Pager.
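
If you are curious what a pager does conceptually, here is a rough sketch of the idea: keep asking for the next page until every item has been handed back. This is not TSC's actual implementation. iterate_all, fake_fetch and the (items, total) return shape are assumptions made purely for illustration.

```python
def iterate_all(fetch_page, page_size=100):
    """Yield every item by requesting successive pages until all are returned.

    fetch_page(page_number, page_size) is a hypothetical callable that returns
    (items_on_page, total_available).
    """
    page_number = 1
    yielded = 0
    while True:
        items, total_available = fetch_page(page_number, page_size)
        for item in items:
            yield item
        yielded += len(items)
        if not items or yielded >= total_available:
            break
        page_number += 1


# A fake "server" holding 250 workbooks, served 100 at a time.
WORKBOOKS = ['workbook-{}'.format(i) for i in range(250)]
requests_made = []


def fake_fetch(page_number, page_size):
    """Return one slice of WORKBOOKS plus the overall count."""
    requests_made.append(page_number)
    start = (page_number - 1) * page_size
    return WORKBOOKS[start:start + page_size], len(WORKBOOKS)


all_workbooks = list(iterate_all(fake_fetch))
print(len(all_workbooks), requests_made)  # 250 [1, 2, 3]
```

Without paging, only the first 100 of the 250 workbooks would have been seen.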

Next, we check whether a folder/directory exists for the project that the workbook belongs to: we strip all illegal characters from the project name and look for a folder with that name. If there is none, we create it.

Then we download each workbook with server.workbooks.download, passing workbook.id (the identifier of that workbook) and filepath (where to save the download, which is the workbook’s project folder).

no_extract: tells the download to fetch only the workbook’s .xml, without the extracts (newer releases of the library offer include_extract=False for the same purpose).

Finally, we extract the workbook and rename the file so that it includes the workbook name as well as its workbook id.
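
The body of extractWorkbook is not shown here, so the following is only a guess at the general idea, written as a hypothetical extract_and_rename helper. It assumes that a packaged .twbx download is a zip archive with the workbook's .twb XML inside, and that a plain .twb download only needs renaming. The '<name>_<id>.twb' naming pattern is likewise an assumption.

```python
import os
import tempfile
import zipfile


def extract_and_rename(file_path, workbook_id, target_dir):
    """Place the workbook XML in target_dir as '<name>_<id>.twb' and return its path.

    Hypothetical helper: a packaged .twbx is a zip archive, so the .twb inside
    is pulled out; a plain .twb is simply renamed.
    """
    base_name = os.path.splitext(os.path.basename(file_path))[0]
    destination = os.path.join(target_dir, '{}_{}.twb'.format(base_name, workbook_id))
    if zipfile.is_zipfile(file_path):
        with zipfile.ZipFile(file_path) as archive:
            twb_members = [m for m in archive.namelist() if m.endswith('.twb')]
            if not twb_members:
                raise ValueError('no .twb file found inside {}'.format(file_path))
            with archive.open(twb_members[0]) as source, open(destination, 'wb') as out:
                out.write(source.read())
        os.remove(file_path)  # keep only the XML under version control
    else:
        os.replace(file_path, destination)
    return destination


# Build a tiny fake packaged workbook to exercise the helper.
work_dir = tempfile.mkdtemp()
packaged = os.path.join(work_dir, 'Sales.twbx')
with zipfile.ZipFile(packaged, 'w') as archive:
    archive.writestr('Sales.twb', '<workbook/>')

result = extract_and_rename(packaged, 'abc123', work_dir)
print(os.path.basename(result))  # Sales_abc123.twb
```

Putting the id in the file name keeps two workbooks with the same name in one project from overwriting each other.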

  • We repeat the same steps to download the data sources.
  • Then we back up to git if any changes/updates to workbooks or data sources are detected.
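
The "only commit when something changed" step could look roughly like the sketch below, which shells out to the git command line. commit_if_changed, the commit message and the backup path are hypothetical, and the script itself may do this differently. The dry run at the bottom swaps in a fake runner so that the example works without a real repository.

```python
import subprocess


def commit_if_changed(repo_dir, message, run=subprocess.run):
    """Stage and commit everything in repo_dir, but only if git reports changes."""
    status = run(['git', 'status', '--porcelain'], cwd=repo_dir,
                 capture_output=True, text=True, check=True)
    if not status.stdout.strip():
        return False  # working tree is clean, nothing to back up
    run(['git', 'add', '--all'], cwd=repo_dir, check=True)
    run(['git', 'commit', '-m', message], cwd=repo_dir, check=True)
    return True


class FakeResult:
    """Minimal stand-in for the object subprocess.run returns."""

    def __init__(self, stdout=''):
        self.stdout = stdout


calls = []


def fake_run(command, **kwargs):
    """Record the git subcommand and pretend one workbook file was modified."""
    calls.append(command[1])
    return FakeResult(' M Default/Sales_abc123.twb\n' if command[1] == 'status' else '')


committed = commit_if_changed('/path/to/backup', 'Tableau backup', run=fake_run)
print(committed, calls)  # True ['status', 'add', 'commit']
```

Leave out the run argument to execute the real git commands against your backup folder.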

With all the workbooks and data sources backed up in .xml format, you can now view every file and its versions in your git repository.

Thank you for coming along on this journey with me! If you have any questions, feel free to reach out to us at contact@starschema.com.


by Chimed Altandush and Alan Tang.