Mining your Tableau logs with Apache Drill

Tamas Foldi
Aug 23

Tableau Server and Desktop log each and every action you perform. The log data is a gold mine for anyone eager to understand what is happening under the hood and why. However, there is no easy way to query multiple log files and filter for only the relevant information. Tools like LogShark require a Windows installation and extra infrastructure, and they are not even real-time. Command-line magic to parse the logs (they are in JSON format) with tools like jq isn't really convenient either, as related entries are spread across multiple log files. So how are professionals doing this? Let me explain.

Tableau Log file format

Exciting things are stored in Tableau's JSON-based logging format, called native logs. Desktop, Server, and Prep, along with their processes dealing with data or visualizations, all use this format. A typical log line looks like:

{"ts":"2019-08-22T16:44:57.924","pid":60222,"tid":"4a1638","sev":"info","req":"-","sess":"-","site":"-","user":"-","k":"open-log","v":{"path":"/Users/tfoldi/Documents/My Tableau Repository/Logs/log.txt"}}

With the tool jq we can quickly visualize this structure in a more human-readable way:

[tfoldi@kompi] % head ~/Documents/My\ Tableau\ Repository/Logs/log.txt | jq                                                                                                                
{
  "ts": "2019-08-22T16:44:57.924",
  "pid": 60222,
  "tid": "4a1638",
  "sev": "info",
  "req": "-",
  "sess": "-",
  "site": "-",
  "user": "-",
  "k": "open-log",
  "v": {
    "path": "/Users/tfoldi/Documents/My Tableau Repository/Logs/log.txt"
  }
}
{
  "ts": "2019-08-22T16:44:57.924",
  "pid": 60222,
  "tid": "4a1638",
  "sev": "info",
  "req": "-",
  "sess": "-",
  "site": "-",
  "user": "-",
  "k": "previous-log",
  "v": {
    "path": "/Users/tfoldi/Documents/My Tableau Repository/Logs/log_bk.txt"
  }
}

As we can see, all log entries are JSON objects separated by new lines. All entries have ts (timestamp), pid (process id), tid (thread id inside the process), sev (severity), req (request identifier), sess (Server VizQL session or Desktop session), site (site on Server), user (user on Server), plus the log key (k) and value (v). In some circumstances we also have ctx for contextual information (like who called a specific service) and a for "ART logging", which records CPU and memory counters. These last two deserve a separate post, so here I'll focus on the basic JSON format.

For us, keys and values hold the information we need: what type of operation (key) produced what log message (value). The log message itself is also a nested JSON object; different keys provide different value objects in various structures.

Both Tableau Server and Desktop store information in multiple files, which adds complexity. Multiple Desktop instances write to multiple log files, like log.txt and log_1.txt. Hyper logs its operations to its own hyper logs, and yes, again to multiple files. Complexity increases further when we want to find the hyper log entries belonging to a specific Desktop or Server session. We need a way to switch between multiple logs quickly, without first ingesting them into an additional system like Splunk or Logstash.
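For the record, this is roughly what the jq approach from the intro looks like: it handles one key across the files in one directory, but it cannot comfortably follow a session from one log type to another. A sketch only; end-protocol.query is a log key we will meet again later:

# filter a single log key across all Desktop log files
cd ~/Documents/My\ Tableau\ Repository/Logs
cat log*.txt | jq -c 'select(.k == "end-protocol.query") | {ts, elapsed: .v.elapsed}'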

Apache Drill — SQL for everything

Drill is a "Schema-free SQL Query Engine for Hadoop, NoSQL, and Cloud Storage": an engine that lets us run SQL queries against practically everything, loosely defined JSON structures in particular. Installation is fairly easy on every operating system, as it is written in pure Java. Let's see how we can use it with Tableau Desktop and Server.

Apache Drill on OSX

The installation is straightforward; we just need Homebrew to install Drill:

$ brew install apache-drill

Then we start the embedded (standalone) version:

[tfoldi@kompi]% drill-embedded                                                                                                                                       
Apache Drill 1.16.0
"A Drill is a terrible thing to waste."
apache drill>

First, we need to tell Drill where the log files are. Drill ships with a configuration UI by default, so we can set up the storage configuration graphically:

  1. Navigate to http://localhost:8047/storage
  2. Click on DFS’s “UPDATE” button
  3. Add the following section to the configuration JSON object:
"tableau": {
"location": "/Users/tfoldi/Documents/My Tableau Repository/Logs",
"writable": false,
"defaultInputFormat": "json",
"allowAccessOutsideWorkspace": false
},
This is how it should look after adding Tableau Desktop's Logs directory as the "tableau" workspace.

We should also set two system options at http://localhost:8047/options:

exec.enable_union_type=true
store.json.all_text_mode=true
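Both options can also be set from the Drill shell with ALTER SYSTEM, which is handy for scripted setups:

ALTER SYSTEM SET `exec.enable_union_type` = true;
ALTER SYSTEM SET `store.json.all_text_mode` = true;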

Enabling the union type allows Drill to change the data type of the v column dynamically, while all-text mode avoids number conversions and speeds things up. Finally, let's try it:

Executing our first query on log.txt flawlessly.
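For reference, the first query was along these lines (a minimal sketch; output omitted):

apache drill> use dfs.tableau;
apache drill (dfs.tableau)> select ts, sev, k, v from `log.txt` limit 5;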

We can also reach inside the v JSON object and query its attributes directly. Selecting records from multiple files is easy, too. Here, we select the ten longest-running queries logged by tabprotocol:

apache drill (dfs.tableau)> select ts,k,v['elapsed'],v['protocol-class'] from `log*.txt` where k ='end-protocol.query'  order by 3 desc limit 10;
+-------------------------+--------------------+--------+----------+
| ts                      | k                  | EXPR$2 | EXPR$3   |
+-------------------------+--------------------+--------+----------+
| 2019-07-31T06:24:19.975 | end-protocol.query | 15.713 | postgres |
| 2019-07-31T06:24:20.306 | end-protocol.query | 14.78  | postgres |
| 2019-07-31T06:24:20.275 | end-protocol.query | 14.769 | postgres |
| 2019-07-31T06:24:20.252 | end-protocol.query | 14.718 | postgres |
| 2019-07-31T06:24:20.241 | end-protocol.query | 14.715 | postgres |
| 2019-07-31T06:24:20.219 | end-protocol.query | 14.692 | postgres |
| 2019-07-31T06:24:20.241 | end-protocol.query | 14.626 | postgres |
| 2019-05-21T14:36:22.406 | end-protocol.query | 14.124 | bigquery |
| 2019-07-31T06:25:30.299 | end-protocol.query | 7.538  | postgres |
| 2019-07-31T06:25:30.541 | end-protocol.query | 7.272  | postgres |
+-------------------------+--------------------+--------+----------+
10 rows selected (0.788 seconds)

Now comes the fun part: I want to see all tabproto log entries related to one of my BigQuery sessions.

Drilling from the vizql logs to tabproto by process id and protocol id
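The screenshot is not reproduced here, but the idea translates to something like the sketch below. The tabprotosrv*.txt file name (Desktop's protocol server log) and the presence of v['protocol-class'] in its entries are my assumptions, not from the original; the time window comes from the BigQuery row in the previous result:

-- sketch only: file glob and v attribute names are assumptions
select ts, pid, k, v
from `tabprotosrv*.txt`
where v['protocol-class'] = 'bigquery'
  and ts between '2019-05-21T14:36:00' and '2019-05-21T14:37:00'
order by ts;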

Without ingesting anything, defining a schema, or doing any other hard work, we were able to drill from one log file to another using the protocol id and the Desktop process id. Anyone who has spent time on Tableau log forensics will appreciate this method.

Tableau Server and Drill

Things get serious when we perform the same exercise on Tableau Server: more processes, more directories, and obviously more logs.

First, download the Drill package from https://drill.apache.org/download/ and make sure you have a JDK installed on the server.

This time, instead of using the UI to add a storage configuration, we will register our own plugin as tableau in conf/storage-plugins-override.conf:

"storage": {
tableau: {
"type": "file",
"connection": "file:///",
"workspaces": {
"logs": {
"location": "/var/opt/tableau/tableau_server/data/tabsvc/logs/",
"writable": false,
"defaultInputFormat": "json",
"allowAccessOutsideWorkspace": true
}
}
"enabled": true
}

Don't forget to set up the union type in conf/drill-override.conf as well (all-text mode can be set with ALTER SYSTEM, as shown earlier):

drill.exec: {
  cluster-id: "drillbits1",
  zk.connect: "localhost:2181",
  enable_union_type: "true"
}

We are ready with the configuration. After starting the Drill shell, we can immediately switch to this new workspace:

apache drill> use tableau.logs;
+------+------------------------------------------+
| ok   | summary                                  |
+------+------------------------------------------+
| true | Default schema changed to [tableau.logs] |
+------+------------------------------------------+
1 row selected (1.022 seconds)

How about reading some vizql logs? Works like a charm.

We can query every Tableau Server log — no ingestion required, and everything is real-time
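As a quick smoke test, something along these lines works (output omitted):

apache drill (tableau.logs)> select ts, sev, k from `vizqlserver/nativeapi_vizqlserver*` limit 5;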

Let's check something: which dashboard-open event required the most CPU resources?

First, we need to find the k value for the end of bootstrap (bootstrap is the VizQL command for opening a visualization):

apache drill (tableau.logs)> select  k,count(*) from `vizqlserver/nativeapi_vizqlserver*` where k like 'end%boot%' group by k;
+-------------------------------------------------+--------+
| k                                               | EXPR$1 |
+-------------------------------------------------+--------+
| end-bootstrap-session-action.bootstrap-session  | 102    |
| end-vizql-worker.get-or-update-cached-bootstrap | 109    |
| end-bootstrap-response-cache.load               | 70     |
| end-vql-session.bootstrap                       | 1      |
| end-vizql-worker.bootstrap-session              | 7      |
+-------------------------------------------------+--------+
5 rows selected (3.948 seconds)

Please note how easy it is to aggregate values inside the logs — again, without loading them into any kind of database. Now let's select the top CPU usage across sessions:

apache drill (tableau.logs)> select  sess,a['res']['ucpu']['i'] from `vizqlserver/nativeapi_vizqlserver*` where k = 'end-bootstrap-session-action.bootstrap-session' order by 2 desc limit 5;

Well, it seems my top workbooks require about 1,300 ms of user CPU time. But what are these sessions? You can simply filter for the VizQL session id (sess) in the logs and see exactly what happened and which rendering steps were the most CPU-intensive:

apache drill (tableau.logs)> select k,a['res']['ucpu']['i'],a['res']['ucpu']['e'] from `vizqlserver/nativeapi_vizqlserver*` where sess  ='62E262A8CD1547AB9B720FB376E0F101-1:0' order by 3 desc nulls last limit 10;
+-----------------------------------------+--------+--------+
| k                                       | EXPR$1 | EXPR$2 |
+-----------------------------------------+--------+--------+
| end-pres-model.build-application        | 361    | 356    |
| end-dashboard-image.compute-zone-layout | 904    | 83     |
| end-protocol.create-protocol            | 75     | 75     |
| end-protocol.create-protocol            | 74     | 74     |
| end-protocol.create-protocol            | 71     | 71     |
| end-protocol.create-protocol            | 71     | 71     |
| end-protocol.create-protocol            | 69     | 69     |
| end-protocol.create-protocol            | 66     | 66     |
| end-protocol.create-protocol            | 66     | 66     |
| end-protocol.create-protocol            | 65     | 65     |
+-----------------------------------------+--------+--------+
10 rows selected (4.778 seconds)

From here we can draw conclusions, for instance that this dashboard issues tons of data queries and has many zones to render.

Tableau Server Cluster with Distributed Drill Service

Well, this deserves its own post, but here is the short version: Drill has a distributed service called Drillbit that can run queries across multiple nodes, slurping log files from all the workers. It requires nothing other than Drill and ZooKeeper. Usually, I use Tableau's own ZooKeeper as the coordination service for Drill.
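As a minimal sketch, each node's conf/drill-override.conf only needs a shared cluster id and the ZooKeeper connection string. The address below is a placeholder, since the port of Tableau's internal ZooKeeper depends on your deployment:

drill.exec: {
  cluster-id: "tableau-drillbits",
  # placeholder: point this at Tableau's own ZooKeeper ensemble
  zk.connect: "<tableau-node>:<zk-client-port>"
}

Then start the service on every node with bin/drillbit.sh start and query the same tableau.logs workspace as before.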

Summary

Using SQL to traverse between logs and parse nested JSON is simply cool. With tools like Apache Drill or Dremio and only a small overhead, administrators and developers can easily dig into the Tableau logs — on their Desktops and on their Servers.


If you have any questions or comments, or are interested in Tableau Server forensics or performance optimization, just drop me a line.
