Creating unusual graphs in Tableau with the Graphviz flowchart algorithm

Sometimes it’s the tiny quirks of a tool’s algorithm that make a data visualization truly unique

Tanya Lomskaya
26 min read · Jul 12, 2024

Ever since I managed to build a beautiful directed network graph in Tableau using the Graphviz package, I’ve wanted to dig deeper into this tool and see what other visualizations it can produce. Let me show you what came out of my exploration.

If you’ve used network building tools like NetworkX or Gephi, the name Graphviz may be familiar to you as they both offer this package’s layouts. This visualization tool itself is quite old; it was developed by AT&T back in the 1990s. Nowadays, it’s open source and licensed under the Eclipse Public License.

Graphviz builds two types of graphs: flowcharts and networks. And my speculation is that, due to its focus on flowcharts, this tool exhibits unique qualities difficult to find in other similar apps:

  • The links start and end strictly at the node edges;
  • There is a variety of link designs;
  • The link coordinates produced by the app are sufficient to accurately reproduce their design outside of Graphviz.

The pictures show different types of graphs that I managed to draw in Tableau. You can also study them live on the Tableau Public dashboard.

Directed and undirected networks with different types of edges / A flowchart / Node chains built with regard to node diameters / Matrices with various arbitrary paths superimposed on them.

Some of these graph types are Graphviz-specific; others are not, but they got me interested in the original creation mechanism. In this post, I will discuss them all. Each graph has its own creation features within Graphviz, but the algorithm for extracting data and transferring it to Tableau is the same for all of them.

Therefore, I’ll dive into the process of creating one of them — a directed network graph — from Graphviz to a Tableau dashboard. After that, I’ll go into detail about the specifics of building the rest in Graphviz.

This is the graph we are going to reproduce:

We will draw a directed network graph with curved edges.

Tools

You can install the Graphviz tool set on your computer, or you can do without it. For example, I make do with just the online playgrounds and, in specific cases, use the program’s Python extension, PyGraphviz. In this tutorial, I will use two online playgrounds:

  • Graphviz.org Playground allows you to edit graphs in real time. Here, you can either create a graph from scratch or edit any graph from the website gallery. The Playground allows you to save the code page as a URL, which you can reopen at any time and continue editing.
  • Graphviz Online allows you to save the created graph in JSON format.

I will use Python to extract and process JSON data and Tableau to design the finished graph and display it on a dashboard.

Data

We will create a network of top 20 avocado trading countries in 2023, connecting each of them with its two largest avocado suppliers in the top 20. For this I prepared two tables: the node table and the edge table. The node table contains the country name, the ISO code that we’ll use as an ID, the amount of avocado trade in dollars in 2023, and the node size calculated on this amount’s basis.

Node data table (view on GitHub)

The size is designed to narrow the large difference in trade volumes between the leaders and countries from the tail of the top 20; otherwise, small players will not be visible on the chart. The size of the largest country is 60, and the size of small players does not fall below 0.15.

The edge table connects each top 20 country to its two largest suppliers from the top 20.

Edge data table (view on GitHub)

We will color the country nodes according to the UN region they belong to; edges will be colored according to their source node colors.

Preparing data for Graphviz: fixing the attribute mismatch

In Tableau, we can put our size parameter from the avocado table directly into the size field, and the program will set it as the area of the corresponding nodes.

Not so with Graphviz; the tool doesn’t have a size option at all. It calculates the node size based on the width and/or height we specify, that is, in the case of round nodes, based on their diameters.

Remembering that the sizes from the node table stand for nodes’ areas, we’ll extract the diameters from these sizes and load them into Graphviz as widths so that the tool will correctly calculate the node and edge coordinates. In Tableau, we will set the node sizes using the sizes from our original avocado table.

This is how we extract the diameter from the size indicator in Python:

import pandas as pd
import numpy as np
nodes = pd.read_csv('node_data.csv', index_col=0)

# diameter = sqrt ( size / pi ) * 2

nodes["diameter"] = [
np.sqrt(x / np.pi) * 2 for x in nodes["size"].tolist()
]

Now we have all the parameters for building a graph in Graphviz.
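
To make the conversion concrete, here is a minimal sketch that turns a size into a DOT node statement. The function names and the scale divisor are illustrative assumptions, not part of the original workflow. Assuming the USA is the node with size 60, a divisor of 4 happens to reproduce the 2.185 width that appears in the graph code later in this post.

```python
import math


def diameter_from_area(size):
    """Diameter of a circle whose area equals `size`."""
    return 2 * math.sqrt(size / math.pi)


def node_statement(node_id, label, size, scale=4):
    """Build one DOT node line.

    `scale` is an assumed divisor that keeps the widths small enough
    for edges and labels to stay visible in the playground.
    """
    width = round(diameter_from_area(size) / scale, 3)
    return f'{node_id} [width = {width}, label = "{label}"]'


print(node_statement("US", "USA", 60))
# US [width = 2.185, label = "USA"]
```

Because every diameter is divided by the same number, the relative node sizes stay intact.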

DOT language

Graphviz has its own graph description language called DOT. It’s simple and intuitive, and we will study its key concepts right here and now. Let’s open the Graphviz.org playground. In the left block, we will write our code, while the right one will display the graph.

Starting a new graph

We can build directed graphs (connected by arrows) and undirected ones (connected by simple lines). This is what the simplest directed graph looks like. We begin a new graph by the digraph keyword and describe its edges and nodes within curly braces. Copy this code to the left block of the playground:

digraph GraphName {
a -> b
}

Here’s what I got:

Click on the image to open the graph in the playground.

The syntax is identical for undirected graphs, except that we begin them with the graph keyword and use a double-hyphen -- instead of an arrow -> to show node relationships. If you want to see how the graph changes, replace the previous playground code with this new one:

graph GraphName {
a -- b
}

Designing the graph

We define the appearance of a graph by assigning attributes to the graph itself, its nodes, and its edges. To do that, place an attribute–value pair in square brackets [ ] after a stated element or class. Here I define the layout type, the node shape, and the link color for the whole graph:

digraph GraphName {
graph [layout = fdp]
node [shape = circle]
edge [color = "#FF0000"]

a -> b
b -> c
c -> a
}

…And here I assign individual width and color to a node a and a link c -> a, respectively:

digraph GraphName {
graph [layout = fdp]
node [shape = circle]
edge [color = "#FF0000"]

a -> b
b -> c
c -> a [color= "#0000FF"]

a [width=1]
}

This is what my graph looks like now:

Click on the image to open the graph in the playground.

Here are the links to complete attribute lists:

Graphviz attributes for Tableau graphs

A lot of Graphviz’s attributes will be useful if you want to create a ready-made graph in Graphviz and download it as an image. Here I will only list those we need to craft a graph for Tableau to do all the designs there.

Graph attributes

Layout

The layout determines how the nodes of the graph are arranged. The dot layout draws hierarchical graphs, aka flowcharts. The neato and fdp layouts are both ‘spring layouts’; the former is said to work better with a small number of nodes (up to a hundred), and the latter with large and clustered graphs. The twopi layout places nodes in concentric circles based on their distance from a root node. The circo layout arranges nodes in a circle. All available layouts are listed here.

Graphviz layouts. Explore the full list here.

Splines

The splines attribute defines how the edges look. With splines = false, they are straight line segments; this is also what layouts other than dot draw by default. When splines is set to true, the edges stay straight where they can but curve around any node that lies in their way. For all the edges to be curved, set splines to curved. And if you want the edges to be polylines aligned with the x- and y-axes, set the attribute to ortho.

In the picture, you can see the examples of all these edge types.

Graphviz edge types. Explore more here.

Node attributes

Fixedsize

With the default fixedsize = false, each node grows as needed to fit its label, so the width and height you specify act only as minimums. When fixedsize = true, the width and height values are enforced regardless of the label.

Width / Height

For regular shapes, it’s enough to define just a width or a height.

While working in a playground, if the edges look too thin and the labels too small, the node diameters are probably too large. Divide all of them by the same number to make all the graph’s elements visible.

Shape

The default node shape is an ellipse. All possible shapes are listed here.

Label

The default node label is the name we use to refer to the node in our code. Change it using this attribute. The label can be multi-line: line breaks are indicated by \n, \l or \r depending on the desired alignment (centered, left or right, respectively).

Edge attributes

Minlen (dot layout) / Len (neato and fdp layouts)

Changing the minimum or preferred edge length helps adjust the appearance of the graph.

Custom attributes

In addition to Graphviz attributes, we can specify as many of our own attributes in square brackets as we want. Their names just must not coincide with the names of standard Graphviz attributes. They will not affect the appearance of the graph, but will be included in the graph’s output data.

This knowledge is already enough for us to start building graphs.

Building a network graph in Graphviz

Let’s build a directed network graph with curved edges.

As you can see below, my network graph code is very simple. First, I set the general attributes: fdp layout (neato is also possible, but I liked the fdp result better), curved lines, round shape and label-independent size for nodes, and 50-inch length for edges. Then I list all edges and nodes with their individual widths (diameters) and labels:

digraph avocado_trade {
graph [
layout = fdp
splines = curved
]

node [
shape = circle
fixedsize = True
style = filled
color = "#00000055"
]

edge [
len = 50
]

MX -> CA
US -> CA
CL -> CN
MX -> CL
PE -> CL
CL -> GB
PE -> CN
PE -> CO
CO -> GB
FR -> IT
NL -> FR
ES -> FR
NL -> DE
PE -> DE
IL -> RU
ZA -> IL
IL -> GB
NL -> IT
MX -> JP
PE -> JP
KE -> MA
KE -> ZA
KE -> GB
MX -> GB
MX -> US
ES -> MA
PE -> NL
NL -> PL
NL -> RU
ZA -> NL
PE -> PL
PE -> ES
GB -> PE
PE -> US
ZA -> ES

US [width = 2.185, label = "USA"]
MX [width = 2.052, label = "Mexico"]
NL [width = 1.827, label = "Netherlands"]
PE [width = 1.48, label = "Peru"]
ES [width = 1.202, label = "Spain"]
FR [width = 1.008, label = "France"]
DE [width = 0.93, label = "Germany"]
CL [width = 0.712, label = "Chile"]
CA [width = 0.691, label = "Canada"]
GB [width = 0.674, label = "UK"]
IL [width = 0.622, label = "Israel"]
MA [width = 0.573, label = "Morocco"]
CO [width = 0.549, label = "Colombia"]
JP [width = 0.487, label = "Japan"]
ZA [width = 0.476, label = "South Africa"]
CN [width = 0.474, label = "China"]
IT [width = 0.468, label = "Italy"]
PL [width = 0.458, label = "Poland"]
KE [width = 0.457, label = "Kenya"]
RU [width = 0.43, label = "Russia"]
}

To see what the graph looks like at the moment, view it in the playground. I have a couple of custom attributes to add that we’ll use later in Tableau: each edge’s source country region and each node’s region and size. We will then find them in the graph’s JSON dictionary using the region and size keys, respectively.
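
As an illustration of how such custom attributes might be attached, the sketch below generates node and edge statements carrying region and size. The helper names, the attribute formatting, and the sample values are assumptions; the statements actually used for this graph are in the GitHub file linked below.

```python
def node_statement(node_id, label, width, region, size):
    """DOT node line with two custom attributes: region and size."""
    return (
        f'{node_id} [width = {width}, label = "{label}", '
        f'region = "{region}", size = {size}]'
    )


def edge_statement(source, target, region_by_node):
    """DOT edge line tagged with the region of its source node."""
    region = region_by_node[source]
    return f'{source} -> {target} [region = "{region}"]'


# Hypothetical lookup for a few nodes; extend it to all twenty countries.
region_by_node = {"MX": "Americas", "US": "Americas", "NL": "Europe"}

print(edge_statement("MX", "CA", region_by_node))
print(node_statement("US", "USA", 2.185, "Americas", 60))
```

Graphviz ignores attributes it does not recognize when drawing, so lines like these change only the exported data, not the picture.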

This is what my final graph looks like (I added node coloring to make the view more fascinating). Click on the image to open it in the playground. If you want to see how this graph would look with a different edge type, just change the splines indicator.

View the code on GitHub: https://github.com/lomska/Tableau-Preps/blob/main/Graphviz%20Tutorial/DirectedCurvedGraph.txt
Click on the image to open the graph in the playground.

Extracting graph data in JSON format

Once you are satisfied with the appearance of the graph, copy its code and transfer it to Graphviz Online. This tool can also be used to build a graph, but its best feature is the ability to convert the graph from DOT to JSON format. As in the previous playground, the code will be on the left, and the graph will appear on the right.

Select JSON from the format menu and copy (unfortunately, this has to be done manually) the code that has replaced the graph image.

Select JSON from the format menu and copy the code.

Copy the code to a text file. Before saving, check that the true / false values under the directed and strict keys near the top of the generated dictionary are enclosed in quotes, to avoid errors when reading the file later. Save the file on your disk as JSON.

Processing graph data

We have a JSON file with our graph data. All we need to do now is extract this data, knowing the corresponding keys, and compose a dataframe from it. Here I will tell you how I do this in Python, but you can use any convenient tool of your choice.

The packages we need are pandas, NumPy, and json, as well as the BPoly class from the SciPy library, to work with our curved edges.

import pandas as pd
import numpy as np
import json
from scipy.interpolate import BPoly

Let’s open the JSON file as a dictionary and examine its keys:

with open("AvocadoGraph.json") as f:
    json_data = json.load(f)

json_data.keys()

We need three keys: objects, edges, and bb. The objects section contains a list of node mini-dictionaries, one for each node, with all the corresponding attributes. The edges section contains a list of edge mini-dictionaries. And under the bb key you’ll find the graph’s bounding box coordinates (xmin, ymin, xmax, ymax) in a string format.

We will create a node table and a link table and then concatenate them into one. The bb parameter just needs to be remembered at this stage.

Processing node data

Here’s how I create the node data table: I create an empty list for each attribute of interest, iterate through the node mini-dictionaries, find the attributes, and add them to the corresponding lists. Then I combine the resulting lists into a dataframe. The values to retrieve are: the unique id _gvid assigned by Graphviz, the name we used in our Graphviz code (the country ISO code), the label (the country name), our additional attributes, region and size, under the corresponding keys, and the node coordinates pos in string format. The pos value is split into x and y. In the end, I add a type column with the value ‘node’:

ids = []
iso_codes = []
countries = []
sizes = []
regions = []
xs = []
ys = []

for node in json_data["objects"]:
    ids.append(node["_gvid"])
    iso_codes.append(node["name"])
    countries.append(node["label"])
    sizes.append(node["size"])
    regions.append(node["region"])
    xs.append(float(node["pos"].split(",")[0]))
    ys.append(float(node["pos"].split(",")[1]))

nodes = pd.DataFrame([ids, iso_codes, countries, sizes, regions, xs, ys]).T
nodes.columns = ["id", "iso_code", "country", "size", "un_region", "x", "y"]

nodes["type"] = "node"

nodes.head()

You can find the full JSON processing code on GitHub.

Here’s the table I got:

Processing edge data

From the edge mini-dictionary list we extract unique ids _gvid, source node ids tail, target node ids head, edge coordinates pos, and our custom attribute, region.

Graphviz outputs edge coordinates as a series of points separated by spaces in a single string variable. This string variable looks different depending on the graph type. Here is the pos variable of an undirected graph edge:

"424.98,715.49 444.36,758.21 451.23,773.5 461.39,796.43"

And here is the very same edge, but directed. We are given five points instead of four; the first one is prefixed with ‘e’; finally, the last and first points of the undirected curve have become the first and second, respectively:

"e,461.39,796.43 424.98,715.49 441.64,752.21 449.05,768.66 457.25,787.11"

The point here is that when drawing a directed graph, Graphviz first lists the ‘head’ point of the arrow (the one marked with ‘e’), then the ‘tail’ one, and then the points lying between them. If we pass the points to Tableau in this order, it will draw not a line but something like a loop. To prevent this, we need to move the first point to the end of the coordinate list when the graph is directed. Then the line path will go from the edge’s tail to its head.
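
The reordering can be sketched as a small helper that accepts both string variants shown above. The function name parse_pos is hypothetical. Handling of an ‘s’ (start point) prefix is an added assumption for edges that carry an arrow at the tail, and strings containing several splines separated by semicolons are not covered.

```python
def parse_pos(pos):
    """Turn a Graphviz edge `pos` string into points ordered tail -> head."""
    start, end, points = None, None, []
    for token in pos.split():
        parts = token.split(",")
        if parts[0] == "e":    # arrow end point at the head
            end = (float(parts[1]), float(parts[2]))
        elif parts[0] == "s":  # arrow start point at the tail (assumed case)
            start = (float(parts[1]), float(parts[2]))
        else:
            points.append((float(parts[0]), float(parts[1])))
    if start is not None:
        points.insert(0, start)
    if end is not None:
        points.append(end)
    return points


directed = "e,461.39,796.43 424.98,715.49 441.64,752.21 449.05,768.66 457.25,787.11"
undirected = "424.98,715.49 444.36,758.21 451.23,773.5 461.39,796.43"

print(parse_pos(directed)[0], parse_pos(directed)[-1])
# (424.98, 715.49) (461.39, 796.43)
print(len(parse_pos(undirected)))
# 4
```

With this helper, the same extraction loop could serve directed and undirected graphs without slicing off the first two characters by hand.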

Curved edges

For splines = curved, Graphviz doesn’t give us the points of the curve itself, but a set of control points for drawing the Bezier curve. What these points are is well shown in the examples here. That is, before loading into Tableau, we need to restore the curve using the control points given to us. In Python, the BPoly tool from the SciPy library does a great job of this.

You can see an example of use in the picture: on the left is the original Graphviz curve, and on the right is the curve reconstructed from control points using BPoly. As you can see, they are almost identical.

This is what my code for curved edge data extraction looks like:

ids = []
source_ids = []
target_ids = []
regions = []
xs = []
ys = []

for edge in json_data["edges"]:
    ids.append(edge["_gvid"])
    source_ids.append(edge["tail"])
    target_ids.append(edge["head"])
    regions.append(edge["region"])

    # PROCESSING THE EXTRACTED POS VARIABLE (1-6):
    # 1a - getting rid of the preceding 'e,' (for a DIRECTED graph)
    # 1b - creating a control point array from a string
    array = [
        tuple([float(coord.split(",")[0]),
               float(coord.split(",")[1])])
        for coord in edge["pos"][2:].split(" ")
    ]

    # 2 - moving the first array point to the end (for a DIRECTED graph)
    cp = np.array(array[1:] + array[:1])

    # 3 - defining the curve based on control points in the range [0, 1]
    curve = BPoly(cp[:, None, :], [0, 1])

    # 4 - generating 50 equidistant points in the range from 0 to 1
    x = np.linspace(0, 1, 50)

    # 5 - getting the list of curve points
    p = curve(x)

    # 6 - extracting point xs and ys
    xs.append(p.T[0])
    ys.append(p.T[1])

edges = pd.DataFrame([ids, source_ids, target_ids, regions, xs, ys]).T
edges.columns = ["id", "source_id", "target_id", "un_region", "x", "y"]
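
The BPoly call above treats all the points of an edge as the control points of a single Bézier curve. As a possible cross-check, the pure-Python sketch below samples the edge segment by segment instead. It assumes the layout described in the Graphviz documentation for pos: a start point followed by groups of three points, each group completing one cubic segment, with the ‘e’ point being the arrow tip rather than a control point. The function names are illustrative.

```python
def cubic_point(p0, p1, p2, p3, t):
    """Point at parameter t (0..1) on a cubic Bezier segment."""
    u = 1.0 - t
    x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
    return (x, y)


def sample_piecewise(control_points, per_segment=20, end_point=None):
    """Sample a chain of cubic segments given 3n + 1 control points.

    `end_point` is the optional arrow tip (the 'e' point), appended as a
    final straight step after the last control point.
    """
    if len(control_points) < 4 or (len(control_points) - 1) % 3 != 0:
        raise ValueError("expected 3n + 1 control points")
    samples = []
    for i in range(0, len(control_points) - 1, 3):
        p0, p1, p2, p3 = control_points[i:i + 4]
        for k in range(per_segment):
            samples.append(cubic_point(p0, p1, p2, p3, k / per_segment))
    samples.append(control_points[-1])
    if end_point is not None:
        samples.append(end_point)
    return samples


cp = [(424.98, 715.49), (441.64, 752.21), (449.05, 768.66), (457.25, 787.11)]
path = sample_piecewise(cp, per_segment=10, end_point=(461.39, 796.43))
print(len(path))  # 12
```

For short edges like this one the two approaches should give visually similar lines; the difference, if any, is more likely to show on long edges with several bends.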

Non-curved edges

Working with other types of edges is much simpler: we turn the string into a point array and extract the x and y coordinates from it. Several points are enough to draw straight or ortho edges. As for splines = true, the program will give us more points — as many as necessary to draw the edge with all its bends.

This is my code for other-than-curved edge data extraction:

ids = []
source_ids = []
target_ids = []
regions = []
xs = []
ys = []

for edge in json_data["edges"]:
    ids.append(edge["_gvid"])
    source_ids.append(edge["tail"])
    target_ids.append(edge["head"])
    regions.append(edge["region"])

    # PROCESSING THE EXTRACTED POS VARIABLE (1-3):
    # 1a - getting rid of the preceding 'e,' (for a DIRECTED graph)
    # 1b - creating a point array from a string
    array = [
        tuple([float(coord.split(",")[0]),
               float(coord.split(",")[1])])
        for coord in edge["pos"][2:].split(" ")
    ]

    # 2 - moving the first array point to the end (for a DIRECTED graph)
    points = np.array(array[1:] + array[:1])

    # 3 - extracting point xs and ys
    xs.append([p[0] for p in points])
    ys.append([p[1] for p in points])

edges = pd.DataFrame([ids, source_ids, target_ids, regions, xs, ys]).T
edges.columns = ["id", "source_id", "target_id", "un_region", "x", "y"]

When splines = true, we can apply both this code and the curve-processing one. In the second case, the edges going around the nodes will be smoother.

Finishing edge processing

This is what our edge table looks like now. Each row describes one edge; the x and y columns are lists.

We need to add one more list to each edge’s data, which will define the order (path) in which its points are connected. We also add each edge’s source and target countries. Our list-containing cells are then split into separate rows using the explode() method. In the end, we set the added type column to ‘edge’.

edges["country_source"] = edges["source_id"].map(
    nodes.set_index("id")["country"].to_dict())
edges["country_target"] = edges["target_id"].map(
    nodes.set_index("id")["country"].to_dict())

edges = edges.drop(["source_id", "target_id"], axis=1)

edges["path"] = [
np.arange(len(x)) for x in edges["x"]
]

edges = edges.explode(["x", "y", "path"])

edges["type"] = "edge"

edges.head()

Here’s our edge table:

Now concatenate the node and edge tables and save the resulting file:

data = pd.concat([nodes, edges], ignore_index=True) 
data.to_csv("network.csv")

Bounding box value

Our last parameter to extract is the string under the key bb, '0,0,545,916'. Since our edges connect not the centers but the boundaries of the nodes, we need their starting and ending points to fall exactly on those boundaries. To do this, our Tableau graph must have exactly the same height-to-width ratio as the original Graphviz graph. So save or note down this string’s value.
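
A short sketch of how the bb string can be turned into numbers and used to size the graph later on. The function names are illustrative; the only input is the bb value shown above.

```python
def parse_bb(bb):
    """Split a Graphviz bounding-box string into four floats."""
    xmin, ymin, xmax, ymax = (float(v) for v in bb.split(","))
    return xmin, ymin, xmax, ymax


def height_for_width(bb, width):
    """Height that keeps the Graphviz height-to-width ratio for a given width."""
    xmin, ymin, xmax, ymax = parse_bb(bb)
    ratio = (ymax - ymin) / (xmax - xmin)
    return round(width * ratio)


print(height_for_width("0,0,545,916", 350))  # 588
```

The ratio here is 916 / 545, roughly 1.68, which is the factor used for the dashboard dimensions in the Tableau part of this post.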

Same steps using Python extension

We can build our graph from scratch and prepare it for Tableau without leaving the Jupyter notebook, using PyGraphviz. The corresponding code can be found at the link.

I personally prefer playgrounds because you can see immediately what is happening with your graph. The Python extension, however, can be useful when we need a fairly strict width-to-height ratio, for example 1:1, to place the graph on a dashboard. The fdp layout allows many versions of one graph, and with each rerun, the graph is recomposed in a new way. Graphviz decides on its own how to place the graph; this cannot be predicted in advance. Using PyGraphviz, we can put this process in a loop: the tool will generate layouts until one of them ‘fits’ within the specified width-to-height ratio.
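
The retry idea can be sketched independently of the library. In the snippet below, run_layout is a hypothetical callable that would wrap whatever PyGraphviz call produces a fresh layout and returns its bb string; the tolerance and the try limit are assumptions as well. The actual PyGraphviz code is the one behind the link above.

```python
def find_layout(run_layout, target_ratio=1.0, tolerance=0.05, max_tries=100):
    """Call `run_layout` until the width-to-height ratio of its bb fits.

    Returns (attempt_number, bb) on success, or None if no layout fits
    within `max_tries` attempts.
    """
    for attempt in range(1, max_tries + 1):
        bb = run_layout()
        xmin, ymin, xmax, ymax = (float(v) for v in bb.split(","))
        ratio = (xmax - xmin) / (ymax - ymin)
        if abs(ratio - target_ratio) <= tolerance:
            return attempt, bb
    return None


# Stand-in for real layouts: three made-up bounding boxes in a row.
fake_boxes = iter(["0,0,545,916", "0,0,700,650", "0,0,800,790"])
print(find_layout(lambda: next(fake_boxes)))
# (3, '0,0,800,790')
```

In practice the callable would also need to keep the laid-out graph itself, so that the accepted version can be exported.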

Tableau: drawing the graph

There are two ways to draw a network graph in Tableau. If you want to create a simple graph consisting of two layers — nodes and edges — then creating a dual axis is enough to achieve your goal. However, if you want to make a multilayer graph, for example, by overlaying different mark types on top of each other, then you should use map layers.

I will be using map layers for this graph. That is, we will plot our nodes and edges layer by layer on a pseudo-map, which we will then ‘pull out’ from under the resulting ‘pie.’

To do this, we open our CSV file in Tableau Public, making sure (on the Data Source page) that the field separator is a comma. Now go to the sheet page.

We’ll represent our x and y as longitude and latitude, respectively. Let me remind you that the maximums of our axes are at 545 (x) and 916 (y). Such coordinates, of course, do not exist. To get around this problem, we’ll divide each x by 545 and each y by 916. Thus, all our X’s and Y’s will be in the range from 0 to 1 and can be placed on a map.
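
In this tutorial the division happens inside Tableau, but the same normalization could be done in Python before exporting the CSV. A minimal sketch of that alternative, with a hypothetical function name and the bb value from above as the default:

```python
def normalize(x, y, bb="0,0,545,916"):
    """Scale a point into the 0..1 range along both axes of the bounding box."""
    xmin, ymin, xmax, ymax = (float(v) for v in bb.split(","))
    return (x - xmin) / (xmax - xmin), (y - ymin) / (ymax - ymin)


print(normalize(545, 916))    # (1.0, 1.0)
print(normalize(272.5, 458))  # (0.5, 0.5)
```

Either way, the goal is the same: every coordinate ends up between 0 and 1, so it can be read as a valid latitude or longitude.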

Add the following calculated fields:

Now we’ll create map points from our normalized X’s and Y’s. We need to divide them into groups: a group of node points and a group of edge points:

Double-click on the created NODES field. We have just created a map; you can see the generated latitude and longitude in the rows and columns, respectively. On the Marks menu, change the graph type from Automatic to Shape (all nodes will shrink to one point), then place the id on the Detail card, the size on the Size card, and the un_region on Color. Increase the size of the nodes using the Size card’s slider.

Hide the ‘nulls’ from the lower right corner.

Set the colors. My palette for this graph is:

  • Africa: #a71627;
  • Americas: #7c6195;
  • Asia: #f4ba1a;
  • Europe: #099dd4.
Node layer (top)

Now, take the created EDGES field and drag it onto the graph. You will see a pop-up command: Add a Marks Layer. Place the EDGES field on this rectangle. Select the Line graph type from the Marks menu. Add the id to Detail and the path to Path (and transform it into Dimension). Id divides our points into groups (lines), and path indicates in what order to connect the points of each group.

To color the edges, add un_region to the Color card. To show direction, I increase the width of the edge from the tail to the head. You can do this easily by moving the path to the Size card as it goes sequentially from 0 to 49. After that, adjust the line width using the slider.

+ Edge layer

Surely you noticed that the edges’ ends do not match the node boundaries. To make them match, we need to adjust the x/y ratio. The map will get in the way here because it distorts the proportions, so remove it with the menu commands Map > Background Maps > None.

Hide all formatting lines by right-clicking on the graph > Format > Format Borders and Format > Format Lines. Then in both x- and y-axes, do: right-click > Edit Axis > Custom > Fixed Start = 0, Fixed End = 1. Then right-click and uncheck Show Header on both axes.

Start a new dashboard of the size you need. Transfer the graph to the dashboard, make it floating, hide the title, and remove all the legends. In the Layout section, set the graph width you need — for example, 350 — and calculate the height based on our height/width ratio from bb: 350*1.68=588. Set the graph height to 588.

On the graph page, increase the size of the nodes so that on the dashboard they match the edge ends.

When creating a graph with map layers, in order for the nodes to be above the edges, the NODES tab should be above the EDGES tab in the Marks menu. So drag the NODES tab to the top.

Current dashboard view of the graph

Grab the NODES field again and drag it onto the graph as a new layer. Set the graph type to circle, add the id field to Detail, and the size to Size. Our nodes are now covered with gray circles. Change their color to white. In the Marks, place the white NODES tab below the colored NODES tab but above the EDGES tab. See how this affected the graph on the dashboard: the ends of the edges that crossed the node boundaries are now hidden behind their white backgrounds.

+ Node layer (bottom)

Drag the NODES to the graph again. Set the graph type to Circle and add id and size fields. Now drag the un_region field to Color, and in the Color, drag the Opacity slider to 70%. In the Marks, drag this layer below the top NODES layer but above all the other layers.

Using the same logic, you can add labels to the nodes by selecting the Text graph type and adding the country field to the Text card that appears.

I’ll leave mine unlabeled. Here’s what I got:

+ Node layer (middle)

Other graph types to try

Let’s now look at the other graphs from my set. The principle of transferring them to Tableau is the same, so I’ll focus on their Graphviz construction.

Node Chains

The dot layout’s rankdir attribute allows us to place a set of nodes of different sizes on a straight line at equal distances from each other’s boundaries. For example, here, I arrange my nodes in alphabetical order from left to right (rankdir = LR) and right to left (rankdir = RL):

The nodes are located at equal distances from the boundaries, not from each other’s centers.

You can also arrange them vertically from top to bottom (rankdir = TB), or from bottom to top (rankdir = BT).

We can make any element invisible by setting its style attribute to invisible mode. I applied style = invis to my edge chain so that only nodes are visible. This attribute is also convenient for us to transfer to Tableau: when processing the graph data, we can simply filter out all mini-dictionaries containing invis under the style key.
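
A minimal sketch of that filtering step, applied to a list of mini-dictionaries like the ones found under the edges or objects keys. The sample records are made up for illustration.

```python
def visible_only(items):
    """Keep only the elements whose style does not contain 'invis'."""
    return [item for item in items if "invis" not in item.get("style", "")]


# Made-up sample in the shape of the JSON edge list.
sample_edges = [
    {"_gvid": 0, "tail": 0, "head": 1, "style": "invis"},
    {"_gvid": 1, "tail": 1, "head": 2},
]

print(visible_only(sample_edges))
# [{'_gvid': 1, 'tail': 1, 'head': 2}]
```

Using get with an empty default means elements without a style key are kept rather than raising a KeyError.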

Here’s what the code for this graph looks like:

View the code on GitHub: https://github.com/lomska/Tableau-Preps/blob/main/Graphviz%20Tutorial/NodeChain.txt
Click on the image to open the graph in the playground.

Flowcharts

A Flowchart (the original)

After saying all this, it would be strange not to try to create at least one flowchart. But first, let’s explore a few more important features of Graphviz:

  • Once we have assigned an attribute to a node, edge, or graph, any object of the corresponding type defined subsequently will inherit this attribute value. This holds until we set the attribute to a new value. If some elements of the given type were defined before any value assignment, they’ll have this attribute’s value as an empty string. To see how this works, examine the following code in the playground:
View the code on GitHub: https://github.com/lomska/Tableau-Preps/blob/main/Graphviz%20Tutorial/FlowchartPrep1.txt
Click on the image to open the graph in the playground.
  • We can specify a different value to a group of nodes (subgraph) while keeping the default value for the rest. See how this code differs from the previous one:
View the code on GitHub: https://github.com/lomska/Tableau-Preps/blob/main/Graphviz%20Tutorial/FlowchartPrep2.txt
Click on the image to open the graph in the playground.
  • The edges of a directed graph can be made undirected using the dir indicator. To do this, set dir = none. To make the edges directed again, set dir = forward:
View the code on GitHub: https://github.com/lomska/Tableau-Preps/blob/main/Graphviz%20Tutorial/FlowchartPrep3.txt
Click on the image to open the graph in the playground.

Now it should be easy for you to understand the logic behind the flowchart code that I provide below. I use dot layout and ortho splines. I then list all the nodes and all the edges, grouped by their attributes.

View the code on GitHub: https://github.com/lomska/Tableau-Preps/blob/main/Graphviz%20Tutorial/Flowchart.txt
Click on the image to open the graph in the playground.

There are also some Tableau features you need to know to create a flowchart:

  • Transferring the arrows

It is easy to transfer edges with arrows to Tableau only if these arrows point strictly vertically or horizontally. In this case, you can use standard triangular shapes whose vertices point in the corresponding directions. The most convenient way to achieve this is with the ortho setting, as it creates strictly vertical or horizontal lines. For example, in my flowchart, I was able to simply use a downward arrow shape throughout my Tableau graph.

  • Transferring the different-sized boxes

When preparing a flowchart for Tableau, don’t forget about how the software renders custom shapes. To be displayed adequately, they must be the same length and width. Therefore, if you make flowchart boxes of different widths, lengths and shapes, you will have to overlay the created shapes on transparent backgrounds of the same size before uploading them to Tableau.

  • Extracting Graphviz shapes to use in Tableau

You can extract and format the shapes of the created flowchart using the Inkscape package; for processing in this package, the graph must be downloaded in SVG format.

Matrices

We can also create a network graph in the form of a matrix, on which we can draw some arbitrary path. When creating it, you will need to take into account one more feature of the program.

Matrices + Arbitrary Paths

Let’s look at how Graphviz draws multiple connections between nodes in the dot layout. Draw a simple graph with one red edge:

View the code on GitHub: https://github.com/lomska/Tableau-Preps/blob/main/Graphviz%20Tutorial/MatrixPrep1.txt
Click on the image to open the graph in the playground.

A single edge line is straight. Now let’s add the second edge, again from a to b, this time green:

View the code on GitHub: https://github.com/lomska/Tableau-Preps/blob/main/Graphviz%20Tutorial/MatrixPrep2.txt
Click on the image to open the graph in the playground.

See? Both edges have the same weight, so the program will make two slight arcs out of them. Now let’s add one more edge, blue:

View the code on GitHub: https://github.com/lomska/Tableau-Preps/blob/main/Graphviz%20Tutorial/MatrixPrep3.txt
Click on the image to open the graph in the playground.

We see that the second edge has shifted to the center, and the third has formed an arc opposite to the arc of the first edge. Let’s remember this order.

Now let’s create a simple 4x4 matrix. To do this, we first list all its vertical edge chains. We then list all the horizontal edge chains and pin each one to a single y coordinate by enclosing it in curly braces and assigning rank = same. We can then adjust the width and height of the matrix using the minlen attribute.

This is the code I got:

View the code on GitHub: https://github.com/lomska/Tableau-Preps/blob/main/Graphviz%20Tutorial/MatrixPrep4.txt
Click on the image to open the graph in the playground.
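Typing the chains by hand gets tedious for larger grids, so here is a minimal Python sketch that generates the same kind of DOT source for a matrix of any size. The node names (r0c0, r0c1, ...), the undirected graph type and the default minlen value are my assumptions for illustration, not the settings of the linked file.

```python
def matrix_dot(size=4, minlen=2):
    """Return DOT source for a size x size grid of nodes named r{row}c{col}."""
    def name(row, col):
        return f"r{row}c{col}"

    lines = ["graph matrix {", f"    edge [minlen = {minlen}]"]
    # Vertical chains: one chain per column, running top to bottom.
    for col in range(size):
        lines.append("    " + " -- ".join(name(row, col) for row in range(size)))
    # Horizontal chains: rank = same keeps each row on one y coordinate.
    for row in range(size):
        chain = " -- ".join(name(row, col) for col in range(size))
        lines.append(f"    {{rank = same; {chain}}}")
    lines.append("}")
    return "\n".join(lines)


print(matrix_dot(4))
```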

Now let’s impose an arbitrary path on the matrix. For the path, increase the thickness of the lines using the penwidth attribute, and set the constraint attribute to false. By default, constraint is true, which means that the edge affects the rank of the nodes it connects. Since we want to preserve the ranks already assigned in the matrix, the path edges must not take part in ranking.

Before listing the edges, assign colors to the group of nodes through which our path will pass.

View the code on GitHub: https://github.com/lomska/Tableau-Preps/blob/main/Graphviz%20Tutorial/MatrixPrep5.txt
Click on the image to open the graph in the playground.
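The path statements follow a fixed pattern, so they can be sketched as a short Python helper. The function name, the node names and the default color and thickness are hypothetical. I color each node with its own statement, which is one of several ways to do it, and I assume an undirected graph (`--`).

```python
def path_dot(path, color="red", penwidth=4):
    """Return DOT statements that color the path's nodes and draw its edges.

    The statements are meant to be placed inside the graph body,
    after the matrix edges.
    """
    # Color the nodes first, before any path edge is listed.
    lines = [f"{node} [color = {color}]" for node in path]
    # constraint = false keeps the path from changing the ranks set by the matrix.
    lines.append(f"edge [color = {color}, penwidth = {penwidth}, constraint = false]")
    lines.append(" -- ".join(path))
    return lines


for statement in path_dot(["r0c0", "r1c0", "r1c1", "r2c1"]):
    print(statement)
```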

You can see that we now have some double edges, and in accordance with the principle that I explained above, such edges shifted and turned into arcs.

But we also remember that if we add a third edge, the second one will straighten out! Copy the matrix code and add it after the arbitrary path code. Reset the previously set line thickness by adding penwidth = "" for the matrix’s edges. All of our colored edges should now straighten out.

View the code on GitHub: https://github.com/lomska/Tableau-Preps/blob/main/Graphviz%20Tutorial/MatrixPrep6.txt
Click on the image to open the graph in the playground.

We don’t need any of the matrix edges to be shown, so we hide them by setting their style attribute to invis. As you may remember, attributes declared for a class persist until we reassign them. So set style = invis for the first matrix, reset it for the arbitrary path by setting style = "", and then, when declaring the edges of the second matrix, set style = invis again.

View the code on GitHub: https://github.com/lomska/Tableau-Preps/blob/main/Graphviz%20Tutorial/MatrixPrep7.txt
Click on the image to open the graph in the playground.
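The order of the three blocks is the whole trick, so it may help to see it assembled programmatically. Below is a minimal Python sketch that puts a hidden matrix, a visible path and a second hidden matrix into one graph. The tiny 2x2 matrix, the node names and the function name are hypothetical, and the undirected graph type is my assumption.

```python
# A hypothetical 2x2 matrix: two vertical edges, then two rows fixed with rank = same.
MATRIX_EDGES = [
    "a1 -- a2",
    "b1 -- b2",
    "{rank = same; a1 -- b1}",
    "{rank = same; a2 -- b2}",
]
PATH_EDGES = ["a1 -- b1 -- b2"]


def layered_dot(matrix_edges, path_edges, color="red", penwidth=4):
    """Return DOT source: hidden matrix, visible path, hidden matrix again."""
    body = ["edge [style = invis]"]  # first matrix copy, hidden
    body += matrix_edges
    # Reset the style for the path, then thicken and color it.
    body.append(
        f'edge [style = "", color = {color}, penwidth = {penwidth}, constraint = false]'
    )
    body += path_edges
    # Second matrix copy: hidden again, with the line thickness reset.
    body.append('edge [style = invis, penwidth = ""]')
    body += matrix_edges
    return "graph G {\n" + "\n".join("    " + s for s in body) + "\n}"


print(layered_dot(MATRIX_EDGES, PATH_EDGES))
```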

Now you can easily build a graph like this one, simulating the new NYT game Strands.

View the code on GitHub: https://github.com/lomska/Tableau-Preps/blob/main/Graphviz%20Tutorial/NYTStrands.txt
Click the image to open the graph in the playground.

If you want a bump chart, you can handle the duplicate edges problem more easily using the strict keyword. It simply prohibits duplicate edges and is placed before the graph declaration. To see how it works, add strict before the declaration of our practice graph and see what changes:

View the code on GitHub: https://github.com/lomska/Tableau-Preps/blob/main/Graphviz%20Tutorial/MatrixPrep8.txt
Click the image to open the graph in the playground.

As you can see, a horizontal edge in our path has disappeared. This happens because in a strict graph, if an edge is already drawn (an edge of our first matrix), we cannot put a new one on top of it.
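Before switching a graph to strict, you may want to know which path edges will vanish this way. The Python sketch below lists the path edges that duplicate an already declared matrix edge. It relies on the behavior described above and assumes an undirected graph, where a -- b and b -- a count as the same edge; the function and node names are hypothetical, and duplicates within the path itself are not checked.

```python
def edges_hidden_by_strict(matrix_edges, path_edges):
    """Return the path edges that repeat an earlier matrix edge.

    Edges are (node, node) tuples; direction is ignored (undirected assumption).
    """
    declared = {frozenset(edge) for edge in matrix_edges}
    return [edge for edge in path_edges if frozenset(edge) in declared]


# Hypothetical 2x2 matrix and a path with one horizontal and one diagonal step.
matrix = [("a1", "a2"), ("b1", "b2"), ("a1", "b1"), ("a2", "b2")]
path = [("a1", "b1"), ("b1", "a2")]

print(edges_hidden_by_strict(matrix, path))
```

Only the horizontal step a1 -- b1 coincides with a matrix edge here; the diagonal step has no counterpart in the matrix, so it is not reported.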

I used this property when drawing the graph of university rankings below. A university’s path is visible only when its rank changes compared to the previous year; when the rank stays the same, the university is shown only as a colored node. This focuses the viewer’s attention on ranking changes.

View the code on GitHub: https://github.com/lomska/Tableau-Preps/blob/main/Graphviz%20Tutorial/BumpChart.txt
Click the image to open the graph in the playground.

This concludes my Graphviz test drive for Tableau. I couldn’t discuss all the possibilities I discovered in one post. And since I only recently became acquainted with the tool, there are probably many that I have not yet discovered.

I hope I can inspire you to experiment further. I just have to warn you that once you understand the basic principles of how the program works, these experiments on playgrounds begin to get a bit addictive.

Graphviz + Tableau graphs

Additional reading and video

In the Graphviz Gallery, you can select any graph, click ‘Edit in Playground’ and study the code. The coolest thing is that you can change something right in the code and immediately see how it affects the graph.

This DOT Language manual touches on some important points that I didn’t cover.

Drawing Graphs With Dot by Emden R. Gansner, Eleftherios Koutsofios and Stephen North offers a very useful and in-depth look at DOT. The authors pay special attention to flowchart creation.

In this Tableau Map Layers Tutorial, Tableau Visionary Luke Stanke takes an in-depth look at map layers and how they can be used to create layered charts.

And here’s a great collection of examples of using map layers in Tableau from Tableau Public Ambassador Jennifer Dawes.

Happy vizzing!
