A Game of Data Visualizations: Making Alluvial Diagrams Without Code
Like most of you, I am very excited for the upcoming final season of Game of Thrones. In addition to the fantasy elements, my belief is that the show is so popular because it’s basically Braveheart multiplied by The Godfather. The latter framework has especially been on my mind recently while catching up on old episodes. Where Season 1 opened with an uneasy peace, Game of Thrones’ best dynamics have unfolded via intrigue and conflict between the show’s powerful, dynastic families, as well as numerous military, religious, and other political groups. It’s been great to watch.
Simultaneously, I’ve lately done a ton of data and process visualization for my work and other projects. One of my favorite formats has been the Alluvial diagram, commonly used to depict network changes. Accordingly, I thought this would be an excellent opportunity to visualize GoT’s tremendously complex network of (shifting) character affiliations, while also showcasing two of my favorite visualization and diagramming tools: RAWGraphs and Lucidchart.
I’ve written several other how-to articles before, mostly involving use of the R programming language. Those all had quite a bit of custom code, requiring detailed step-by-step walkthroughs. Here though, both RAWGraphs and Lucidchart are extremely easy to use, with their own sets of tutorials that would be pointless to re-create (I’ll link to them below). Thus for this topic, I’ll instead try to give more of a sense of just the high-level process, as well as some pointers on my favorite features of each tool.
Before all of that however, here is the end-result infographic [WARNING, SPOILERS below]:
Here is a direct link to the underlying data. For those of you who’re just into this for GoT catch-up purposes, gods’ speed in your catch-up efforts! I did my best to categorize each character’s arc as accurately as possible. So if you’re a super-fan with a more-correct understanding of the show’s progression, please by all means take the csv and correct/use for your own purposes!
For those interested in the how-to aspects of the tools involved, a bit more context before getting into the specifics.
Intended Audience and Other Approaches
As implied, RAWGraphs and Lucidchart are extremely easy to use. The intended audience for the below is non-coders with spreadsheet/flat-file based reporting and presentation responsibilities. For the code-minded, there are numerous packages, libraries, etc., that can produce alluvial diagrams. For my fellow R aficionados, I’d recommend the ggalluvial package. It’s built specifically within the principles of the tidyverse, and as its familiar two-character prefix implies, it works seamlessly with ggplot2. Additionally, code-based GoT data visualization projects are a tried-and-true genre. One of my favorites is 32 Game of Thrones Data Visualizations, by Jeffrey Lancaster. Definitely check it out if you’re interested in pursuing something along those lines.
Back to the non-coders though, I think you’ll find RAWGraphs extremely useful, given its positioning as “the missing link between spreadsheets and data visualization.” As long as you can structure data properly in a spreadsheet, you will surprise both yourselves and your audiences with the kind of visualizations you’ll soon be able to belt out. Serving as a nice complement to RAWGraphs, Lucidchart is great for process depiction and other diagramming needs. Essentially a web-based, slightly-more-open version of Microsoft Visio, it’s become a key part of my personal production process. As mentioned, both have their own great set of tutorials to get you started with your initial work:
Now let’s take a look at the specifics of the diagram that I made in RAWGraphs, as well as the summary text and other presentation aesthetics I used to round out the finished-product infographic in Lucidchart.
Sourcing and Structuring the Data
Alluvial diagrams are named so because the flow bands resemble the naturally occurring alluvial fans that form from sediment deposits left by rivers and streams. They are similar to Sankey diagrams, which depict weighted system transfers and flows. The most famous example of a Sankey diagram is the Minard Map (which actually pre-dates the Sankey name and namesake by a few decades), depicting troop counts and movements during Napoleon’s disastrous Russian Campaign of 1812:
Like Sankey diagrams, Alluvial diagrams consist of flows (i.e. the alluvium) between segmented blocks of nodes. That block segmentation can be determined by almost any form of organization: spatial, date/time, process step, milestone, etc. The flows of an Alluvial diagram show how node composition changes from one block to the next. Everything I wrote in the preceding few sentences will seem like complete and utter gibberish if this is your first exposure to the subject, so let’s just look at an extremely simplified example.
Assume a table where the first column consists of nondescript things. Then assume that, in some unnamed/undocumented/purpose-less process, those things switch between different nodes at different steps (blocks) in the process:
Thing#,Step 1,Step 2,Step 3,Step 4
1,Node 1,Node 1,Node 2,Node 4
2,Node 1,Node 3,Node 2,Node 4
3,Node 3,Node 3,Node 3,Node 3
4,Node 2,Node 3,Node 2,Node 3
5,Node 3,Node 2,Node 2,Node 3
6,Node 4,Node 3,Node 2,Node 3
7,Node 4,Node 3,Node 2,Node 2
8,Node 4,Node 4,Node 3,Node 1
9,Node 3,Node 1,Node 2,Node 3
10,Node 3,Node 1,Node 2,Node 4
After copying/pasting the above into RAWGraphs’ input box, selecting “Alluvial Diagram”, and then finally dragging/dropping each step into place, you should arrive at the following:
The 4 vertical (disjointed) black lines are the blocks, and all of the nodes are labeled. The auto-colored alluvial streams connect each node, shifting in thickness to reflect node weight changes between blocks. Pretty awesome… albeit generic.
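For the code-curious, here is a minimal Python sketch of the bookkeeping behind a chart like this, using the ten-row example table from above. It is not how RAWGraphs works internally (I haven't looked); the names ROWS, block_sizes, and flows are my own for illustration. It just counts how many things sit in each node at each step, and how many move between each pair of nodes from one step to the next.

```python
from collections import Counter

# The example table, one tuple per thing: (Step 1, Step 2, Step 3, Step 4).
ROWS = [
    ("Node 1", "Node 1", "Node 2", "Node 4"),
    ("Node 1", "Node 3", "Node 2", "Node 4"),
    ("Node 3", "Node 3", "Node 3", "Node 3"),
    ("Node 2", "Node 3", "Node 2", "Node 3"),
    ("Node 3", "Node 2", "Node 2", "Node 3"),
    ("Node 4", "Node 3", "Node 2", "Node 3"),
    ("Node 4", "Node 3", "Node 2", "Node 2"),
    ("Node 4", "Node 4", "Node 3", "Node 1"),
    ("Node 3", "Node 1", "Node 2", "Node 3"),
    ("Node 3", "Node 1", "Node 2", "Node 4"),
]


def block_sizes(rows):
    """For each step (block), count how many things sit in each node."""
    n_steps = len(rows[0])
    return [Counter(row[i] for row in rows) for i in range(n_steps)]


def flows(rows):
    """For each pair of consecutive steps, count things per (from, to) node pair."""
    n_steps = len(rows[0])
    return [
        Counter((row[i], row[i + 1]) for row in rows)
        for i in range(n_steps - 1)
    ]


sizes = block_sizes(ROWS)
links = flows(ROWS)

# Node heights within the first block, then band widths between Step 1 and Step 2.
print(sizes[0])
print(links[0])
```

Each Counter in sizes corresponds to the node heights within one block, and each Counter in links corresponds to the band widths between two neighboring blocks.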
Enter the much more interesting and tangible material that is Game of Thrones. Recalling the section header, we sort of had to cover the Structuring of the data first, but are now on to the Sourcing. Nothing fancy here. I simply went to GoT’s IMDb cast list, click-dragged the 740+ rows from Peter Dinklage on down to Noah Syndergaard (ha), and then copy/pasted it all into an Excel sheet (use Paste Special → Unicode unless you want to crash Excel with headshots). I also had to manually add rows for the dragons & direwolves, given their CGI-driven lack of IMDb credits. It took a bit of cleanup from there, which you can see here under the “Col_split” tab in one of my working files.
Once I had the names and episode counts parsed out, I added new columns for my blocks: “Origin”, “Starting Affiliation”, and then several more for “Affiliation at End of” each season. I used the term affiliation here because I thought allegiance was too strong a term for many of the best characters. Case-in-point:
Within each block, I assigned the node-values based on whatever leader, house, or other faction/status that I thought the character was primarily affiliated with at the end of each season. This included the status of being Deceased, an unsurprisingly common “affiliation.”
Consistent with the la famiglia take on the show’s popularity, I used the “Origin” values to point back to the Major Houses that most of the characters came from. Across all blocks, I used generic “Other” categories (split by the show’s two fictional continents) to represent affiliation with smaller, more niche groups, or for those whose status and whereabouts are straight-up unknown. Again, I assigned all of these based on my own viewing of the show, so I’m sure there are plenty of mis-interpretations and outright errors. Super-fans: if you find any that are particularly egregious, please don’t kill me (or worse).
Making the Alluvial diagram
Ultimately, I came up with a “clean” and complete table of some 480+ rows. I stopped short of categorizing all the characters because it became a bit time-consuming, and otherwise I didn’t see much relevancy in the affiliations of “Brothel Child #2” or “King’s Landing Rioter” #1 through #3 (real characters, look ’em up). Once ready, I simply copy/pasted the table into RAWGraphs’ input box and then went on to quickly configure the chart:
After adjusting all the width and height values, I made sure that the Sort by parameter was set to Size. As you can see above, I mapped the “Episodes” column to Size because I thought that each character’s individual alluvial band should be proportional to his or her importance to the show. Also, the tool allows you to manually select colors for each alluvial band segment, but I left them alone because I think the auto-selections look pretty good.
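To make the effect of that Size choice concrete, here is a small Python sketch of episode-weighted bands. Each band's width is the sum of episode counts rather than a simple head count. The sample rows, character names, house names, and episode numbers are all hypothetical placeholders and not values from my actual table, and weighted_flows is just an illustrative helper, not anything RAWGraphs exposes.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; only the column names mirror the real table.
SAMPLE = """Name,Episodes,Origin,Starting Affiliation
Character A,60,House X,House X
Character B,10,House X,Night's Watch
Character C,5,House Y,Night's Watch
"""


def weighted_flows(rows, source_col, target_col, size_col):
    """Sum the size column for every (source node, target node) pair."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row[source_col], row[target_col])] += int(row[size_col])
    return dict(totals)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
band_widths = weighted_flows(rows, "Origin", "Starting Affiliation", "Episodes")

# Widest bands first, loosely mirroring a sort by size.
for (source, target), width in sorted(band_widths.items(), key=lambda kv: -kv[1]):
    print(f"{source} -> {target}: {width}")
```

With a plain head count, Character A and Character B would get equally wide bands. Weighted by episodes, the long-running character dominates, which is the effect I was after.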
While configuring all of this, I would occasionally notice errors that had to be fixed back in the original table. A bit annoyingly, I found that the chart, Steps, and Size selections reset after each new copy/paste. However, any inconvenience from that is more than made up for by how ridiculously easy RAWGraphs is to use in the first place.
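Since every fix means another copy/paste and re-configuration, it can pay to catch the obvious table errors before pasting. Below is one possible pre-flight check in Python, assuming the table has been exported as rows of column-name/value pairs. It flags blank affiliation cells, plus node labels that differ only by capitalization or stray spaces (which would otherwise show up as separate nodes). The check_table helper and the sample rows are hypothetical, for illustration only.

```python
from collections import defaultdict


def check_table(rows, step_cols):
    """Return (blank cells, inconsistent spellings) for the given block columns."""
    blanks = []
    spellings = defaultdict(set)
    # Spreadsheet row 1 is the header, so data starts at row 2.
    for line_no, row in enumerate(rows, start=2):
        for col in step_cols:
            value = row.get(col) or ""
            if not value.strip():
                blanks.append((line_no, col))
            else:
                # Group labels that match once case and padding are ignored.
                spellings[value.strip().casefold()].add(value)
    variants = {key: sorted(raw) for key, raw in spellings.items() if len(raw) > 1}
    return blanks, variants


# Hypothetical rows standing in for the real table.
sample_rows = [
    {"Name": "Character A", "Origin": "House X", "Starting Affiliation": "Deceased"},
    {"Name": "Character B", "Origin": "House X", "Starting Affiliation": ""},
    {"Name": "Character C", "Origin": "House Y", "Starting Affiliation": "deceased "},
]

blanks, variants = check_table(sample_rows, ["Origin", "Starting Affiliation"])
print("Blank cells:", blanks)
print("Inconsistent labels:", variants)
```

Here the check reports the empty cell on spreadsheet row 3 and the two competing spellings of the Deceased label.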
Finishing the Infographic in Lucidchart
The .png output from RAWGraphs is easy to load into Lucidchart by either manual selection or drag-and-drop, and even retains its transparency setting (“Links Opacity” attribute). Once the diagram was in, it was super easy to build, shape, and arrange all of the other infographic elements around it. Within the Page Settings sidebar, you’re able to quickly fine-tune the grid, guide, and snap settings, cutting down on the extra clicks you’d otherwise spend on “Align to”:
Lines and arrows are generally easy to create and manipulate as well. Hovering your mouse around the edge of any given figure reveals red dots: clicking and pulling one of these will result in an arrow, or whatever your default line is (to set the default line, just adjust the line settings when no shapes are selected). For curved lines, the point-direction handles illuminate when you click the line, allowing you to immediately change the curve shape without having to right-click or use any other additional keystroke to initiate edits.
Another nice feature is Lucidchart’s Upload Font capability. After finding a free approximation of GoT’s ultra-cool title font, it took just a few clicks to bring it in through the Font Manager:
While taking the above screenshot, I just realized that the author had even mapped the actual logo to “SHIFT + 3” (the one with the T-stroke covering the whole of “Thrones”). Mine in the title was just spelled out. Oh well, too late to change now.
Lastly, I’ve come to really appreciate Lucidchart’s “Download as” options:
Though not relevant here, I routinely use the “PNG with transparent background” option. Additionally, if you need back-compatibility or the option to send your works-in-progress via email, “Download As → Visio (VDX)” is very handy. After someone’s Visio edits, .vdx files can be easily re-uploaded with the “Import → Import Visio” option on the main “Documents” page.
All-in-all, I had a lot of fun making this, and really enjoyed following the shifting flows of Game of Thrones character affiliations in the output diagram. As a fan of the show, some of them were really interesting to spot and follow along with [SPOILERS again, but if you’ve made it this far, come on…]:
- The degree to which each Major House comprised/contributed to the Night’s Watch from “Origin” to “Starting Affiliation”
- Jon Snow’s and (per my interpretation) The Mountain’s movements to-and-from the Deceased nodes.
- One of the dragons going over to the Night King.
- The Deceased category itself growing laughably dominant by the most recent season.
And there are probably many more that I’ve failed to note here. Regardless, I hope you enjoy the diagram as much as I did, and hope you find these RAWGraphs and Lucidchart tips useful as well. Both tools provide a ton of fun and easy-to-use charting capabilities that should be enormously useful in your day-to-day. Thank you so much for reading. I wish you good fortune in the diagramming efforts to come.