Who is the most important member of the Targaryen family?

Huda Nassar
7 min read · Oct 10, 2022

Short answer: well… it depends. Slightly longer answer: keep reading :).

⚠️ this blogpost contains some spoilers ⚠️

If you are one of the ~44 million people who have watched Game of Thrones, or if you’ve recently started watching House of the Dragon, then you already know about the Targaryens. Well, chances are, even if you haven’t seen these shows, you probably already know about this family anyway 🙃.

So… without too much suspense, I’ll jump straight to the point. In this blogpost I want to answer one key question: “who is the most important Targaryen?” Maybe the next time someone asks you who your favorite Targaryen is, you can point them to this blogpost and back up your answer.

I will use some common graph algorithm strategies (graph centrality specifically) to investigate this. Along the way, you can also learn a fair amount about graph centrality.

Enjoy!

The Targaryen family tree as a graph.

House Targaryen is known for many things, and one of them is that marriage is allowed between siblings, first cousins, and uncles/aunts with nephews/nieces. So the family tree is not really a tree; think of it as a graph.

I used the figure from here to build my own Targaryen graph.

House Targaryen (source)

Here’s how I built the graph:

I simplified things and only created the relations where two people are married or when one person is the parent of another person. The data is organized in two files and looks like this:

From the edges file, we know that “node 1” and “node 2” are married, and “node 1” is a parent of “node 5”. From the characters file, we know that “node 1” is King Aegon I Targaryen. The full data is available here.

The data is missing one relationship that is important to record: the sibling relationship. It is easy to deduce in any programming language, and especially easy in RelationalAI’s declarative language Rel. The Rel code to obtain the sibling relationship is pretty straightforward; you only need to write the logic:

def T("sibling", x, y) = exists(
    p1, p2 : // there exist parents p1 and p2 such that
    T("married", p1, p2) and // p1 and p2 are married, and
    T("child", p1, x) and T("child", p2, x) and // both are parents of x, and
    T("child", p1, y) and T("child", p2, y) and // both are parents of y, and
    x != y // a person can't be their own sibling.
)
// Note: T is the relation name; it holds the set of all tuples
// of the form (relation_type, person1, person2).
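For readers who don't use Rel, the same rule can be sketched in plain Python. This is only an illustration of the logic above, not the implementation used in this post: the toy tuples and the helper name `derive_siblings` are made up, and, as in the Rel rule, a sibling pair is only derived when both married parents are listed for both children.

```python
from itertools import permutations

# Hypothetical toy tuples in the (relation_type, person1, person2) shape
# described above; the ids are invented for illustration.
T = {
    ("married", 1, 2),
    ("child", 1, 5), ("child", 2, 5),
    ("child", 1, 6), ("child", 2, 6),
    ("child", 1, 7),  # only one listed parent, so no sibling pair is derived
}

def derive_siblings(T):
    """Return ("sibling", x, y) tuples for children who share two married parents."""
    married = {(a, b) for (rel, a, b) in T if rel == "married"}
    children = {}
    for rel, parent, child in T:
        if rel == "child":
            children.setdefault(parent, set()).add(child)
    siblings = set()
    for p1, p2 in married:
        # Children that both spouses have in common.
        shared = children.get(p1, set()) & children.get(p2, set())
        # permutations skips x == y and emits both (x, y) and (y, x).
        for x, y in permutations(shared, 2):
            siblings.add(("sibling", x, y))
    return siblings

siblings = derive_siblings(T)
print(sorted(siblings))
```

Both directions of each pair are emitted, which mirrors the symmetric way the Rel rule is written.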

We’re ready to define the graph:

There are multiple ways you can build this graph (for example: you can create a labeled graph or a hypergraph), but for now, we will keep things simple and define our nodes and edges as follows.

  • A node in this graph is a member of the Targaryen family from here.
  • An edge between two nodes in this graph means that the two nodes are part of the same immediate family, meaning they are either siblings, partners, or one is the parent of the other.

This formulation is, again, easy to express in Rel:

def G = person1, person2 : T(_, person1, person2)
// The `_` means that this value could be anything
// in our case it was either "married" or "child" or "sibling".
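A rough Python equivalent of this definition is sketched below. It assumes the edges are meant to be undirected (two people are either in the same immediate family or not), so each pair is stored in both directions; the toy tuples and the name `build_graph` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy tuples of the form (relation_type, person1, person2).
T = [
    ("married", 1, 2),
    ("child", 1, 5),
    ("child", 2, 5),
    ("sibling", 5, 6),
    ("sibling", 6, 5),
]

def build_graph(T):
    """Return an adjacency map {node: set of neighbors}, ignoring the relation type."""
    adj = defaultdict(set)
    for _rel, a, b in T:
        adj[a].add(b)
        adj[b].add(a)  # assumption: treat every relation as an undirected edge
    return dict(adj)

G = build_graph(T)
print(G)
```

Using sets means a pair that appears under more than one relation type (or in both directions) still counts as a single edge.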

Applying centrality algorithms.

In very basic terms, the goal of centrality algorithms is to measure a node’s importance in a graph. If you use Twitter for example, and you follow some people and some people follow you, then you are a node in the Twitter graph and every follow represents an edge. I don’t have access to the full Twitter data, but I am willing to bet that people like Elon Musk, Barack Obama, Taylor Swift, and Oprah have some of the higher centrality scores (in multiple centrality measures) in the network.

There are many different kinds of centrality, and I am hoping to dissect each of them in future blogposts (sneak peek for the PageRank one, it’s coming up next). For now, I will give a very brief explanation of the five centrality measures we will use, and then, finally, answer the question I posed in the very beginning.

A brief summary of the five types of centrality algorithms we use.
  • Degree Centrality: this is one of the most basic centrality measures. A node’s importance is tied directly to how many connections it has (known as the node’s degree). For example, on the Twitter network, recent numbers show that Barack Obama is the most followed person on Twitter, and thus his degree centrality measure is the highest. (There is a subtlety here: because the Twitter graph is directed, this is really the in-degree centrality. We will get to this in more detail when I discuss this type of centrality in the future; for now, the goal is to understand it intuitively.)
  • Betweenness Centrality: this centrality measures how often a node appears on the shortest paths between pairs of other nodes in the graph. For example, if we build a graph of train stations in the US, we will likely find that stations in the middle of the country have a higher betweenness centrality.
  • Eigenvector Centrality: an easy way to understand this type of centrality is to think of it as: “a node is as important as its neighboring nodes”. If you look at the equation in the above figure, you’ll notice that the eigenvector centrality of a node is a summation of the eigenvector centrality of all its neighbors (multiplied by the same scalar).
  • Katz Centrality: when you hear about eigenvector centrality, you often also hear about Katz centrality, because the two are very similar mathematically. Here is what Katz centrality means: a node’s importance is determined by its 1-hop neighbors’ importance, its 2-hop neighbors’ importance, and so on. The contribution of k-hop nodes is weighted according to how close they are to the node of interest. For example, nodes 10 hops away from the node of interest still contribute to its Katz centrality, but not as much as nodes 1 hop away.
  • PageRank: since the PageRank centrality blogpost is coming up next, and since it is a little tricky to explain this type of centrality without diving into some mathematical formulations, I will postpone the full details for the next blogpost on PageRank and stick to the small definition provided in the figure above.
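To make some of these definitions more concrete, here is a minimal Python sketch of four of the measures on a tiny made-up graph. It is not the Rel implementation used in this post. The parameter values (`alpha`, `beta`, `damping`, and the iteration count) are assumptions chosen for illustration, and the PageRank shown is the common damped random-walk formulation, which may differ in detail from the variant used for the results. Betweenness is left out to keep the sketch short.

```python
# Hypothetical toy graph: a triangle (0, 1, 2) with a tail (2 - 3).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}

def degree_centrality(adj):
    """Importance = number of neighbors."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def eigenvector_centrality(adj, iters=200):
    """Power iteration: each score is the (rescaled) sum of its neighbors' scores."""
    x = {v: 1.0 for v in adj}
    for _ in range(iters):
        new = {v: sum(x[u] for u in adj[v]) for v in adj}
        norm = sum(val * val for val in new.values()) ** 0.5
        x = {v: val / norm for v, val in new.items()}
    return x

def katz_centrality(adj, alpha=0.1, beta=1.0, iters=200):
    """Neighbors' scores are damped by alpha, so far-away nodes contribute less.

    alpha must be smaller than 1 / (largest adjacency eigenvalue) to converge.
    """
    x = {v: 0.0 for v in adj}
    for _ in range(iters):
        x = {v: alpha * sum(x[u] for u in adj[v]) + beta for v in adj}
    return x

def pagerank(adj, damping=0.85, iters=200):
    """Each node splits its score evenly among its neighbors, plus a uniform jump."""
    n = len(adj)
    r = {v: 1.0 / n for v in adj}
    for _ in range(iters):
        r = {
            v: (1 - damping) / n
            + damping * sum(r[u] / len(adj[u]) for u in adj[v])
            for v in adj
        }
    return r

deg = degree_centrality(adj)
eig = eigenvector_centrality(adj)
katz = katz_centrality(adj)
pr = pagerank(adj)

for name, scores in [("degree", deg), ("eigenvector", eig), ("katz", katz), ("pagerank", pr)]:
    print(name, max(scores, key=scores.get))
```

On this toy graph all four measures pick node 2, the one node that touches every other node. On larger graphs, such as the Targaryen graph, the measures can disagree, which is what makes comparing them interesting.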

Okay, so who is the most important Targaryen?

Before I share the rest of the results, I want to share a personal gratification story.

For the past three months, I’ve been working on implementing these algorithms in Rel (RelationalAI’s declarative language) and for the past couple of days I got to run my own Rel implementations on the Targaryen graph… It was pretty gratifying to use Rel to do this. (Also, don’t worry, I’ve validated the results with Graphs.jl and MatrixNetworks.jl).

Anyway, finally, time for the results. Below, I share the top 10 characters in each of the centrality measures I used. I accompany them with the original figure I used, and I circle the top 10 characters. While the figure with the circles might be too small, the reason I provide it is to make a few observations clearer. So here are some observations:

The top-10 results of the five centrality measures I applied on the Targaryen graph. Click on the image to enlarge it and browse and investigate the results for yourself.
  1. Degree Centrality identified the people who are one degree of separation away from many others. This can be useful in a social network setting, but because this family graph spans a long timeline, the numbers do not indicate much: the connections did not all exist at the same time.
  2. Betweenness Centrality: the layout of this figure also reflects time, so I want to draw attention to the fact that the nodes with the highest betweenness centrality sit somewhere in the middle of the timeline. This is very similar to the US transportation network example from earlier.
  3. Eigenvector Centrality identified the top 10 to be from the same family. In fact the only reason not all siblings are circled on the third line is because the last sibling’s rank was 11 (with the same eigenvector centrality value as their unmarried siblings). Recall that eigenvector centrality can be understood as “a node is as important as its neighboring nodes”. Interestingly, the identified family was the largest family (or clique) in our graph, and thus every node kept feeding its neighbor nodes “score”, but was also being fed from the same neighbors.
  4. Katz Centrality is almost identical to eigenvector centrality, with the exception of two characters. But if we look at the next three spots in this centrality measure, they all belong to the rest of the family started by King Jaehaerys I and Queen Alysanne.
  5. PageRank identified the highest number of “figure” characters, i.e. characters who have been highlighted in GoT or HotD.
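The "middle of the timeline" effect in the betweenness observation can be reproduced on a toy example. The sketch below is a plain Python version of Brandes' algorithm for unweighted, undirected graphs, applied to a hypothetical five-node chain standing in for five generations; it is an illustration only, not the Rel implementation used for the results above.

```python
from collections import deque

def betweenness_centrality(adj):
    """Brandes' algorithm for an unweighted, undirected graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s, counting shortest paths (sigma) and recording predecessors.
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:  # first time we reach w
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate dependencies from the farthest nodes toward s.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: score / 2 for v, score in bc.items()}

# Hypothetical "timeline": five generations in a chain, 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
scores = betweenness_centrality(path)
print(scores)  # {0: 0.0, 1: 3.0, 2: 4.0, 3: 3.0, 4: 0.0}
```

The middle node scores highest simply because more shortest paths have to pass through it, regardless of how many direct connections it has.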

So now… to answer the question: who is the most important character? While making big claims about House Targaryen is a tricky matter, I’d like to answer this question with PageRank’s results. Notice that out of the four identified characters who were not kings or queens, three have appeared on one of the shows and had key roles to play.

Prince Rhaegar
Prince Aegon (Aegon the Unlikely, King Aegon V)
Princess Alysanne (Queen Alysanne)
Prince Jaehaerys (King Jaehaerys I)
Prince Maekar (King Maekar)
Prince Daemon
Prince Aegon (King Aegon III)
Prince Viserys (King Viserys)
Princess Rhaenyra Targaryen
Prince Aemon

Thank you for reading! If you have more thoughts on how to analyze these results, I’d love to hear them — please share them below or on Twitter. Finally, I referenced future blogposts multiple times, so stay tuned, the PageRank blogpost is coming up next.

Huda Nassar

I like graphs, and often talk about them. I also often talk about Julia, data visualization, and what we’re building at RelationalAI. 🐦/ twitter.com/nassarhuda