Advanced Data Structures Part 1: Directed Acyclic Graph (DAG)

Hamza Surti
3 min readMar 31, 2016

--

**Work in progress**

I wanted to kick off this series with a data structure structure that we are all as developers intimately familiar with but may not even know it: Directed Acyclic Graphs. “I’ve never heard of that. You don’t know me!” you may say, but this graph is what makes version control possible. Git is an acyclic graph. In this post, I will give you a bit of high level knowledge on DAGs and then show you how to make one with some code.

What is a DAG?

So what does that even mean? A DAG is a graph that flows in one direction, where no element can be a child of itself. So most of us are familiar with LinkedLists, trees, and even graphs. A DAG is very similar to the first two, and an implementation of the third.

Similar to a tree but not quite the same

At the very minimum, a DAG will have 4 things:

  1. Nodes: A place to store the data.
  2. Directed Edges: Arrows that point in one direction (the thing that makes this data structure different)
  3. Some great ancestral node with no parents. (Fun fact: Most ancestry trees are actually DAGs and not actually trees because cousins at some point get married to each other.)
  4. Leaves: Nodes with no children

Let’s Make One

So let’s write some code. First, let’s create a constructor with two properties and call it DAG.

function DAG() {
this.nodes = [];
this.vertices = {};
}

Next we will add an add method. See what I did there?

DAG.prototype.add = function(node) {
if (!node) { return; }
if (this.vertices[node]) {
return this.vertices[node];
}
const vertex = {
node: node,
incoming: {},
incomingNodes: [],
hasOutgoing: false,
value: null
};
this.vertices[node] = vertex;
this.nodes.push(node);
return vertex;
};

So how does this work? The vertex object is where all the good stuff happens. We add a node, an object of incoming nodes, and an array with all of their names, a boolean on whether it points to something and a value. We will get to some of these a little later.

Now lets add some edges and make stuff connect to each other. Before we can do that, we have to create a helper function that checks whether we have visited that node or not. Let’s call it visit.

function visit(vertex, fn, visited, path) {
const name = vertex.name,
vertices = vertex.incoming,
names = vertex.incomingNames,
len = names.length,
i;
if (!visited) {
visited = {};
}
if (!path) {
path = [];
}
if (visited.hasOwnProperty(name)) {
return;
}
path.push(name);
visited[name] = true;
for (i = 0; i < len; i++) {
visit(vertices[names[i]], fn, visited, path);
}
fn(vertex, path);
path.pop();
}

What happens here is

Let’s Give Credit Where it is Really Due

Researching this article, I read some great posts by amazingly smart people and most of the information came from them. I learned most of the theory by reading this well written post on DAGs and version control. The code here got all of its inspiration from the emberjs source code and the brilliant authors behind it. I’ve also read numerous other articles and blog posts regarding DAGs from great people around the internet. Thank you for reading!

--

--