Representing a tree as an array of nodes and map of relationships

Today I had to find a way to implement a tree structure in d3.js. The task was to store a bunch of nodes in a database and be able to reconstruct their user-defined structure on-the-fly. To start, a user could insert a root node. Afterwards, the user could add children to the root and to any following node.

If I was going to put my academic hat on, I would’ve tackled this in the way I had learned in my Data Structures class: keep linked references to the children. Those children keep records of their children and so on.

But what if your nodes are global? As in, used across many different trees? You can’t maintain that information within a node because its properties might have to differ in different trees. In this case, you can keep track of the tree structure by having maps that associate a tree identifier and the identifiers of the children nodes:

node1.children[tree1] = node2, node3
node1.children[tree2] = node4, node5

However, this presents the problem of keeping too much data in a node. In a web application, where you want to retrieve only as much information from the server as absolutely necessary, this solution will not work.

The other alternative is to keep the edges as part of the tree data structure itself. This is coupled with keeping an array of nodes. Nodes contain only as much information as they need to site-wide and tree structures contain all the necessary data a user needs without any of the ones she doesn’t.

When constructing a tree in d3.js this way, my process goes along the lines:

for node in json.nodes:
d3.drawNode(node)
for relationship in json.relationships[1:]: //discount root node
if relationship.contains(node.id):
d3.drawEdge(node, relationship[node.id])

The code draws every node in an array and checks if it should draw an edge to any other node by checking the relationship object. The relationship object is structured something like:

relationship = {childID: parentID}

Every time a user adds a node, the data structure appends an edge between the child node and parent node. This is opposite of how trees normally work, but is necessary because you can’t draw an edge from a parent to a child that hasn’t been initialized yet.

This works only when you can make sure that any given child node comes after its parent in an array of nodes. Node indexes can be dispersed in terms of their distance from the root, but have to be kept in a strict sequential order in relation to their parents. Parent nodes must always come first.

When keeping that data structure in an array in a database, this isn’t difficult. It’s a simple “nodes.push(newNode)”. And this happens to be the way I’m implementing my current node and tree needs.