Making hierarchy layouts with D3.js

How to create hierarchical layouts from tabular data using d3-array utilities and d3-hierarchy

To build a hierarchical layout like the one above, we first need to turn our tabular data into a hierarchical structure. That takes several preprocessing steps, the most important of which is aggregating rows into groups by chosen keys. So, for starters, let’s see how d3.group works.

As an example, let’s use the following data from the d3-array docs:

D3 Group

First, let’s group data by name using d3.group

Function signature: d3.group(arrayData, groupingFn): Map

arrayData: any array
groupingFn: a function specifying which property to group by

The function returns a Map object with our groupings.

The above line of code will produce the following Map object:
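The grouping call and the Map it yields might look like the sketch below. It assumes D3 is available as a global `d3` (for example through a script tag or inside an Observable notebook) and uses sample rows modelled on the d3-array docs. The wrapper name `groupByName` is only illustrative.

```javascript
// Sample rows modelled on the d3-array documentation (assumed here for illustration).
const data = [
  { name: "jim", amount: "34.0", date: "11/12/2015" },
  { name: "carl", amount: "120.11", date: "11/12/2015" },
  { name: "stacy", amount: "12.01", date: "01/04/2016" },
  { name: "stacy", amount: "34.05", date: "01/04/2016" },
];

// Assumes a global `d3` is in scope when the function is called.
const groupByName = (rows) => d3.group(rows, (d) => d.name);

// groupByName(data) is expected to return a Map shaped like:
// Map(3) {
//   "jim"   => [ { name: "jim", ... } ],
//   "carl"  => [ { name: "carl", ... } ],
//   "stacy" => [ { name: "stacy", ... }, { name: "stacy", ... } ]
// }
```

Each key is a distinct name, and each value is the array of rows sharing that name.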

Multiple groupings are also available:

To nest an additional grouping, we only need to pass another key function as an argument to d3.group.
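As a rough sketch of a two-level grouping, assuming a global `d3` and the same sample rows as before (the wrapper name `groupByNameThenDate` is made up for this example):

```javascript
// Sample rows modelled on the d3-array documentation (assumed here for illustration).
const data = [
  { name: "jim", amount: "34.0", date: "11/12/2015" },
  { name: "carl", amount: "120.11", date: "11/12/2015" },
  { name: "stacy", amount: "12.01", date: "01/04/2016" },
  { name: "stacy", amount: "34.05", date: "01/04/2016" },
];

// Assumes a global `d3`; each extra key function adds one level of nesting.
const groupByNameThenDate = (rows) =>
  d3.group(rows, (d) => d.name, (d) => d.date);

// The result is a Map of Maps: name => (date => rows).
// For example, groupByNameThenDate(data).get("stacy").get("01/04/2016")
// is expected to hold both "stacy" rows.
```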

Array.from converts the Map to an array, and its mapping function lets us customise each item:
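Here is a small sketch of that conversion. The Map is written out by hand so the snippet runs without D3, and the `{ name, count }` payload is just one possible shape.

```javascript
// A Map shaped like the result of grouping the sample rows by name.
const byName = new Map([
  ["jim", [{ name: "jim", amount: "34.0" }]],
  ["carl", [{ name: "carl", amount: "120.11" }]],
  ["stacy", [
    { name: "stacy", amount: "12.01" },
    { name: "stacy", amount: "34.05" },
  ]],
]);

// Array.from accepts an optional mapping function;
// each Map entry arrives as a [key, value] pair.
const summary = Array.from(byName, ([name, rows]) => ({
  name,
  count: rows.length,
}));

console.log(summary);
// [ { name: "jim", count: 1 }, { name: "carl", count: 1 }, { name: "stacy", count: 2 } ]
```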

D3 Rollup

The utility we will use to make a hierarchy for our data is d3.rollup. According to the d3-array docs:

d3.rollup “groups and reduces the specified iterable of values into a Map from key to value”.

It works similarly to d3.group but also reduces each group to a single value, which gives us the additional metrics we will need (sums, means, etc.). Let’s see how this function looks:

Function signature: d3.rollup(dataArray, reducerFn, …keysToGroupBy): Map

dataArray: any array; it can hold numbers, but most likely it will be an array of objects, each representing an observation
reducerFn: a reducing function that produces a single value for each group of observations, for example a sum or a count
keysToGroupBy: one or more key functions to group the data by; they work the same as in d3.group

The function returns a Map.

Simple example

Let’s count the items in each of the name groups we made above.

The output:
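A sketch of such a count, assuming a global `d3` and the sample rows used earlier (the wrapper name `countByName` is illustrative):

```javascript
// Sample rows modelled on the d3-array documentation (assumed here for illustration).
const data = [
  { name: "jim", amount: "34.0", date: "11/12/2015" },
  { name: "carl", amount: "120.11", date: "11/12/2015" },
  { name: "stacy", amount: "12.01", date: "01/04/2016" },
  { name: "stacy", amount: "34.05", date: "01/04/2016" },
];

// Assumes a global `d3`. The reducer receives the array of rows in each group.
const countByName = (rows) =>
  d3.rollup(rows, (group) => group.length, (d) => d.name);

// countByName(data) is expected to return:
// Map(3) { "jim" => 1, "carl" => 1, "stacy" => 2 }
```

Compared with d3.group, each Map value is now the reduced number rather than the array of rows.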

Rolling up data for hierarchy-ready data structure

(based on https://observablehq.com/@mbostock/2019-h-1b-employers)

For more real-life examples, we will use data from Mike Bostock’s observable notebook, which shows H-1B Employers in 2019.

The tabular data is an array with objects like the one below.

Let’s summarise what’s needed to make a hierarchy of nodes out of dataArray.

  1. Provide a reduce function (e.g. summing values from particular data columns).
  2. Provide key function(s) to group by.
  3. Provide an accessor function expressing how child nodes are denoted.
  4. Call sum on root node so all parent nodes have aggregated values.
  5. Call sort on root node to arrange items.

Provide a reduce function

This will decide on the value property of every leaf node, and therefore the size of DOM elements representing these nodes.
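As an illustration, a reduce function could add up two numeric columns across all rows in a group. The rows and the column names "Initial Approvals" and "Continuing Approvals" below are assumptions made for this sketch, not values taken from the real dataset.

```javascript
// Hypothetical rows; the column names are assumed for this example.
const rows = [
  { Employer: "Example Corp", "Initial Approvals": 3, "Continuing Approvals": 5 },
  { Employer: "Example Corp", "Initial Approvals": 1, "Continuing Approvals": 2 },
];

// Receives every row of one group and returns a single number for that group.
const reduceFn = (group) =>
  group.reduce(
    (sum, d) => sum + d["Initial Approvals"] + d["Continuing Approvals"],
    0
  );

console.log(reduceFn(rows)); // 11
```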

Provide key function(s) to group by

This can be a single function passed directly as the third argument, or an array of functions passed with the spread operator. With multiple functions, order matters: the first function defines the top level of the hierarchy, and each subsequent one nests inside it.
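To make the ordering concrete, here is a small sketch with three key functions. The row is hypothetical; the State, City and Employer keys follow the grouping used in this article.

```javascript
// Order defines the nesting: State at the top, then City, then Employer.
const groupByFns = [(d) => d.State, (d) => d.City, (d) => d.Employer];

// Hypothetical row, used only to show which path it would take.
const row = { State: "CA", City: "San Jose", Employer: "Example Corp" };

// Applying the key functions in order gives the row's path through the hierarchy.
const path = groupByFns.map((fn) => fn(row));
console.log(path); // [ "CA", "San Jose", "Example Corp" ]

// With D3 this array would be spread into the call:
// d3.rollup(data, reduceFn, ...groupByFns)
```

Reversing the array would instead nest states under cities under employers.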

Now we can use the above functions to create rollupData:

Output snapshot:
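A sketch of that call and the shape of its result, assuming a global `d3`. The rows, the numeric column names and the wrapper `makeRollup` are hypothetical.

```javascript
// Hypothetical rows; the numeric column names are assumed for this example.
const rows = [
  { State: "CA", City: "San Jose", Employer: "Example Corp", "Initial Approvals": 3, "Continuing Approvals": 5 },
  { State: "CA", City: "San Jose", Employer: "Sample LLC", "Initial Approvals": 1, "Continuing Approvals": 1 },
  { State: "NY", City: "New York", Employer: "Example Corp", "Initial Approvals": 2, "Continuing Approvals": 2 },
];

const reduceFn = (group) =>
  group.reduce(
    (sum, d) => sum + d["Initial Approvals"] + d["Continuing Approvals"],
    0
  );

const groupByFns = [(d) => d.State, (d) => d.City, (d) => d.Employer];

// Assumes a global `d3` is in scope when the function is called.
const makeRollup = (data) => d3.rollup(data, reduceFn, ...groupByFns);

// makeRollup(rows) is expected to return nested Maps shaped like:
// Map {
//   "CA" => Map { "San Jose" => Map { "Example Corp" => 8, "Sample LLC" => 2 } },
//   "NY" => Map { "New York" => Map { "Example Corp" => 4 } }
// }
```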

Note how rollupData is structured: the keys come from our groupByFns (State, City or Employer strings), and the values are either Map objects (for parent nodes) or numbers (for leaf nodes). Each number is a sum, the result of our reduceFn function.

Making a hierarchy with d3.hierarchy

Finally, we can create a d3.hierarchy root node.

Let’s see how this function looks:

Function signature: d3.hierarchy(data, childrenAccessorFn): RootNode

data: hierarchical data as specified in the d3-hierarchy docs
childrenAccessorFn: a function returning the children of a node as an array (or other iterable); d => d.children by default

The function returns a d3-hierarchy root node. Each node carries the props data, height, depth, parent and children, plus value once .sum or .count has been called.

The data is our rollupData.

The only thing we need now is a childrenAccessorFn. In this function, we need to point to child nodes of a node in the hierarchy.

As we already noted, each rollupData item holds either a Map or a number. This is our value, and we use it in childrenAccessorFn: if value has a size property, it is a Map, so we return an array of its entries; otherwise the value is a number, which means we are at a leaf node with no children.

The ([key, value]) syntax simply destructures a Map entry so we can refer to its key and value directly.
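A sketch of such an accessor, following the description above. Its exact form in the original notebook is an assumption, as is wrapping the root as a `[null, rollupData]` pair so that it looks like every other entry. The Maps are written by hand, so no D3 is needed here.

```javascript
// Returns an array of [key, value] entries for a Map, or a falsy value for a leaf.
const childrenAccessorFn = ([key, value]) => value.size && Array.from(value);

// Hand-written stand-in for the nested Maps that a rollup produces.
const rollupData = new Map([
  ["CA", new Map([["San Jose", 8]])],
  ["NY", new Map([["New York", 4]])],
]);

// The root has no key of its own, so it is wrapped as [null, rollupData].
const rootChildren = childrenAccessorFn([null, rollupData]);
console.log(rootChildren.map(([key]) => key)); // [ "CA", "NY" ]

// A leaf entry holds a number, which has no size property, so there are no children.
console.log(childrenAccessorFn(["San Jose", 8])); // undefined
```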

Let’s wrap up the whole thing:
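Put together, the steps could look like the sketch below. It assumes a global `d3`; the function name `buildHierarchy`, the `[null, rollupData]` root wrapper and the explicit zero returned for Map values in the sum accessor are choices made for this example.

```javascript
// Children are the entries of a Map; a number means we reached a leaf.
const childrenAccessorFn = ([key, value]) => value.size && Array.from(value);

// Assumes a global `d3` is in scope when the function is called.
function buildHierarchy(rows, reduceFn, groupByFns) {
  // Steps 1 and 2: reduce and group the tabular rows into nested Maps.
  const rollupData = d3.rollup(rows, reduceFn, ...groupByFns);

  return d3
    // Step 3: tell d3.hierarchy how to find the children of each entry.
    .hierarchy([null, rollupData], childrenAccessorFn)
    // Step 4: leaves contribute their number, parents contribute nothing themselves.
    .sum(([key, value]) => (typeof value === "number" ? value : 0))
    // Step 5: largest values first.
    .sort((a, b) => b.value - a.value);
}

// Example call (rows, reduceFn and groupByFns as defined for d3.rollup):
// const root = buildHierarchy(rows, reduceFn, groupByFns);
```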

As you can see, once we call d3.hierarchy with rollupData and childrenAccessorFn, the last steps in creating a hierarchy are calling .sum and .sort. Why do we call .sum here? It is different from the summing in the reduceFn we passed to d3.rollup: reduceFn computed the leaf nodes' values, while .sum gives every parent node the sum of the leaf values beneath it.

With the hierarchy in place, we can build any hierarchical layout provided by D3. For example, a circle pack layout, as seen in the image below, is created with only a few lines of code:
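A minimal sketch of the layout step, assuming a global `d3` and a root node that has already been through .sum. The function name `packLayout` and the padding of 3 are arbitrary choices for this example, and drawing the circles is left out.

```javascript
// Assumes a global `d3` and a root produced by d3.hierarchy(...).sum(...).
function packLayout(root, width, height) {
  return d3
    .pack()
    .size([width, height]) // the area the outermost circle must fit into
    .padding(3)            // space between sibling circles
    (root);                // assigns x, y and r to every node
}

// Afterwards each node in root.descendants() carries x, y and r,
// which can be bound to SVG <circle> elements.
```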

One function to rule them all

As a little bonus, we can create a utility function that wraps all the steps needed to create a hierarchy.

Let’s name it makeHierarchy. Functions like childrenAccessorFn, sumFn and sortFn depend only on the d3.rollup output structure, so we can predefine them in a default config. This way we only need to pass arrayData, groupByFns and reduceFn in the config.

and call it:
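One possible shape for this utility, assuming a global `d3`. The config key names (in particular `data` for the input array) and the defaults are assumptions for this sketch.

```javascript
// Defaults that only depend on the structure d3.rollup produces.
const defaultConfig = {
  childrenAccessorFn: ([key, value]) => value.size && Array.from(value),
  sumFn: ([key, value]) => (typeof value === "number" ? value : 0),
  sortFn: (a, b) => b.value - a.value,
};

// Assumes a global `d3` is in scope when the function is called.
function makeHierarchy(config) {
  // Caller-supplied options override the defaults.
  const { data, reduceFn, groupByFns, childrenAccessorFn, sumFn, sortFn } = {
    ...defaultConfig,
    ...config,
  };

  const rollupData = d3.rollup(data, reduceFn, ...groupByFns);

  return d3
    .hierarchy([null, rollupData], childrenAccessorFn)
    .sum(sumFn)
    .sort(sortFn);
}

// Example call:
// const root = makeHierarchy({
//   data: rows,
//   groupByFns: [(d) => d.State, (d) => d.City, (d) => d.Employer],
//   reduceFn: (group) => group.length,
// });
```

Because the defaults are merged with the config, any of the three predefined functions can still be swapped out for a particular dataset.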


This article was originally posted in observablehq.com: https://observablehq.com/@stopyransky/making-hierarchy-from-any-tabular-data

Credits for circle pack data visualization go to Mike Bostock observable: https://observablehq.com/@mbostock/2019-h-1b-employers

Codepen for D3 hierarchy:

Data Visualization Society

The publication for the Data Visualization Society, an initiative to foster community for data visualization professionals of all backgrounds.

Written by Karol Stopyra

Web developer. #svg #canvas #webgl ❤ making digital tools to understand data around us.
