Making hierarchy layouts with D3.js
How to create hierarchical layouts from tabular data using d3-array utilities and d3-hierarchy
To build hierarchical layouts like the one above, we first need to turn our tabular data into a hierarchical structure. This takes several preprocessing steps, the most important of which is aggregating rows into groups by chosen keys. So for starters, let’s see how d3.group works.
As an example, let’s use the following data from the d3-array docs:
const data = [
{name: "jim", amount: "34.0", date: "11/12/2015"},
{name: "carl", amount: "120.11", date: "11/12/2015"},
{name: "carl", amount: "10.31", date: "11/12/2015"},
{name: "stacy", amount: "12.01", date: "01/04/2016"},
{name: "stacy", amount: "1233.01", date: "01/04/2016"},
{name: "stacy", amount: "34.05", date: "01/04/2016"}
];
D3 Group
First, let’s group the data by name using d3.group.
Function signature: d3.group(arrayData, groupingFn): Map
- arrayData: any array
- groupingFn: a function specifying which property we should group by
The function returns a Map object with our groupings.
groupByName = d3.group(data, d => d.name)
The above line of code will produce the following Map object:
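The shape of that Map can also be reproduced without D3. The groupBy helper below is a hypothetical stand-in written only for illustration; it is not D3's implementation, but for a single key function it should produce the same key-to-array structure that d3.group returns.

```javascript
// Hypothetical stand-in for d3.group with a single key function.
// Not D3's implementation; it only reproduces the resulting Map shape.
function groupBy(rows, keyFn) {
  const groups = new Map();
  for (const row of rows) {
    const key = keyFn(row);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  return groups;
}

// Same rows as the example data above.
const sampleRows = [
  {name: "jim", amount: "34.0", date: "11/12/2015"},
  {name: "carl", amount: "120.11", date: "11/12/2015"},
  {name: "carl", amount: "10.31", date: "11/12/2015"},
  {name: "stacy", amount: "12.01", date: "01/04/2016"},
  {name: "stacy", amount: "1233.01", date: "01/04/2016"},
  {name: "stacy", amount: "34.05", date: "01/04/2016"}
];

const byName = groupBy(sampleRows, d => d.name);

console.log(Array.from(byName.keys()));  // the three names, in first-seen order
console.log(byName.get("carl").length);  // 2 rows for carl
console.log(byName.get("stacy").length); // 3 rows for stacy
```

Each key maps to the array of original row objects, so nothing is lost or copied by grouping.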
Multiple levels of grouping are also possible. To nest an additional group, we only need to pass another key function as an argument to d3.group:
groupByNameThenByDate = d3.group(data, d => d.name, d => d.date)
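With two key functions the result is a Map of Maps: name first, then date, then the matching rows. As a rough illustration, the hypothetical groupNested helper below mimics that nested shape in plain JavaScript; it is a sketch under that assumption, not the library's code.

```javascript
// Hypothetical helper: groups by the first key function, then recurses
// into each group with the remaining key functions.
function groupNested(rows, ...keyFns) {
  if (keyFns.length === 0) return rows; // innermost level: plain array of rows
  const [keyFn, ...rest] = keyFns;
  const groups = new Map();
  for (const row of rows) {
    const key = keyFn(row);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  // Replace each array of rows with its own nested grouping.
  for (const [key, members] of groups) {
    groups.set(key, groupNested(members, ...rest));
  }
  return groups;
}

const nestedRows = [
  {name: "jim", amount: "34.0", date: "11/12/2015"},
  {name: "carl", amount: "120.11", date: "11/12/2015"},
  {name: "carl", amount: "10.31", date: "11/12/2015"},
  {name: "stacy", amount: "12.01", date: "01/04/2016"}
];

const byNameThenDate = groupNested(nestedRows, d => d.name, d => d.date);

// Drill down one level at a time with Map.get.
console.log(byNameThenDate.get("carl").get("11/12/2015").length); // 2
```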
Using Array.from allows converting from a Map to an array with a customised payload:
firstAmountPerEachName = Array.from(
groupByName,
([key, value]) => ({ key, value: value[0].amount })
)
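Note that amount is stored as a string in this data, so the payload above is a string as well. The sketch below uses a small hand-built Map as a stand-in for the d3.group result (an assumption made to keep it self-contained) and converts the amount to a number inside the mapping function.

```javascript
// Hand-built Map standing in for the d3.group result (name -> rows).
const groupedByName = new Map([
  ["jim", [{name: "jim", amount: "34.0", date: "11/12/2015"}]],
  ["carl", [
    {name: "carl", amount: "120.11", date: "11/12/2015"},
    {name: "carl", amount: "10.31", date: "11/12/2015"}
  ]]
]);

// Array.from accepts a mapping function as its second argument;
// each Map entry arrives as a [key, value] pair.
const firstAmounts = Array.from(groupedByName, ([key, rows]) => ({
  key,
  value: Number(rows[0].amount) // "34.0" becomes 34
}));

console.log(firstAmounts[0]); // key "jim", numeric value 34
```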
D3 Rollup
The utility we will use to make a hierarchy for our data is d3.rollup. According to the d3-array docs:
d3.rollup “groups and reduces the specified iterable of values into a Map from key to value”.
It works similarly to d3.group but also lets us reduce each group to the metrics we need (sums, means, counts, etc.). Let’s see how this function looks:
Function signature: d3.rollup(dataArray, reducerFn, ...keysToGroupBy): Map
- dataArray: any array; it can hold numbers, but most likely it will be an array of objects, each object representing an observation.
- reducerFn: a reducing function that receives the array of observations in each group and returns a single value for that group, for example a sum or a count.
- keysToGroupBy: one or more key functions to group the data by; they work the same as in d3.group.
The function returns a Map.
Simple example
Count the items in each of the name groups we made above.
numberOfItemsPerEachName = d3.rollup(
data,
d => d.length, // reducerFn
d => d.name // keyToGroupBy
)
The output:
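For this data the result should be a Map in which jim maps to 1, carl to 2 and stacy to 3. The same shape can be sketched without D3: the hypothetical rollupBy helper below groups first and then replaces each group with its reduced value. It is an illustration of the idea, not the library's implementation.

```javascript
// Hypothetical stand-in for d3.rollup with a single key function.
function rollupBy(rows, reduceFn, keyFn) {
  const groups = new Map();
  for (const row of rows) {
    const key = keyFn(row);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  // Replace each array of rows with the reducer's result for that group.
  return new Map(
    Array.from(groups, ([key, members]) => [key, reduceFn(members)])
  );
}

const rollupRows = [
  {name: "jim", amount: "34.0", date: "11/12/2015"},
  {name: "carl", amount: "120.11", date: "11/12/2015"},
  {name: "carl", amount: "10.31", date: "11/12/2015"},
  {name: "stacy", amount: "12.01", date: "01/04/2016"},
  {name: "stacy", amount: "1233.01", date: "01/04/2016"},
  {name: "stacy", amount: "34.05", date: "01/04/2016"}
];

const countPerName = rollupBy(rollupRows, group => group.length, d => d.name);

console.log(countPerName.get("jim"));   // 1
console.log(countPerName.get("carl"));  // 2
console.log(countPerName.get("stacy")); // 3
```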
Rolling up data for hierarchy-ready data structure
(based on https://observablehq.com/@mbostock/2019-h-1b-employers)
For a more real-life example, we will use data from Mike Bostock’s Observable notebook, which shows H-1B employers in 2019.
The tabular data is an array with objects like the one below.
Let’s summarise what’s needed to make a hierarchy of nodes out of dataArray.
- Provide a reduce function (e.g. summing values from particular data columns).
- Provide key function(s) to group by.
- Provide an accessor function expressing how child nodes are denoted.
- Call sum on the root node so all parent nodes have aggregated values.
- Call sort on the root node to arrange items.
Provide a reduce function
This will decide the value property of every leaf node, and therefore the size of the DOM elements representing these nodes.
reduceFn = data => d3.sum(data, d => d["Initial Approvals"] + d["Initial Denials"] + d["Continuing Approvals"] + d["Continuing Denials"]);
Provide key function(s) to group by
This can be a single function passed directly as the 3rd argument, or an array of functions passed with the spread operator. When using multiple functions, order matters: the first function defines the top level of the hierarchy, and each following function defines the next level down.
groupingFns = [d => d.State, d => d.City, d => d.Employer]
Now we can use the above functions to create rollupData:
rollupData = d3.rollup(dataArray, reduceFn, ...groupingFns);
Output snapshot:
Note how rollupData is structured: the keys come from our groupingFns (State, City or Employer strings), and the values are either Map objects (for parent nodes) or numbers (for leaf nodes). Each number is a sum, the result of our reduceFn function.
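To make that nesting concrete, here is a dependency-free sketch. The rollupNested helper is hypothetical, and the three rows are made up for illustration; only the column names are taken from the code above. It is meant to show the Map-of-Maps-of-numbers shape, not to reproduce D3's implementation or the real H-1B figures.

```javascript
// Hypothetical helper: group by each key function in turn, then reduce
// the rows that remain at the innermost level.
function rollupNested(rows, reduceFn, ...keyFns) {
  if (keyFns.length === 0) return reduceFn(rows); // leaf: a single number
  const [keyFn, ...rest] = keyFns;
  const groups = new Map();
  for (const row of rows) {
    const key = keyFn(row);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  return new Map(
    Array.from(groups, ([key, members]) => [key, rollupNested(members, reduceFn, ...rest)])
  );
}

// Made-up rows; values are illustrative only.
const h1bRows = [
  {State: "CA", City: "San Jose", Employer: "Employer A",
   "Initial Approvals": 3, "Initial Denials": 1, "Continuing Approvals": 2, "Continuing Denials": 0},
  {State: "CA", City: "San Jose", Employer: "Employer B",
   "Initial Approvals": 1, "Initial Denials": 0, "Continuing Approvals": 1, "Continuing Denials": 0},
  {State: "NY", City: "New York", Employer: "Employer C",
   "Initial Approvals": 4, "Initial Denials": 1, "Continuing Approvals": 0, "Continuing Denials": 0}
];

// Plain-JS equivalent of the summing reducer used in the article.
const totalDecisions = rows => rows.reduce(
  (sum, d) => sum + d["Initial Approvals"] + d["Initial Denials"] +
    d["Continuing Approvals"] + d["Continuing Denials"],
  0
);

const tree = rollupNested(h1bRows, totalDecisions, d => d.State, d => d.City, d => d.Employer);

console.log(tree.get("CA") instanceof Map);                      // true: parent level
console.log(tree.get("CA").get("San Jose").get("Employer A"));   // 6: leaf value
```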
Making a hierarchy with d3.hierarchy
Finally, we can create a d3.hierarchy root node.
Let’s see how this function looks:
Function signature: d3.hierarchy(data, childrenAccessorFn): RootNode
- data: hierarchical data as specified in the d3-hierarchy docs
- childrenAccessorFn: a function that returns the items that should be treated as children of a node; d => d.children by default.
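The function returns a d3-hierarchy root node. Each node in the hierarchy has the following props: data, height, depth, parent and children, plus value once .sum (or .count) has been called.
The data is our rollupData, wrapped as a [null, rollupData] pair so that the root looks like any other Map entry.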
The only thing we need now is a childrenAccessorFn. This function has to return the child nodes of a given node in the hierarchy.
As we already noted, each value in rollupData is either a Map or a number. We use this in childrenAccessorFn: if value has a truthy size property, it is a Map, so we return an array of its entries; otherwise the value is a number, which means we are at a leaf node with no child nodes.
childrenAccessorFn = ([ key, value ]) => value.size && Array.from(value)
The ([key, value]) syntax is just handy destructuring of a Map entry, so we can refer to its key and value directly.
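To see what the accessor returns in each case, it can be called by hand on a parent entry and on a leaf entry. The two entries below are made up for illustration.

```javascript
// Same accessor as above.
const childrenAccessorFn = ([key, value]) => value.size && Array.from(value);

// Made-up entries: a parent (value is a Map) and a leaf (value is a number).
const parentEntry = ["CA", new Map([["San Jose", 8], ["Fresno", 2]])];
const leafEntry = ["San Jose", 8];

const parentChildren = childrenAccessorFn(parentEntry);
const leafChildren = childrenAccessorFn(leafEntry);

console.log(parentChildren); // two [key, value] pairs: San Jose/8 and Fresno/2
console.log(leafChildren);   // undefined, because numbers have no size property
```

A falsy return value is what tells d3.hierarchy that the node has no children, which is why the short-circuiting && is enough here.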
Let’s wrap up the whole thing:
hierarchyData = d3.hierarchy([null, rollupData], childrenAccessorFn)
.sum(([key, value]) => value)
.sort((a, b) => b.value - a.value)
As you can see, once we call d3.hierarchy with rollupData and childrenAccessorFn, the last steps in creating a hierarchy are calling the .sum and .sort functions. Why do we call .sum here? This is different from the summing in our reduceFn when using d3.rollup: .sum here gives every parent node the sum of all its descendant leaf nodes' values. The leaf nodes' values themselves were calculated by our summing reduceFn.
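The bottom-up aggregation can be sketched on a plain nested Map. The totalOf helper and the numbers below are hypothetical and only mimic the effect of .sum followed by .sort on this entry structure; they are not how d3-hierarchy is implemented.

```javascript
// Hypothetical helper: total for one [key, value] entry of a nested rollup Map.
// Leaves contribute their own number; parents contribute the sum of their children.
function totalOf([key, value]) {
  if (!(value instanceof Map)) return value;
  let total = 0;
  for (const entry of value) total += totalOf(entry);
  return total;
}

// Made-up rollup output: State -> City -> number.
const miniRollup = new Map([
  ["NY", new Map([["New York", 5]])],
  ["CA", new Map([["San Jose", 8], ["Fresno", 2]])]
]);

const rootTotal = totalOf([null, miniRollup]);

// Children of the root, ordered by descending total, as the sort comparator does.
const rankedStates = Array.from(miniRollup, entry => ({ key: entry[0], value: totalOf(entry) }))
  .sort((a, b) => b.value - a.value);

console.log(rootTotal);                       // 15
console.log(rankedStates.map(d => d.key));    // CA first (10), then NY (5)
```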
Now, having a hierarchy allows us to build any hierarchical layout provided by D3. For example, a circle pack layout, as seen in the image below, is created with only a few lines of code:
pack = () => d3.pack()
.size([width, height])
.padding(1)
(hierarchyData)
One function to rule them all
As a little bonus, we can create a utility function that wraps all the steps needed to create a hierarchy.
Let’s name it makeHierarchy. Functions like childrenAccessorFn, sumFn and sortFn are based on the d3.rollup output structure, so we can predefine them in a default config. This way we only need to pass data, groupByFns and reduceFn in the config.
function makeHierarchy(config) {
  // Defaults match the [key, value] entry structure produced by d3.rollup.
  const defaultConfig = {
    childrenAccessorFn: ([key, value]) => value.size && Array.from(value),
    sumFn: ([key, value]) => value,
    sortFn: (a, b) => b.value - a.value,
  };
  // Caller-supplied options override the defaults.
  const {
    data,
    reduceFn,
    groupByFns,
    childrenAccessorFn,
    sumFn,
    sortFn
  } = { ...defaultConfig, ...config };
  const rollupData = d3.rollup(data, reduceFn, ...groupByFns);
  const hierarchyData = d3.hierarchy([null, rollupData], childrenAccessorFn)
    .sum(sumFn)
    .sort(sortFn);
  return hierarchyData;
}
and call it:
makeHierarchy({
data: dataArray,
groupByFns: [d => d.State, d => d.City, d => d.Employer],
reduceFn: v => d3.sum(v, d => d["Initial Approvals"] + d["Initial Denials"] + d["Continuing Approvals"] + d["Continuing Denials"])
});
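Any of the defaults can be overridden by adding the matching key to the config, since the caller's object is spread after the defaults. The sketch below isolates that merge step; the ascending sortFn is a hypothetical override chosen only to show the effect.

```javascript
// Defaults as in makeHierarchy (children accessor omitted for brevity).
const defaultConfig = {
  sumFn: ([key, value]) => value,
  sortFn: (a, b) => b.value - a.value, // descending
};

// Hypothetical override: sort ascending instead.
const userConfig = { sortFn: (a, b) => a.value - b.value };

// Later spreads win, so userConfig replaces sortFn and leaves sumFn alone.
const merged = { ...defaultConfig, ...userConfig };

console.log(merged.sortFn({ value: 1 }, { value: 2 })); // -1: ascending order
console.log(merged.sumFn(["x", 7]));                    // 7: default kept

// Caveat: an explicit undefined also overrides the default.
const clobbered = { ...defaultConfig, sortFn: undefined };
console.log(clobbered.sortFn); // undefined
```

Because of that last case, it is safer to leave a key out of the config entirely than to pass it as undefined.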
This article was originally posted on observablehq.com: https://observablehq.com/@stopyransky/making-hierarchy-from-any-tabular-data
Credits for the circle pack data visualization go to Mike Bostock’s Observable notebook: https://observablehq.com/@mbostock/2019-h-1b-employers