D3.js: Data Visualization with JavaScript for Beginners

liz lovero
The Data Experience
5 min read · Jul 16, 2015

The power of the unaided mind is highly overrated… The real powers come from devising external aids that enhance cognitive abilities. — Donald Norman

In many ways, D3.js (short for Data-Driven Documents, a JavaScript library) was a coding gateway drug for me. Last year, while still wholly unfamiliar with JavaScript, I was able to use the library to make an interactive visualization with data from the Harry Ransom Center, where I was a graduate research assistant. D3.js produced beautiful results, gave me a great deal of control over design and display, and was frankly easier to work with than making a comparable diagram in Adobe Illustrator. I was hooked!

Turning code into something pretty was a big thrill. Using that code to make a point at work feels like super success! \\ ٩( ᐛ )و //

About D3.js

D3.js is a JavaScript library written in 2011 by Michael Bostock with Vadim Ogievetsky and Jeffrey Heer, as members of the Stanford Vis Group (Serious Academic Paper here). Bostock went on to be the editor of interactive graphics at the NY Times and hosts the D3.js source code on his GitHub.

The library can be used to make a huge range of interactive visualization types — really anything that you can imagine — and the dynamic effects have been repurposed for games and tarot card fortune telling.

The basics are as follows (adapted from the docs):

  • D3 is a JavaScript library embedded within an HTML webpage.
  • D3 uses pre-built JavaScript functions to select elements, create SVG objects, style them, or add transitions, dynamic effects or tooltips to them.
  • These objects can also be styled using CSS.
  • Large datasets can be easily bound to SVG objects using simple D3.js functions to generate rich text/graphic charts and diagrams.
  • The data can be in various formats, most commonly JSON, comma-separated values (CSV) or tab-separated values (TSV).
  • For n00bs, there is a gallery on GitHub of example code that can be easily adapted to your purposes, including both HTML/JS and sample data.
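
As a rough illustration of the points above, here is a minimal sketch of selecting an element, binding data, and creating one SVG rectangle per data point. The sample numbers and the helper names (`barLayout`, `drawBars`) are made up for this example, and the drawing step assumes the page has already loaded D3 as a global `d3`.

```javascript
// Hypothetical sample data: number of collections per year (made-up values).
const sample = [
  { year: 2001, value: 12 },
  { year: 2002, value: 30 },
  { year: 2003, value: 7 },
];

// Plain JavaScript helper: turn data into bar geometry (no D3 or DOM needed).
function barLayout(data, barWidth, chartHeight) {
  const max = Math.max(...data.map((d) => d.value));
  return data.map((d, i) => {
    const height = max === 0 ? 0 : (d.value / max) * chartHeight;
    return { x: i * barWidth, y: chartHeight - height, width: barWidth - 2, height };
  });
}

// D3 part: select a container, bind the data, append one <rect> per datum.
// `d3` is passed in so this only runs where the library is actually loaded.
function drawBars(d3, container, data) {
  const layout = barLayout(data, 40, 100);
  const svg = d3.select(container).append("svg")
    .attr("width", 40 * data.length)
    .attr("height", 100);
  svg.selectAll("rect")
    .data(layout)
    .enter()
    .append("rect")
    .attr("x", (d) => d.x)
    .attr("y", (d) => d.y)
    .attr("width", (d) => d.width)
    .attr("height", (d) => d.height)
    .style("fill", "steelblue");
  return svg;
}

// Only draw when running in a browser page that has loaded D3.
if (typeof window !== "undefined" && typeof window.d3 !== "undefined") {
  drawBars(window.d3, "body", sample);
}
```

The `selectAll(...).data(...).enter().append(...)` chain is the data binding step: D3 creates one new element for each data item that does not yet have one. The fill color set here could just as well live in a CSS rule.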

D3.js is also widely used by outlets like the Guardian and NY Times.

From the Guardian UK http://www.theguardian.com/world/interactive/2013/feb/12/state-of-the-union-reading-level

Using D3 as a coding novice

For this novice, the easiest plan of action was to select a template that would best represent my data. I went with the interactive heat map to compare the growth and development of collections at the two largest UT archives over time.

Fixing up the Dataset

I had an existing dataset of XML documents (encoded EAD finding aids, basically formulaic metadata describing the contents of an archival collection) from the two largest University of Texas archives: the Harry Ransom Center (where I was working) and the Dolph Briscoe Center for American History. A colleague helped scrape data from Texas Archival Resources Online (TARO), a public website of UT collections info.

Example of a single finding aid in EAD

Here’s what the source data looked like: first as a finding aid on TARO, then as an XML doc. I wanted to group the collections to get a better sense of how UT was approaching its collecting. Specifically, was the University just obtaining the papers of famous individuals, was it more focused on records from organizations, or was it producing a lot of thematic collections, usually assembled by another collector and donated en masse?

XML file of a single finding aid.

The scraper used keywords from the XML tags (aided by a fair bit of hard-coded data normalization on my part) to sort the collections into three groups to illustrate trends in collecting at the respective institutions in the last 15 years.

  • Records (collections belonging to organizations)
  • Papers (collections belonging to individuals or families)
  • Other (for thematic collections like Circus ephemera)

The final result was a three-column TSV file with the group, year, and value (the number of collections).
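
To make the grouping step concrete, here is a small sketch of how titles could be sorted into the three groups and counted into a three-column TSV. This is not the original scraper: the sample titles, the keyword rule in `classify`, and the header names `group`, `year`, and `value` are all assumptions for illustration.

```javascript
// Hypothetical input: one entry per finding aid (titles and years are made up).
const findingAids = [
  { title: "Acme Theater Company Records", year: 2001 },
  { title: "Jane Doe Papers", year: 2001 },
  { title: "John Doe Family Papers", year: 2001 },
  { title: "Circus Ephemera Collection", year: 2002 },
];

// Assumed keyword rule; the real scraper's rules are not shown in this post.
function classify(title) {
  const t = title.toLowerCase();
  if (t.includes("records")) return "Records";
  if (t.includes("papers")) return "Papers";
  return "Other";
}

// Count collections per (group, year) and emit a three-column TSV string.
function toTsv(aids) {
  const counts = new Map();
  for (const aid of aids) {
    const key = classify(aid.title) + "\t" + aid.year;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  const rows = [...counts.entries()].map(([key, n]) => key + "\t" + n);
  return ["group\tyear\tvalue", ...rows].join("\n");
}

console.log(toTsv(findingAids));
```

With this sample input, the two "Papers" titles from 2001 collapse into a single row with a value of 2, which is the shape a heat map needs: one row per group and year.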

Putting it all together

I used the Interactive HeatMap (which is based on this example I found from Trulia) to display my TSV data.

This is the basic import of data and layout code.
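
For readers without the original code in front of them, here is a minimal sketch of what the import and layout steps involve, written in plain JavaScript so it runs anywhere. The sample rows, the function names (`parseTsv`, `cellLayout`), and the grid size of 30 are assumptions, not the values from my heat map. In a real page, D3's own `d3.tsv` loader would fetch and parse the file; how you call it depends on the D3 version (a callback in v3, a promise in v5 and later).

```javascript
// Hypothetical TSV text in the three-column shape described above.
const tsvText = [
  "group\tyear\tvalue",
  "Records\t2000\t4",
  "Papers\t2000\t120",
  "Other\t2001\t0",
].join("\n");

// Parse TSV into objects, converting year and value to numbers.
function parseTsv(text) {
  const [headerLine, ...lines] = text.trim().split("\n");
  const headers = headerLine.split("\t");
  return lines.map((line) => {
    const cells = line.split("\t");
    const row = {};
    headers.forEach((h, i) => { row[h] = cells[i]; });
    row.year = Number(row.year);
    row.value = Number(row.value);
    return row;
  });
}

// Lay out one square per row: years run left to right, groups top to bottom.
function cellLayout(rows, groups, firstYear, gridSize) {
  return rows.map((d) => ({
    x: (d.year - firstYear) * gridSize,
    y: groups.indexOf(d.group) * gridSize,
    size: gridSize,
    value: d.value,
  }));
}

const groups = ["Records", "Papers", "Other"];
const cells = cellLayout(parseTsv(tsvText), groups, 2000, 30);
console.log(cells);
```

Each object in `cells` carries the position and size of one heat map square, so it can be bound to SVG `rect` elements with D3's usual `selectAll`, `data`, `enter`, `append` chain.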

I adapted the grid size and legend element to fit my data, and selected some colors to cover the range. Since I had a pretty diverse set of data (values from 0 or 1 to several hundred), I played around quite a bit with the scale of the data, as seen below.
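
One common way to handle a spread like this is to split the values into quantile buckets, so each color covers roughly the same number of data points instead of the same numeric range. The sketch below is a simplified plain JavaScript version of that idea. The sample values and the five-color palette are made up, and the cut points are not computed exactly the way D3's built-in quantile scale computes them (`d3.scale.quantile` in v3, `d3.scaleQuantile` in v4 and later).

```javascript
// Hypothetical counts with a wide spread, from 0 up to several hundred.
const values = [0, 1, 1, 2, 3, 5, 8, 40, 120, 300];

// Assumed five-step palette (light to dark); not the colors used in my heat map.
const colors = ["#ffffcc", "#a1dab4", "#41b6c4", "#2c7fb8", "#253494"];

// Simplified quantile thresholds: cut the sorted values into equal-sized groups.
function quantileThresholds(data, buckets) {
  const sorted = [...data].sort((a, b) => a - b);
  const thresholds = [];
  for (let i = 1; i < buckets; i++) {
    thresholds.push(sorted[Math.floor((i * sorted.length) / buckets)]);
  }
  return thresholds;
}

// Map a value to a color by counting how many thresholds it meets or exceeds.
function colorFor(value, thresholds, palette) {
  let bucket = 0;
  while (bucket < thresholds.length && value >= thresholds[bucket]) bucket++;
  return palette[bucket];
}

const thresholds = quantileThresholds(values, colors.length);
console.log(thresholds);
console.log(colorFor(0, thresholds, colors), colorFor(300, thresholds, colors));
```

For comparison, five equal-width steps over 0 to 300 would put eight of these ten sample values into the lightest color, which hides most of the variation among the small counts. The same thresholds can also drive the legend, with one swatch per bucket.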

So, now feast your eyes on the final results!

For more on D3 and dataviz generally

I used DashingD3.js to learn all the basics of manipulating the data and elements of SVG necessary to build my first viz. I can’t recommend it more strongly to the newbie.

The best place to learn basic D3.js is from the online tutorial DashingD3.js.

Also, Mike Bostock has adapted his really insightful talk from Eyeo 2014 about the nature of visualization and algorithms into a blogpost or this video. He rocks a serious manbun while making sharp insights.
